
I want to remove all occurrences of a pattern in a Python string, where the pattern looks like "start-string blah, blah, blah end-string". This is a general problem I'd like to be able to handle. It is the same problem as "How can I remove a portion of text from a string whenever it starts with &*( and ends with )(*", but in Python rather than Java.

How would I solve the same problem in Python?

Assume the string looks like this,

'Bla bla bla <mark asd asd asd /> bla bla bla. Yadda yadda yadda <mark akls lkja /> yadda.'

The start of the block to remove is <mark and the end is />. So I do the following:

import re
mystring = "Bla bla bla <mark asd asd asd /> bla bla bla. Yadda yadda yadda <mark akls lkja /> yadda."
tags = "<mark", "/>"
re.sub('%s.*%s' % tags, '', mystring)

My desired output is

'Bla bla bla  bla bla bla. Yadda yadda yadda  yadda.'

But what I get is

'Bla bla bla  yadda.'

So clearly the command is using the first instance of the opening string and the last occurrence of the end string.

How do I make it match the pattern twice and give me the desired output? This has to be easy but despite searches on "remove multiple occurrences regex Python" and the like I have not found an answer. Thanks.

  • Possible duplicate of Python non-greedy regexes Commented Apr 11, 2019 at 22:41
  • @Robin: I agree. I actually got the definition of greedy backwards. Live and learn. Commented Apr 11, 2019 at 22:49

1 Answer

You basically want to find anything between '<mark' and '/>' so you start with the pattern

r'<mark .* />'

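However, .* is greedy: it matches as much as possible, so a single match runs from the first '<mark' to the last '/>'. Adding a ? after it (.*?) makes it non-greedy, so each match stops at the first ' />' that follows. Then use re.sub to replace those matches with an empty string:

>>> import re
>>> re.sub(r'<mark .*? />', '', mystring)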
'Bla bla bla  bla bla bla. Yadda yadda yadda  yadda.'
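
Since the question asks for a general solution, here is a minimal sketch of a reusable helper. The name remove_blocks is hypothetical, not part of the re module. It assumes the delimiters should be matched literally, so it passes them through re.escape (which matters for delimiters such as &*( and )(* from the linked Java question), and it assumes a block may span several lines, hence re.DOTALL.

```python
import re

def remove_blocks(text, start, end):
    """Remove every span that begins with `start` and ends at the nearest `end`.

    Hypothetical helper for illustration; not part of the standard library.
    """
    # re.escape makes delimiters such as "&*(" literal instead of regex syntax.
    # .*? is non-greedy, so each match stops at the first `end` after `start`.
    # re.DOTALL lets "." match newlines, in case a block spans several lines.
    pattern = re.escape(start) + r".*?" + re.escape(end)
    return re.sub(pattern, "", text, flags=re.DOTALL)

mystring = ("Bla bla bla <mark asd asd asd /> bla bla bla. "
            "Yadda yadda yadda <mark akls lkja /> yadda.")
print(remove_blocks(mystring, "<mark", "/>"))
# prints: Bla bla bla  bla bla bla. Yadda yadda yadda  yadda.
```

Unlike the pattern above, this version does not require a space after the opening delimiter or before the closing one. If the text surrounding a removed block should not be left with a double space, that would need separate handling.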