
I would like to search for and replace a block of text that contains newline characters.

In the example below, when the DOTALL flag is specified, findall behaves as expected and '.' matches any character, including a newline. But when calling sub, the DOTALL flag doesn't seem to do anything and no matches are found. I want to confirm whether '.' really can't be used with sub to replace text that contains newline characters, or whether I'm calling the function incorrectly.

Code

import re
text = """
some example text...
START
bla bla
bla bla
END
"""
print 'this works:', re.findall('START.*END', text, re.DOTALL)
print 'this fails:', re.sub('START.*END', 'NEWTEXT', text, re.DOTALL)

Output

this works: ['START\nbla bla\nbla bla\nEND']
this fails:
some example text...
START
bla bla
bla bla
END

1 Answer

I'm not exactly sure why, but you have to pass the flag by keyword, as flags=, in re.sub (the docs do this too).

print 'this works:', re.sub('START.*END', 'NEWTEXT', text, flags=re.DOTALL)

It might be because of the optional count argument, which comes before flags in the signature of re.sub.
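One way to check this hypothesis is to pass the integer value of re.DOTALL as count explicitly and compare the result with a call that uses flags=. This is only a minimal sketch; the variable names (count_value, wrong, right) are made up for illustration, and the text is the same as in the question.

```python
import re

text = "\nsome example text...\nSTART\nbla bla\nbla bla\nEND\n"

# Signature: re.sub(pattern, repl, string, count=0, flags=0).
# Flags are integer-like constants, so re.DOTALL in the fourth
# positional slot is silently accepted as a count.
count_value = int(re.DOTALL)

# Same effect as the failing call in the question: count is set,
# but no flags are, so '.' still does not match a newline.
wrong = re.sub('START.*END', 'NEWTEXT', text, count=count_value)

# Flag passed by keyword: '.' now matches newlines as well.
right = re.sub('START.*END', 'NEWTEXT', text, flags=re.DOTALL)

print(count_value)
print(wrong == text)
print(repr(right))
```

If the hypothesis holds, wrong comes back identical to text, while right has the whole START...END block replaced.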

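EDIT:

It is the count argument after all: the signature is re.sub(pattern, repl, string, count=0, flags=0), so in the original call re.DOTALL was taken as count and no flags were set. Passing count explicitly works as well: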

print 'this works:', re.sub('START.*END', 'NEWTEXT', text, 0, re.DOTALL)

A count of 0 means replace all occurrences.
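To illustrate what count does, here is a small sketch on a made-up string (sample and the result names are just for this example): 0 replaces every match, while a positive value limits the number of replacements, counted from the left.

```python
import re

sample = "a-b-c-d"

# count=0 (the default) replaces every occurrence.
all_replaced = re.sub('-', '+', sample, count=0)

# A positive count replaces at most that many occurrences, left to right.
first_only = re.sub('-', '+', sample, count=1)
first_two = re.sub('-', '+', sample, count=2)

print(all_replaced)  # a+b+c+d
print(first_only)    # a+b-c-d
print(first_two)     # a+b+c-d
```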


2 Comments

print 'this works:', re.sub('START.*END', 'NEWTEXT', text, 0, re.DOTALL) #also works
@MarwanAlsabbagh Yes, I was just testing this over
