I need to replace each match with new text that includes the matched text itself. Something like this:

regex = u'barbar'
oldstring = u'BarBaR barbarian BarbaRONt'
pattern = re.compile(regex, re.UNICODE | re.DOTALL | re.IGNORECASE)
newstring = pattern.sub(.....)
print(newstring) # And here is what I want to see
>>> u'TEXT1BarBaRTEXT2 TEXT1barbarTEXT2ian TEXT1BarbaRTEXT2ONt'

So I want to get back my original text, with every substring that matches 'barbar' (ignoring case) wrapped in two words, TEXT1 and TEXT2. The return value must be a unicode string. How can I do this? Thanks!

1 Answer

You can use a capturing group for that:

regex = u'(barbar)'
...
pattern.sub('TEXT1\\1TEXT2', oldstring)
# => u'TEXT1BarBaRTEXT2 TEXT1barbarTEXT2ian TEXT1BarbaRTEXT2ONt'

Wrapping barbar in parentheses makes the regex capture whatever part of the string matches it into a group. Since it's the first (and only) capturing group, you can refer to it as \1 anywhere in the replacement string or in the regex itself. The group holds the text exactly as it appeared in the input, so the original casing is preserved even with re.IGNORECASE.

For more details, see the (...) and \number sections in the re module docs.
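
Putting the pieces together, here is a minimal runnable sketch of the approach above. It reuses the names from the question; the flags are copied from there too, minus re.DOTALL, which has no effect on a pattern without a dot. The u'' prefixes are kept for Python 2 and are also accepted by Python 3.3+.

```python
import re

# Parentheses create capturing group 1 around the text to be wrapped.
regex = u'(barbar)'
oldstring = u'BarBaR barbarian BarbaRONt'
pattern = re.compile(regex, re.UNICODE | re.IGNORECASE)

# \1 in the replacement is replaced by whatever group 1 matched,
# with its original casing intact.
newstring = pattern.sub(u'TEXT1\\1TEXT2', oldstring)
print(newstring)
# TEXT1BarBaRTEXT2 TEXT1barbarTEXT2ian TEXT1BarbaRTEXT2ONt
```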

By the way, if you don't like escaping the backslash before the group number, you can use a raw string instead:

pattern.sub(r'TEXT1\1TEXT2', oldstring)
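
If you would rather leave the original regex without parentheses, two alternatives should give the same result: \g<0> in the replacement string refers to the entire match, and sub() also accepts a function that receives the match object. This is only a sketch; the helper name wrap is made up for illustration.

```python
import re

# No capturing group this time: the pattern is exactly as in the question.
pattern = re.compile(u'barbar', re.UNICODE | re.IGNORECASE)
oldstring = u'BarBaR barbarian BarbaRONt'

# Option 1: \g<0> stands for the whole matched substring.
with_g0 = pattern.sub(r'TEXT1\g<0>TEXT2', oldstring)

# Option 2: a callable replacement (hypothetical helper name).
def wrap(match):
    # match.group(0) is the matched text with its original casing.
    return u'TEXT1' + match.group(0) + u'TEXT2'

with_func = pattern.sub(wrap, oldstring)

print(with_g0)
print(with_func)
```

The callable form is handy when the surrounding text depends on the match itself, for example when different matches need different wrappers.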