3

I have text that looks like this:

Added "a-foo-b" foo.

The goal is to use a regular expression to replace the second foo with bar and leave the first foo, the one between the double quotes, untouched. For the text above, the result I am looking for is:

Added "a-foo-b" bar.

Thanks

2
  • re.sub(r'foo', 'bar', s) # This replaces every foo in the text with bar Commented Jun 29, 2012 at 8:15
  • Can you rely on the full stop terminator being there? Commented Jun 29, 2012 at 8:16

3 Answers

3
import re

pat = re.compile(r'("[^"]+".*)foo')

s = '''Added "a-foo-b" foo.'''

s_new = re.sub(pat, r'\1bar', s)
print(s_new)

Since you said the goal is to leave the one in double quotes alone, I focused on the double quotes as the key. The parentheses form a "match group" that saves the matched string; this match group matches the double-quotes and what is inside them, and then the pattern matches the second foo. The replacement pattern will replace everything we matched, but that's okay because we use a \1 to put back the match group part, and then we have bar to replace that second foo.
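To make the match group concrete, here is a small sketch that inspects what the pattern above actually captures on the sample string. The variable name m is only for illustration.

```python
import re

pat = re.compile(r'("[^"]+".*)foo')
s = '''Added "a-foo-b" foo.'''

m = pat.search(s)

# group(0) is the whole match: the quoted part, the space, and the second foo.
print(repr(m.group(0)))  # '"a-foo-b" foo'

# group(1) is only what the parentheses captured; \1 puts this back.
print(repr(m.group(1)))  # '"a-foo-b" '
```

Because the replacement is \1 followed by bar, everything in group(1) is written back unchanged and only the trailing foo is swapped out.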

If you know there cannot be any more double-quotes after the foo you want to replace, this might be a better pattern:

pat = re.compile(r'(".*".*)foo')

This pattern matches a double-quote, then anything, then another double-quote. The first pattern won't work if the quoted string includes an escaped double-quote, but this one would. But if you use this pattern on this string:

s = '''Added "a-foo-b" foo.  "Wow, another foo"'''

The match group would match past the second foo and would match the third foo, even though it is in quotes. This is because the pattern match is "greedy".
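The greedy behaviour can be checked directly. The sketch below also tries a non-greedy variant (.*? instead of .*) of the first pattern; that variant is my own suggestion rather than part of the answer above, and it assumes the quoted strings contain no escaped double-quotes.

```python
import re

s = '''Added "a-foo-b" foo.  "Wow, another foo"'''

# Greedy: ".*" stretches as far as it can, so the foo that gets
# replaced is the last one, inside the second quoted string.
greedy = re.compile(r'(".*".*)foo')
greedy_result = greedy.sub(r'\1bar', s)
print(greedy_result)  # Added "a-foo-b" foo.  "Wow, another bar"

# Non-greedy: .*? stops at the first foo after the closing quote.
lazy = re.compile(r'("[^"]+".*?)foo')
lazy_result = lazy.sub(r'\1bar', s)
print(lazy_result)    # Added "a-foo-b" bar.  "Wow, another foo"
```

Note that the first pattern, ("[^"]+".*)foo, also ends in a greedy .* and so would pick the last foo on this string as well; the non-greedy quantifier is what changes the outcome here.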

EDIT:

Question: Yeah, what if s = '''Added "a-foo-b" foo.Deleted "a-foo-b".'''

Answer: If the text always follows this format, and you know there won't be an escaped double-quote inside the double-quotes, you can use the first pattern. Then you can apply multiple patterns to detect and/or replace whatever you want. pat_added below solves the problem we wanted to solve before; it anchors on the Added part of the string, so it won't do anything to the Deleted part. If you did want to match and replace part of the string inside the quotes, pat_deleted shows how to do it; it has three match groups and puts back the first and last one, which lets you replace the middle one. Strictly speaking, the middle one doesn't need to be a match group; we could leave the part we are replacing outside any group, as we did with the first pattern.

import re
pat_added = re.compile(r'(Added\s+"[^"]+"\s+)\w+')
pat_deleted = re.compile(r'(Deleted\s+"[a-z]-)([^-]+)(-[a-z]"\.)')

s = '''Added "a-foo-b" foo.Deleted "a-foo-b".'''
s = re.sub(pat_added, r'\1bar', s)
s = re.sub(pat_deleted, r'\1bar\3', s)
print(s)
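If the aim is instead to skip every foo inside quotes, wherever it appears, a different technique from the ones above may be simpler: match either a whole quoted string or a bare foo, and let a replacement function decide what to do. This is only a sketch; the names token and replace_unquoted are made up for illustration, and it assumes quotes are balanced and never escaped.

```python
import re

# Quoted strings are tried first, so a foo inside quotes is consumed
# as part of the quoted string and never matched on its own.
token = re.compile(r'"[^"]*"|foo')

def replace_unquoted(match):
    text = match.group(0)
    # Put quoted strings back unchanged; replace a bare foo.
    return text if text.startswith('"') else 'bar'

s = '''Added "a-foo-b" foo.Deleted "a-foo-b".'''
result = token.sub(replace_unquoted, s)
print(result)  # Added "a-foo-b" bar.Deleted "a-foo-b".
```

The order of the alternatives matters: with foo listed first, the regex engine would match the foo inside the quotes before it ever tried the quoted-string branch at that position.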

2 Comments

Not bad, but what if you want to skip all foo-s in quotes?
Yeah, what if s = '''Added "a-foo-b" foo.Deleted "a-foo-b".'''
0

If your text always ends with a dot, you can try something like:

echo 'Added "a-foo-b" foo.' | sed 's/foo\.$/bar./'

Added "a-foo-b" bar.
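Since the question is about Python, the same idea (anchor on a foo followed by the final dot) could be sketched with re.sub as below. It assumes, like the sed command, that the text always ends with "foo." and nothing after it.

```python
import re

s = 'Added "a-foo-b" foo.'

# \. matches the literal full stop and $ anchors at the end of the string;
# the dot is written back in the replacement so it is not lost.
result = re.sub(r'foo\.$', 'bar.', s)
print(result)  # Added "a-foo-b" bar.
```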

2 Comments

This is not python :-) and let the OP show some effort before posting an answer...
OK, I just thought that seeing the regex would help. :-)
0

An approach with string methods.

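>>> s = 'Added "a-foo-b" foo test'
>>> needle = 'foo'
>>> replacement = 'bar'
>>> rind = s.rfind(needle)  # index of the last occurrence, or -1 if absent
>>> if rind != -1:
...     s = s[:rind] + replacement + s[rind + len(needle):]
...
>>> s
'Added "a-foo-b" bar test'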
