2

I am trying to find && using regex and substitute it with and using Python. Here is my regex:

r"(?=( && ))"

test input: x&& &&& && && x || | ||\|| x, expected output: x&& &&& and and x or | ||\|| x. My Python code:

import re

input = "x&& &&& && && x || | ||\|| x"
result = re.sub(r"(?=( && ))", " and ", input)
print(result)

My output is: x&& &&& and && and && x || | ||\|| x. This actually works, but instead of substitution it leaves the original pattern just adds my substitution string when it finds pattern. This is really confusing.

3
  • re.sub(r"(?<!\S)&&(?!\S)", "and", text)? You only have a trailing space that overlaps. Commented Jun 7, 2020 at 20:43
  • overlap can be control by lookahead, but yuo can limit the overlap to a single end char i.e [ ]&&(?=[ ]) replace ` and` this way endchar can be reuse (overlap) adjacent ` &&` if needed. of course you can get fancy and im so great stuiff afnd do assertunz back forward and upside down, but that is up to you Commented Jun 7, 2020 at 20:53
  • hundreds of overlapping regex examples on this site, did look a little, yes ? Commented Jun 9, 2020 at 21:45

2 Answers 2

2

A capturing group inside a positive lookahead can be used to extract overlapping patterns. To replace them, you actually need to consume the text, lookarounds do not consume the text they match.

In this case, you only have an overlapping trailing space, so you might use either of the two approaches:

text = re.sub(r"( )&&(?= )", r"\1and", text)

Or, if you need to replace any && that is neither preceded nor followed with any whitespace char, use

text = re.sub(r"(?<!\S)&&(?!\S)", r"and", text)

Note that input is a Python builtin, you should name your text variable in a different way, say, text.

Sign up to request clarification or add additional context in comments.

2 Comments

Thanks a lot for quick reply. Just to clarify: re.sub(r"( )&&(?= )" means && should be followed by whitespace in order to match our pattern, but we are not going to consume this whitespace when substitution is done. How r"\1and" works? I assume ' &&' needs to be substituted by "and" right?
@AmacOS ( ) is a capturing group, \1 backreference refers to this group value. If that is a regular space, you surely can just use text = re.sub(r" &&(?= )", r" and", text). I provided a backreference solution in case you want to use \s instead of a literal space to match any whitespace.
0

Your output is consistent with your statement of the problem (sic):

I am trying to find " && " and substitute it with " and "

Since the output is not what you want there must be a problem with your statement of the problem. I believe you actually want:

I am trying to find " &&" followed by a space and substitute it with " and"

Once you get that right it's just a matter of translating it to a regular expression:

import re

s = 'x&& &&& && && x || | ||\|| x'
print re.sub(r' &&(?= )', ' and', s)
  #=> x&& &&& and and x || | ||\|| x

Demo

(?= ) is a positive lookahead that asserts that the match is followed by a space.

1 Comment

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.