I'm writing a regex to find code like this:

if condition:True:
    statements

Here the first colon was typed instead of ==. I've come up with this so far: r"(if|elif) .* : .* :\n\t"
But I want to select only the first : and replace it with ==. The problem is that I can't find a way to substitute that : without inadvertently either (a) replacing the whole if condition with ==, by writing

re.sub(r"(if|elif) .* : .* :\n\t","==",text)

or (b) replacing every colon in my script with ==, which will cause an error like

NameError: Expected a ':', received '=='

So, is there a way to substitute only part of the match with ==, or is there another way to do this that I overlooked?
What should happen:
Input:

if 3+4:7:
    print("Three plus four is actually seven")
elif 3+6:10:
    print("3+4 is not 7, yet 3+6=10")


Console:
Three plus four is actually seven

This is Python 3 by the way.

1 Answer

Try capturing everything from if or elif up to the second-to-last : in one group, and everything after that colon up to and including the last : in another group. You can then substitute the first group, followed by ==, followed by the second group:

re.sub('((?:if|elif)[^:]*):([^:]*:)', r'\1 == \2', s)
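To check this against the input from the question, here is a minimal sketch. The names PATTERN, source and fixed are only illustrative, and the input is assumed to be held in a single string.

```python
import re

# Group 1: the keyword plus everything up to the first colon.
# Group 2: everything after that colon, up to and including the closing colon.
PATTERN = re.compile(r'((?:if|elif)[^:]*):([^:]*:)')

# The sample input from the question, as one string.
source = (
    'if 3+4:7:\n'
    '    print("Three plus four is actually seven")\n'
    'elif 3+6:10:\n'
    '    print("3+4 is not 7, yet 3+6=10")\n'
)

# \1 and \2 put the captured text back, so only the first colon is replaced.
fixed = PATTERN.sub(r'\1 == \2', source)
print(fixed)
# if 3+4 == 7:
#     print("Three plus four is actually seven")
# elif 3+6 == 10:
#     print("3+4 is not 7, yet 3+6=10")
```

Because the pattern is not anchored, it would also fire on an "if" that appears inside another word or a string literal, so it is probably safest on input shaped like the example.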

11 Comments

Only improvement I can think of is to do [^:]* instead of .*, so it doesn't match more colons than it should in case some line should happen to have 3 (for some other odd reason). Generally one should try to avoid using .* if it's possible.
@melwil. I thought of doing that, but then it would wrongly stop at the colon in case the expression contains colons.
If the expression contains more colons than the example, there's no guarantee this regex would substitute that correctly anyway. My point is that this regex can actually ignore what little pattern you actually have to match on. The negated class brings in another wrinkle, though: it matches newlines, so you'd need to turn on multiline and match the end of the line with $. Your original regex would probably do just fine.
re.sub('((?:if|elif)[^:]*):([^:]*:)', '\\1 == \\2', text): the last : must be inside the group.
No need for the anchor and flag unless you use the negated colon class [^:] instead of the dot. The dot does not match newlines without the DOTALL (s) flag.
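
The newline point in these comments can be seen with a small sketch. The input text is a made-up example, and the strict pattern is one possible tightening that assumes one statement per line; it is not something proposed in the thread.

```python
import re

# Hypothetical input: the "if" line has a single colon, a later line has another.
text = 'if x:\n    y = d[1:2]\n'

negated = r'((?:if|elif)[^:]*):([^:]*:)'
dotted = r'((?:if|elif).*):(.*:)'

# [^:] also matches "\n", so the match runs past the end of the "if" line
# and a correct line gets rewritten.
print(repr(re.sub(negated, r'\1 == \2', text)))  # 'if x == \n    y = d[1:2]\n'

# "." stops at the newline, so the single-colon line is left alone.
print(repr(re.sub(dotted, r'\1 == \2', text)))   # 'if x:\n    y = d[1:2]\n'

# Assumed stricter variant: keep "\n" out of the negated class and anchor
# every match to one line with re.MULTILINE.
strict = re.compile(r'^([ \t]*(?:if|elif)\b[^:\n]*):([^:\n]*:)[ \t]*$', re.MULTILINE)
print(repr(strict.sub(r'\1 == \2', text)))       # 'if x:\n    y = d[1:2]\n'
print(repr(strict.sub(r'\1 == \2', 'if 3+4:7:\n    pass\n')))  # 'if 3+4 == 7:\n    pass\n'
```

The strict variant still cannot tell a stray colon from one inside a slice or dict literal on the if line itself, so it only narrows the problem.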