I'm using a regexp to find and replace a variable in a Python expression stored as a string. I don't want 'var' to be replaced when it is part of another function or variable name. (I have ruled out the approach of testing 'var' in expr and then calling expr.replace("var", etc.).)

So I check the allowed preceding characters and the allowed following characters with this regexp:

pattern = re.compile(r'(^var)(?=\+|\-|\*|\/| |$)|(?<=\+|\=|\[|\-|\*|\/|,| |\()var(?=\+|\-|\*|\/|$| |,|\))')

ô_O It looks complicated, but it works on the following test, which replaces 'var' with '###':

expr     = ' var + variable + avar + var[x] + fun(var,variable) + fun2(variable, var, var1) + fun3(variable,var)+ var  -var/var+var*var*(var)'
expected = ' ### + variable + avar + var[x] + fun(###,variable) + fun2(variable, ###, var1) + fun3(variable,###)+ ###  -###/###+###*###*(###)'

I use the regexp:

  • to check whether 'var' is in the expression
  • to replace 'var' in the expression
  • to check that 'var' is no longer in the expression after the replacement.
if pattern.search(expr):
  new_expr = re.sub(pattern, '###', expr)  
  assert not pattern.search(new_expr), 'Replace failed'

I run this code many times, and I'm wondering whether something simpler or faster exists.

  • regex is not your tool of choice for parsing programming languages. Look into the ast module. Commented Oct 21, 2013 at 13:26
  • It seems odd that you are not replacing var in var[x] - is that intentional, or a typo? Commented Oct 21, 2013 at 13:31
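
The ast suggestion in the first comment can be sketched roughly as follows. This is only an illustration, not code from the thread: the names RenameName and rename_variable are hypothetical, it assumes the string is a single valid Python expression, and it relies on ast.unparse, which needs Python 3.9 or later. Note that it renames every occurrence of the name, including var[x], and that the output is re-formatted by the unparser.

```python
import ast


class RenameName(ast.NodeTransformer):
    """Rename every ast.Name node whose id equals `old` (hypothetical helper)."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        # Only identifiers that are exactly `old` are touched, so
        # 'variable', 'avar' and 'var1' are left alone.
        if node.id == self.old:
            return ast.copy_location(ast.Name(id=self.new, ctx=node.ctx), node)
        return node


def rename_variable(expr, old, new):
    """Parse `expr` as one expression, rename `old` to `new`, return source text."""
    tree = ast.parse(expr.strip(), mode='eval')
    tree = RenameName(old, new).visit(tree)
    return ast.unparse(tree)  # requires Python 3.9+


print(rename_variable('var + variable + fun(var, var1)', 'var', 'new_var'))
```

Because the replacement has to be a valid identifier for the result to stay parseable, a placeholder such as '###' is better handled with the regexp approach.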

1 Answer

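Well, the pattern you need is r'\bvar\b'. The \b is a word boundary: it matches the position between a word character (letter, digit or underscore) and a non-word character or the start or end of the string. That lets us match the whole word 'var' without replacing things like "variable".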

However, upon testing your "expected" string, I found it had a mistake in it:

expected = ' ### + variable + avar + var[x] ' # <- this last 'var' should be ###

Anyway. Solution:

>>> import re
>>> re.sub(r'\bvar\b', '###', expr)
' ### + variable + avar + ###[x] + fun(###,variable) + fun2(variable, ###, var1) + fun3(variable,###)+ ###  -###/###+###*###*(###)'
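
If var[x] really has to be left untouched, as in the question's expected string, one option is to add a negative lookahead to the word-boundary pattern. The following is a minimal sketch under that assumption; the helper name replace_name and its skip_subscript flag are hypothetical, not something from the question.

```python
import re


def replace_name(expr, name, replacement, skip_subscript=False):
    """Replace whole-word occurrences of `name` in `expr` (hypothetical helper)."""
    # re.escape guards against names containing regexp metacharacters.
    pattern = r'\b' + re.escape(name) + r'\b'
    if skip_subscript:
        # Negative lookahead: do not match when '[' follows immediately.
        pattern += r'(?!\[)'
    # A function replacement avoids backslash processing in `replacement`.
    return re.sub(pattern, lambda match: replacement, expr)


expr = ' var + variable + avar + var[x] + fun(var,variable) + fun2(variable, var, var1) + fun3(variable,var)+ var  -var/var+var*var*(var)'
print(replace_name(expr, 'var', '###', skip_subscript=True))
```

The lookahead only checks the very next character, so 'var [x]' with a space before the bracket would still be replaced.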

5 Comments

It's not a mistake: I don't want var to be replaced when it is used as a dict (var[x]). So it looks like a word boundary with an exception.
I am not sure what you are saying, but valid code will not handle two variables with the same name in a namespace. var is var is var.
Did I write a complex regexp to match a dumb expression? T_T I'll check my code and give you the answer.
Well, you're right, it's not a real situation. A better way to test a regexp is to use a list of expressions and a list of expected results. The solution with '\b' works.
I wrote this sample so that it would not match the following situation: avar[x] => a###[x], or even var[x], but var[x] will never exist in my code. '\b' is enough. Thanks.
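
The idea from the comments of testing against lists of expressions and expected results could look something like this. It is a sketch only: the cases and the names CASES and check_cases are made up for illustration.

```python
import re

PATTERN = re.compile(r'\bvar\b')

# (expression, expected result) pairs; illustrative cases only.
CASES = [
    ('var + 1', '### + 1'),
    ('variable + avar', 'variable + avar'),
    ('avar[x]', 'avar[x]'),
    ('fun(var,var1)', 'fun(###,var1)'),
    ('-var/var', '-###/###'),
]


def check_cases(cases):
    """Return a list of (expr, expected, actual) for every failing case."""
    failures = []
    for expr, expected in cases:
        actual = PATTERN.sub('###', expr)
        if actual != expected:
            failures.append((expr, expected, actual))
    return failures


failures = check_cases(CASES)
print(failures)
```

Keeping each case small makes it obvious which rule broke when a pattern change introduces a regression.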
