I am trying to replace a regular-expression match with a modified version of the matched text. This is the column in my DataFrame:

    df['newcolumn']
    0    Ther was a quick brown appl_product_type in ("eds") where blah blan appl_Cust_type =("value","value")
    1    Ther was a quick brown appl_product_type = ("EDS") where blah blan appl_Cust_type =("value","value") 
    2    Ther was a quick brown appl_product_type in ("eds") where blah b                                     
    3    Ther was a quick brown appl_product_type in = ("EDS") where blah blan appl_Cust_type = ("value")     
    4    Ther was a quick brown  where blah blan appl_Cust_type                                               
    Name: newcolumn, dtype: object

I want to replace every occurrence of a string like "appl_product_type = ('EDS')" with "upper(appl_product_type) = ('EDS')".

I am using the following code, but I get an error:

    newcolumn.replace(value='upper\[\w]+\s+[in=]+[\s+\([\"\w+\,+\s+]+\)', regex='[\w]+\s+[in=]+[\s+\([\"\w+\,+\s+]+\)')
    error: bad escape \w at position 7

Is there a way to solve this?

  • Why are you using \w in your replacement? You should use the matched group instead. Commented Feb 10, 2020 at 4:18

1 Answer

A few things:

  • You can't use \w in the replacement value and expect it to know what to fill in.
  • Your regex is badly formed as written. Use raw strings (r'') to make regex strings simpler.
  • Your question is unclear: you ask about one specific format, while your regex attempts to catch a lot more.
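To illustrate the first two points, here is a minimal sketch using Python's standard `re` module, which is where the "bad escape" message comes from. The sample text and the simplified pattern are assumptions made up for this illustration, not the asker's real data or regex.

```python
import re

# Hypothetical sample text and a deliberately simplified pattern (raw string).
text = 'brown appl_product_type = ("EDS") where'
pattern = r'(\w+\s*=\s*\("\w+"\))'

# A class shorthand such as \w means nothing in a replacement template,
# so re rejects it instead of guessing what to insert.
try:
    re.sub(pattern, r'upper\w', text)
    error_message = None
except re.error as exc:
    error_message = str(exc)

# A backreference (\1) is valid: it inserts whatever group 1 matched.
wrapped = re.sub(pattern, r'upper(\1)', text)

print(error_message)
print(wrapped)
```

The first call fails in the same way as the code in the question, while the second succeeds because the replacement only refers back to text that was actually captured.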

I have a slightly clearer solution to what you attempted, but I am unsure whether it is exactly what you want, given the ambiguity in your question:

    df['newcolumn'] = df['newcolumn'].replace({r'([\w_]+\s+(?:in|=|\s)+\(\"(?:\w+\"(?:\,)?(?:\s+)?)+\))' : r'upper(\1)'}, regex=True)
2 Comments

Thank you so much for the answer. It worked as expected. I just don't understand what \1 in the replacement is doing. I have edited the question: I only want the regex to match the entire string and replace it with upper(appl_product_type) = ("eds").
@deepakkumar That's not an issue. In a regex replace you can use \1, \2, and so on to substitute the text matched by the corresponding capture group in your regex. Since I wrapped the whole regex in () and that was the only capturing group, \1 stands in for the entire match. I used non-capturing groups (?:...) everywhere else to avoid any confusion between groups.
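As a rough sketch of how \1 and \2 behave, the snippet below uses two capture groups so that only the column name ends up inside upper(), which is the output requested in the comment above. The hard-coded column name, the simplified value-list pattern, and the sample rows are all assumptions for illustration; it uses the standard `re` module, and the same pattern and replacement strings should carry over to a pandas replace with regex=True.

```python
import re

# Group 1 captures the column name; group 2 captures the operator and value list.
# Assumption: only "in" or "=" followed by a parenthesised list needs handling.
pattern = re.compile(r'(appl_product_type)(\s*(?:in|=)\s*\([^)]*\))')

rows = [
    'brown appl_product_type in ("eds") where blah',
    'brown appl_product_type = ("EDS") where appl_Cust_type =("value","value")',
    'brown where blah blan appl_Cust_type',
]

# \1 re-inserts the column name inside upper(); \2 re-inserts the rest unchanged.
rewritten = [pattern.sub(r'upper(\1)\2', row) for row in rows]

for row in rewritten:
    print(row)
```

Rows that contain no appl_product_type condition pass through untouched, and the non-capturing group (?:in|=) does not consume a group number, so \2 still refers to the second pair of capturing parentheses.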
