Using a regex stored as a variable in Python [closed]

Question

Closed. This question needs debugging details. It is not currently accepting answers.

Closed 2 years ago.

I read regexes and their replacements from a CSV into a dictionary and then run that over a column in a Dataframe looking for locations:

for regex, replacement in regex_replace.items():
    df["A"] = df["a"].str.replace(regex, replacement)

This works fine and successfully replaces the text. An example regex would be:

(?i)\b(maine)

However, I also want to capture the text that has been replaced from the regex match. I've tried this:

def find_match(regex, x):
    j = re.findall(r'{0}'.format(regex), x)
    return ",".join(j)

df['matches'] = df['A'].apply(lambda x: find_match(regex,str(x)))

But that doesn't find any matches - I think it's because the backslash is escaped. If I declared the regex variable as a raw string in the code, then this would work:

regex = r'(?i)\b(maine)'

However, I can't do that as it's already stored in a variable. Is there a way to do this?

Related answers are: "regex re.search is not returning the match" and "Python Regex in Variable".

I don't see how the first version works correctly. First, you're missing a ] after df["a". But more importantly, you're assigning the result to a different column than the source. So each time through the loop it processes the original source column, discarding the replacements from the previous iterations. You need to assign back to the same column.
– Barmar, Commented Aug 31, 2023 at 16:19
Please show an example of regex_replace and the dataframe.
– Barmar, Commented Aug 31, 2023 at 16:24
Is the difference between df["A"] and df["a"] intentional?
– Barmar, Commented Sep 1, 2023 at 14:40
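
The point about assigning back to the same column can be sketched as follows. This is only an illustration under assumptions: the sample patterns and rows are made up (the real regex_replace comes from the asker's CSV, which was never shown), and collecting the matched text before each replacement is one possible way to get a "matches" column, not something the thread confirms.

```python
import pandas as pd

# Hypothetical sample data; the asker's real dictionary and DataFrame are unknown.
regex_replace = {r"(?i)\b(maine)": "ME", r"(?i)\b(texas)": "TX"}
df = pd.DataFrame({"a": ["Portland, Maine", "Austin, texas", "Paris"]})

df["A"] = df["a"]      # working copy of the source column
df["matches"] = ""     # matched text is accumulated here

for regex, replacement in regex_replace.items():
    # Capture from the text before this pattern is replaced;
    # once replaced, the original text is no longer there to find.
    found = df["A"].str.findall(regex).str.join(",")
    df["matches"] = (df["matches"] + "," + found).str.strip(",")
    # Read from and assign to the same column so earlier replacements survive.
    df["A"] = df["A"].str.replace(regex, replacement, regex=True)

print(df)
```

Passing regex=True explicitly avoids depending on the default of Series.str.replace, which has differed between pandas versions.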

Cow · Accepted Answer · 2023-09-01 11:34:26Z

-2

One can use an f-string for that.

def find_match(regex, x):
    j = re.findall(rf'{regex}', x)
    return ",".join(j)

edited Sep 1, 2023 at 11:34

Cow

answered Aug 31, 2023 at 18:02

LetzerWille

3 Comments

user2357112 Over a year ago

rf'{regex}' just evaluates to a string exactly equal to regex.

Barmar Over a year ago

@Tomp If this worked then you didn't actually have a problem in the first place.

Barmar Over a year ago

rf'{regex}' is also the same as r'{0}'.format(regex) in the OP's code. The r doesn't do anything in either case, since there are no escape sequences in the format string (it doesn't apply after substitution of the variable).
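
The two comments above can be checked with a short sketch. The CSV content here is a made-up stand-in for the asker's file (an assumption), read through io.StringIO so the example is self-contained. It suggests that a pattern read from a file already holds a single backslash, and that wrapping it in rf'...' or r'{0}'.format(...) returns the same string.

```python
import csv
import io
import re

# Hypothetical CSV content: pattern,replacement.
# The Python literal below doubles the backslash; the "file" itself holds just one.
csv_text = "(?i)\\b(maine),ME\n"

regex, replacement = next(csv.reader(io.StringIO(csv_text)))

# The r prefix only affects how a literal in source code is parsed.
# It does nothing to a value substituted into the template afterwards.
same_as_f = rf"{regex}" == regex
same_as_format = r"{0}".format(regex) == regex
same_as_literal = regex == r"(?i)\b(maine)"

# The variable can be passed to re directly, with no conversion step.
matches = re.findall(regex, "Moved from MAINE to Texas")

print(same_as_f, same_as_format, same_as_literal, matches)
```

If a pattern loaded this way finds nothing, the cause is more likely the text being searched (for example, a column where the match has already been replaced) than the backslash.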
