How to use a pandas dataframe field to regex-replace text in another field in Python?

Question

I'd like to find text in one column of a pandas DataFrame ("text") using the regex patterns stored in another column ("words").

#import re
import pandas as pd
df = pd.DataFrame([['I like apple pie','apple'],['Nice banana and lemon','banana|lemon']], columns=['text','words'])
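# My attempt, which does not run: group(0) is undefined and df['words'].str is an accessor, not a string
df['text'] = df['text'].str.replace(r''+df['words'].str, '*'+group(0)+'*')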
df

I'd like to mark the found words with *.
How can I do that?

The desired output is:
I like *apple* pie
Nice *banana* and *lemon*

Paolo · Accepted Answer · 2018-08-22 13:06:56Z

You can wrap each pattern from words in a capturing group and use the backreference \1 in the substitution to surround the match with *:

import re
import pandas as pd
df = pd.DataFrame([['I like apple pie','apple'],['Nice banana and     lemon','banana|lemon']], columns=['text','words'])

df['text'] = df['text'].replace(r'('+df['words']+')', r'*\1*', regex=True)
print(df)

Prints:

                            text         words
0             I like *apple* pie         apple
1  Nice *banana* and     *lemon*  banana|lemon
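If each row should be matched only against its own pattern, one way to make that explicit is to call re.sub once per row. This is a minimal sketch rather than part of the answer above; the helper name mark_words and the output column marked are hypothetical.

```python
import re
import pandas as pd

df = pd.DataFrame(
    [["I like apple pie", "apple"], ["Nice banana and lemon", "banana|lemon"]],
    columns=["text", "words"],
)

def mark_words(text, pattern):
    # Hypothetical helper: the non-capturing group keeps the alternation
    # together, and \g<0> in the replacement stands for the whole match.
    return re.sub(f"(?:{pattern})", r"*\g<0>*", text)

# Pair every text with the pattern from the same row
df["marked"] = [mark_words(t, w) for t, w in zip(df["text"], df["words"])]
print(df["marked"].tolist())
```

Because the pattern and the text are paired with zip, a word listed in one row is never applied to another row's text.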

1 Comment

lmocsi Over a year ago

That's the syntax I was looking for! Thanks.

BENY · Accepted Answer · 2018-08-22 13:04:09Z

IIUC, using the inline flag (?i) is the same as passing re.I:

df.text.replace(regex=r'(?i)'+ df.words,value="*")
Out[131]: 
0        I like * pie
1    Nice * and     *
Name: text, dtype: object
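The equivalence between the inline flag and re.I can be checked with the standard re module alone. A small sketch, assuming a sample sentence with a capitalised word that is not in the original data:

```python
import re

text = "I like Apple pie"  # assumed sample; note the capital A

# Inline flag inside the pattern
inline = re.sub(r"(?i)apple", "*", text)
# Same pattern with the flag passed as an argument
flagged = re.sub(r"apple", "*", text, flags=re.I)

print(inline)
print(flagged)
```

Both calls match "Apple" regardless of case, so the two results are identical.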

Since you updated the question, here is a version that wraps each word in *:

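# Split each pattern on '|' into a list of individual words
df.words = df.words.str.split('|')
# Flatten the lists into a single Series of words
s = df.words.apply(pd.Series).stack()
# Map each word to its *-wrapped form and replace across the text column
df.text.replace(dict(zip(s, '*' + s + '*')), regex=True)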
Out[139]: 
0               I like *apple* pie
1    Nice *banana* and     *lemon*
Name: text, dtype: object
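If the entries in words are meant as literal words rather than regex fragments, the same split-on-'|' idea can be combined with re.escape and word boundaries. This is only a sketch under that assumption; wrap_literal_words is a hypothetical helper name.

```python
import re

def wrap_literal_words(text, words):
    # Treat each '|'-separated entry as a literal word, not a regex
    parts = [re.escape(w) for w in words.split("|")]
    # \b restricts matches to whole words
    pattern = r"\b(?:" + "|".join(parts) + r")\b"
    # Wrap every match in asterisks
    return re.sub(pattern, lambda m: "*" + m.group(0) + "*", text)

print(wrap_literal_words("Nice banana and lemon", "banana|lemon"))
print(wrap_literal_words("pineapple and apple", "apple"))
```

With the word boundaries in place, "apple" inside "pineapple" is left alone, which may or may not be what you want for your data.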

edited Aug 22, 2018 at 13:04

answered Aug 22, 2018 at 12:49

1 Comment

lmocsi Over a year ago

I'd like to wrap the found words in * characters. I posted the desired result in the original question.
