Efficient way to search string contains in multiple columns using pandas [duplicate]

Question

I have a pandas DataFrame like the one shown below:

import pandas as pd
import numpy as np
df=pd.DataFrame({'Adm DateTime':['02/25/2012','03/05/1996','11/12/2010','31/05/2012','21/07/2019','31/10/2020'],
                 's_id':[1,2,3,4,5,6],
                 'test_string_1':['test','Thalaivar','Superstar','God','Favorite','Rajinikanth'],
                 'test_string_2':['Rajinikanth','God of Cinema','Favorite','Superstar','Rahman','ARR']})
df['Adm DateTime'] = pd.to_datetime(df['Adm DateTime'])

I would like to check whether any of several substrings is present in either of two columns (test_string_1 and test_string_2).

I am able to do this for a single column, as shown below:

df['op_flag'] = np.where(df['test_string_1'].str.contains('Rajini|God|Thalaivar',case=False),1, 0)

How can I do this across both columns?

Should I repeat the above code with a different column name?

Is there a way to pass in the names of the columns that I would like to check?

Asish M. · Accepted Answer · 2021-01-03 10:00:49Z


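You can do this by applying str.contains to each selected column with a lambda function, then using any(axis=1) to flag the rows where at least one column matches: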

In [40]: df[['test_string_1', 'test_string_2']].apply(lambda x: x.str.contains('Rajini|God|Thalaivar',case=False)).any(axis=1).astype(int)
Out[40]:
0    1
1    1
2    0
3    1
4    0
5    1
dtype: int64
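To write the result back as the op_flag column and keep the column names configurable, the same expression can be wrapped in a small helper. This is only a sketch: the names flag_rows, cols_to_check and pattern are hypothetical and do not come from the original answer, and na=False is an added assumption so that missing values count as non-matches.

```python
import pandas as pd

df = pd.DataFrame({
    's_id': [1, 2, 3, 4, 5, 6],
    'test_string_1': ['test', 'Thalaivar', 'Superstar', 'God', 'Favorite', 'Rajinikanth'],
    'test_string_2': ['Rajinikanth', 'God of Cinema', 'Favorite', 'Superstar', 'Rahman', 'ARR'],
})

# Hypothetical names: the list of columns to search and the regex to look for.
cols_to_check = ['test_string_1', 'test_string_2']
pattern = 'Rajini|God|Thalaivar'


def flag_rows(frame, columns, pattern):
    """Return 1 for rows where any of the given columns matches the pattern, else 0."""
    matches = frame[columns].apply(
        # na=False (an assumption) treats missing values as "no match".
        lambda col: col.str.contains(pattern, case=False, na=False)
    )
    return matches.any(axis=1).astype(int)


df['op_flag'] = flag_rows(df, cols_to_check, pattern)
print(df['op_flag'].tolist())  # [1, 1, 0, 1, 0, 1]
```

Adding or removing a column from the search then only requires editing cols_to_check, so the str.contains call does not need to be repeated per column.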

