How to filter pandas rows based on whether column values are substrings of a given string?

Question

I have a dataframe df with a column A. There is a given string s.

I want to subset the dataframe to the rows whose values in A are substrings of the given string s. If I wanted it the other way around, I would have done something like df[df['A'].str.contains(s)]. But I want to keep the rows where s is a superstring of the value in df['A']. I have not been able to find an answer to this, so apologies in advance if a duplicate question exists.

Anna Iliukovich-Strakovskaia · Accepted Answer · 2020-09-29 08:36:02Z

2

Here is a small example:

import pandas as pd

s = 'sas'
df = pd.DataFrame(data={'A':['s', 'a', 'sa', 'da']})

The solution is:

filtered_df = df[[s.find(i)!=-1 for i in df['A'].values]]

It isn't an idiomatic pandas approach, but it works. I hope it helps.

Output:

    A
0   s
1   a
2  sa

Update: if you don't want to keep rows where the value is equal to s itself (s==i), you can modify the code like this:

df[[((s.find(i)!=-1) and (s!=i)) for i in df['A'].values]]
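For reference, here is a minimal end-to-end sketch of both variants above. The extra row 'sas' is not in the original sample; it is added here only so that the two filters give different results. It also assumes every value in A is a string: s.find(i) raises a TypeError for missing values such as NaN.

```python
import pandas as pd

s = 'sas'
# 'sas' is an extra row (an assumption for illustration) that equals s itself.
df = pd.DataFrame(data={'A': ['s', 'a', 'sa', 'da', 'sas']})

# Keep rows whose value in A occurs somewhere inside s.
is_substring = [s.find(i) != -1 for i in df['A'].values]
filtered_df = df[is_substring]

# Same test, but also drop values that are equal to s.
is_proper_substring = [(s.find(i) != -1) and (s != i) for i in df['A'].values]
strict_df = df[is_proper_substring]

print(filtered_df['A'].tolist())  # ['s', 'a', 'sa', 'sas']
print(strict_df['A'].tolist())    # ['s', 'a', 'sa']
```

Note that an empty string counts as a substring of any s, so a row with '' would pass both filters.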

edited Sep 29, 2020 at 8:36

answered Sep 29, 2020 at 7:54

Anna Iliukovich-Strakovskaia


1 Comment

Paper_Folding Over a year ago

Replacing s.find(i)!=-1 with i in s also works.
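A small sketch of that variant, with one addition that is not part of the comment: an isinstance check, so that missing values are skipped instead of raising a TypeError. The None row is a hypothetical example added for illustration.

```python
import pandas as pd

s = 'sas'
# The None entry is a made-up missing value, not part of the original sample.
df = pd.DataFrame(data={'A': ['s', 'a', None, 'da', 'sa']})

# "i in s" is the substring test; the isinstance guard skips non-string values.
mask = [isinstance(i, str) and i in s for i in df['A']]
filtered_df = df[mask]

print(filtered_df['A'].tolist())  # ['s', 'a', 'sa']
```

If the column is known to contain only strings, the guard can be dropped and the mask is just [i in s for i in df['A']].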

JoeCondron · Accepted Answer · 2020-09-29 08:04:44Z

1

I don't think you can do better than a list comprehension:

df[[x in s for x in df.A]]

However, if df.A contains many repeated values (e.g. it is a categorical), you could optimize it by testing each distinct value only once:

key, unique_vals = pd.factorize(df.A)
mask = np.asarray([x in s for x in unique_vals])
df.loc[mask[key]]
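To make the factorize idea concrete, here is a self-contained sketch on a made-up column with repeated values (the data is an assumption for illustration). It also checks that the result matches the plain list comprehension. It assumes there are no missing values in A: pd.factorize gives missing values the code -1, which would silently pick the last entry of the mask.

```python
import numpy as np
import pandas as pd

s = 'sas'
# Hypothetical sample with repeated values.
df = pd.DataFrame(data={'A': ['s', 'da', 's', 'sa', 'da', 's']})

# codes[i] is the position of row i's value in unique_vals.
codes, unique_vals = pd.factorize(df['A'])

# Run the substring test once per distinct value ...
mask = np.asarray([x in s for x in unique_vals])

# ... then map the per-value result back onto every row via the codes.
fast = df.loc[mask[codes]]

# Reference result from the plain list comprehension.
plain = df[[x in s for x in df['A']]]

print(fast['A'].tolist())  # ['s', 's', 'sa', 's']
print(fast.equals(plain))  # True
```

Whether this is actually faster depends on how many distinct values there are relative to the number of rows, so it is worth timing on your own data.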

Or, if df.A is already a categorical:

key, unique_vals = df.A.cat.codes, df.A.cat.categories
mask = np.asarray([x in s for x in unique_vals])
df.loc[mask[key]]
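A runnable sketch of the categorical case, again on made-up data. The only change from the snippet above is converting the codes to a NumPy array with to_numpy() before indexing, which is a stylistic choice rather than a requirement. As with factorize, it assumes no missing values, since those get the code -1.

```python
import numpy as np
import pandas as pd

s = 'sas'
# Hypothetical categorical column.
df = pd.DataFrame(data={'A': pd.Categorical(['s', 'da', 's', 'sa', 'da'])})

# Integer code of each row's category, and the distinct categories themselves.
codes = df['A'].cat.codes.to_numpy()
categories = df['A'].cat.categories

# Test each category once, then look the result up per row.
mask = np.asarray([x in s for x in categories])
filtered_df = df.loc[mask[codes]]

print(filtered_df['A'].tolist())  # ['s', 's', 'sa']
```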

answered Sep 29, 2020 at 8:04

JoeCondron
