I'm trying to match a column in a DataFrame to one of a list of substrings.

e.g. take the column (strings) with the following values:

text1C1
text2A
text2
text4
text4B
text4A3

And create a new column which has matched them to the following substrings:

vals = ['text1', 'text2', 'text3', 'text4', 'text4B']

The code I have at the moment works, but it seems like a really inefficient way of solving the problem.

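import pandas as pd

df = pd.DataFrame({'strings': ['text1C1', 'text2A', 'text2', 'text4', 'text4B', 'text4A3']})

# Assign each substring to the rows containing it; later values in vals overwrite earlier ones
for v in vals:
    df.loc[df[df['strings'].str.contains(v)].index, 'matched strings'] = v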

This returns the following DataFrame, which is what I need.

   strings    matched strings
0  text1C1              text1
1   text2A              text2
2    text2              text2
3    text4              text4
4   text4B             text4B
5  text4A3              text4

Is there a more efficient way of doing this, especially for larger DataFrames (10k+ rows)?

I can't think of how to deal with one item of vals also being a substring of another (text4 is a substring of text4B).

1 Answer

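Use a generator expression with next to return the first matching value. Reversing vals means text4B is tested before text4, so the longer substring wins: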

s = vals[::-1]
df['matched strings1'] = df['strings'].apply(lambda x: next(y for y in s if y in x))
print(df)
   strings matched strings matched strings1
0  text1C1           text1            text1
1   text2A           text2            text2
2    text2           text2            text2
3    text4           text4            text4
4   text4B          text4B           text4B
5  text4A3           text4            text4
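
The reversal above works only because text4B happens to come after text4 in vals. A minimal sketch that does not depend on the input order, assuming the goal is always the longest matching substring, sorts the candidates by length first. The helper name longest_match is hypothetical and not part of pandas:

```python
vals = ['text1', 'text2', 'text3', 'text4', 'text4B']

# Longest candidates first, so 'text4B' is tested before 'text4'.
by_length = sorted(vals, key=len, reverse=True)

def longest_match(text, candidates=by_length, default=None):
    """Return the first (and therefore longest) candidate contained in text."""
    return next((c for c in candidates if c in text), default)

strings = ['text1C1', 'text2A', 'text2', 'text4', 'text4B', 'text4A3']
matched = [longest_match(s) for s in strings]
print(matched)
```

On a DataFrame the same helper could be used as df['strings'].apply(longest_match).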

A more general solution, for when a string may match none of the values, passes a default as the second argument of next:

f = lambda x: next((y for y in s if y in x), 'no match')
df['matched strings1'] = df['strings'].apply(f)
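
For larger DataFrames, a vectorized alternative may be worth trying. The following is only a sketch, assuming the longest candidate should win: it builds one regular expression from vals (escaped with re.escape, longest first) and uses Series.str.extract. The column name 'matched strings2' and the variable names are illustrative. Note that a regex returns the leftmost match, so if two candidates occur at different positions in a string, the earlier position wins regardless of length; rows with no match get NaN.

```python
import re
import pandas as pd

df = pd.DataFrame({'strings': ['text1C1', 'text2A', 'text2', 'text4', 'text4B', 'text4A3']})
vals = ['text1', 'text2', 'text3', 'text4', 'text4B']

# Put longer alternatives first so 'text4B' is preferred over 'text4'
# when both could match at the same position.
by_length = sorted(vals, key=len, reverse=True)

# Escape each value so it is matched literally, then join into one capture group.
pattern = '(' + '|'.join(re.escape(v) for v in by_length) + ')'

# expand=False returns a Series because the pattern has a single group.
df['matched strings2'] = df['strings'].str.extract(pattern, expand=False)
print(df)
```

Whether this is faster than apply depends on the data, so it is worth timing both on a realistic sample.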

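Your own loop can also be improved: pass the boolean mask directly to df.loc, and set regex=False so each value is matched as a literal substring rather than a regular expression: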

for v in vals:
    df.loc[df['strings'].str.contains(v, regex=False), 'matched strings'] = v