Python - str.match for each string in a dataframe

Question

I'm trying to use str.match to match a phrase exactly, but against each word in each row's string. I want to return the index of the matching row, which is why I'm using str.match instead of a plain regex.

I want the index of the row that contains exactly 'FL', not 'FLORIDA'. The problem with str.contains, though, is that it also returns the index of the row with 'FLORIDA'.

import pandas as pd
data = [['Alex in FL','ten'],['Bob in FLORIDA','five'],['Will in GA','three']]
df = pd.DataFrame(data,columns=['Name','Age'])

df.index[df['Name'].str.contains('FL')]
df.index[df['Name'].str.match('FL')]

Here's what the dataframe looks like:

    Name             Age
0   Alex in FL       ten
1   Bob in FLORIDA   five
2   Will in GA       three

The output should be returning the index of row 0: Int64Index([0], dtype='int64')

Dani Mesejo · Accepted Answer · 2019-01-03 21:37:16Z

3

Use contains with word boundaries:

import pandas as pd

data = [['Alex in FL','ten'],['Bob in FLORIDA','five'],['Will in GA','three']]
df = pd.DataFrame(data,columns=['Name','Age'])

print(df.index[df['Name'].str.contains(r'\bFL\b')])

Output

Int64Index([0], dtype='int64')
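If the search term comes from a variable rather than a hard-coded literal, the same idea can be wrapped in a small helper. This is only a sketch: rows_with_word is a hypothetical name, not a pandas function, and it assumes the term begins and ends with a word character (letters, digits, underscore), since \b only marks a boundary next to one.

```python
import re
import pandas as pd

def rows_with_word(df, column, word):
    """Hypothetical helper: index labels of rows where `column` contains `word` as a whole word."""
    # re.escape keeps any regex metacharacters in `word` literal.
    pattern = r'\b' + re.escape(word) + r'\b'
    # na=False treats missing values as non-matches instead of propagating NaN.
    mask = df[column].str.contains(pattern, regex=True, na=False)
    return df.index[mask].tolist()

data = [['Alex in FL', 'ten'], ['Bob in FLORIDA', 'five'], ['Will in GA', 'three']]
df = pd.DataFrame(data, columns=['Name', 'Age'])

result = rows_with_word(df, 'Name', 'FL')
print(result)  # [0]
```

Returning a plain list via .tolist() sidesteps differences in how the index object is printed across pandas versions.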

Scott Boston · Accepted Answer · 2019-01-03 21:40:24Z

1

Try:

df[df.Name.str.contains(r'\bFL\b', regex=True)]

OR

df[['FL' in i for i in df.Name.str.split(r'\s')]]

Output:

         Name  Age
0  Alex in FL  ten

Andrew F · Accepted Answer · 2019-01-03 21:38:23Z

0

The docs say that str.contains checks whether the regex ("FL" in your case) matches anywhere in the string. Since "FLORIDA" contains that substring, it matches too.
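To make the difference concrete, here is a small sketch using only the standard-library re module. As I understand the pandas docs, str.match follows re.match semantics (anchored at the start of the string) and str.contains with a regex follows re.search semantics (anywhere in the string). The variable names are just for illustration.

```python
import re

names = ['Alex in FL', 'Bob in FLORIDA', 'Will in GA']

# re.match only tries the pattern at position 0, so 'FL' matches none of these.
anchored = [bool(re.match('FL', s)) for s in names]

# re.search scans the whole string, so 'FL' is also found inside 'FLORIDA'.
anywhere = [bool(re.search('FL', s)) for s in names]

# \b requires a word boundary on each side, which rules out 'FLORIDA'.
whole_word = [bool(re.search(r'\bFL\b', s)) for s in names]

print(anchored)    # [False, False, False]
print(anywhere)    # [True, True, False]
print(whole_word)  # [True, False, False]
```

This is also why str.match('FL') in the question returns an empty index: none of the names start with 'FL'.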

One way to do this would be to match " FL " (padded with spaces) instead, but you would also need to pad each value with spaces (for when "FL" is at the start or end of the string).
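A minimal sketch of that padding idea, assuming words are separated by single spaces only. Something like "FL," with trailing punctuation would be missed, which is where a word-boundary regex is more robust.

```python
import pandas as pd

data = [['Alex in FL', 'ten'], ['Bob in FLORIDA', 'five'], ['Will in GA', 'three']]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Pad every value so a word at the start or end also has a space on both sides.
padded = ' ' + df['Name'] + ' '

# regex=False makes ' FL ' a plain substring test rather than a pattern.
mask = padded.str.contains(' FL ', regex=False)

matches = df.index[mask].tolist()
print(matches)  # [0]
```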
