I'm trying to use str.match to match a phrase exactly, against each word in each row's string. I want to get back the index of the matching row, which is why I'm using str.match rather than a plain regex.

I want the index of the row that contains exactly 'FL', not 'FLORIDA'. The problem with str.contains, though, is that it also returns the index of the row with 'FLORIDA'.

import pandas as pd
data = [['Alex in FL','ten'],['Bob in FLORIDA','five'],['Will in GA','three']]
df = pd.DataFrame(data,columns=['Name','Age'])

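# Matches rows 0 and 1, because 'FLORIDA' also contains 'FL'
df.index[df['Name'].str.contains('FL')]
# Matches nothing, because str.match only matches at the start of each string
df.index[df['Name'].str.match('FL')]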

Here's what the dataframe looks like:

    Name             Age
0   Alex in FL       ten
1   Bob in FLORIDA   five
2   Will in GA       three

The output should be the index of row 0: Int64Index([0], dtype='int64')

3 Answers

3

Use contains with word boundaries:

import pandas as pd

data = [['Alex in FL','ten'],['Bob in FLORIDA','five'],['Will in GA','three']]
df = pd.DataFrame(data,columns=['Name','Age'])

print(df.index[df['Name'].str.contains(r'\bFL\b')])

Output

Int64Index([0], dtype='int64')
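To make the difference concrete, here is a small sketch comparing the three calls on the same frame. The helper `word_pattern` is a name I made up for illustration, not part of pandas; it runs the phrase through re.escape so any regex metacharacters in it are treated literally. It assumes the phrase starts and ends with a word character, since \b only marks a boundary next to one.

```python
import re
import pandas as pd

data = [['Alex in FL', 'ten'], ['Bob in FLORIDA', 'five'], ['Will in GA', 'three']]
df = pd.DataFrame(data, columns=['Name', 'Age'])

def word_pattern(phrase):
    # Hypothetical helper: wrap an escaped phrase in word boundaries.
    return r'\b' + re.escape(phrase) + r'\b'

# str.match anchors at the start of the string, so no row matches 'FL'.
match_idx = df.index[df['Name'].str.match('FL')].tolist()

# str.contains searches anywhere in the string, so 'FLORIDA' matches too.
contains_idx = df.index[df['Name'].str.contains('FL')].tolist()

# Word boundaries keep only rows where 'FL' stands alone as a word.
bounded_idx = df.index[df['Name'].str.contains(word_pattern('FL'))].tolist()

print(match_idx, contains_idx, bounded_idx)  # [] [0, 1] [0]
```

Because \b also treats punctuation as a boundary, a value such as 'Alex in FL, USA' would still match.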

1

Try:

df[df.Name.str.contains(r'\bFL\b', regex=True)]

OR

df[['FL' in i for i in df.Name.str.split(r'\s+')]]

Output:

         Name  Age
0  Alex in FL  ten

0

According to the docs, str.contains tests whether the given regular expression ("FL" in your case) occurs anywhere in the string. Since "FLORIDA" contains that substring, it matches.

One way around this is to search for " FL " (padded with spaces) instead, but you would also need to pad each value with spaces so that "FL" at the start or end of a string still matches.
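A minimal sketch of that padding idea, assuming the words in each value are separated by single spaces. The variable names `padded` and `padded_idx` are my own:

```python
import pandas as pd

data = [['Alex in FL', 'ten'], ['Bob in FLORIDA', 'five'], ['Will in GA', 'three']]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Pad every value with one space on each side so that 'FL' at the start
# or end of a string is still surrounded by spaces.
padded = ' ' + df['Name'] + ' '

# regex=False makes this a plain substring test for ' FL '.
padded_idx = df.index[padded.str.contains(' FL ', regex=False)].tolist()

print(padded_idx)  # [0]
```

Unlike the word-boundary pattern, this would miss 'FL' followed by punctuation (for example 'FL,'), so it only fits data that is cleanly space-separated.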
