I have two dataframes.

new1:
        Name    city
0    sri won     chn
1  pechi won    pune
2    Ram won     mum
0    pec won  kerala

new3:
   req
0  pec
1  mut

I tried,

mask=new1.Name.str.contains("|".join(new3.req.values.tolist()))
new1[mask]

I am getting,

 new1[mask]
      Name       city
 1  pechi won    pune
 0  pec won      kerala

Since "pechi" contains "pec", that row was picked up too. But I want an exact match between the values, not "contains".

My desired output is:

 new1[mask]
      Name       city
 0  pec won      kerala

2 Answers

1

You need \b, which matches a "word boundary":

a = r'\b(' + "|".join(new3.req.values.tolist()) + r')\b'
print (a)
\b(pec|mut)\b

mask=new1.Name.str.contains(a)
df = new1[mask]
print (df)
      Name    city
0  pec won  kerala
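
To unpack the first line: it joins the search terms into one regular expression of alternatives (pec|mut) and wraps the group in \b on both sides, so a term only matches as a whole word. Here is a minimal, self-contained sketch of the same idea that rebuilds the frames from the question. Two details are additions and not part of the answer above: re.escape, in case a term contains regex metacharacters, and a non-capturing group (?:...), which should avoid the match-groups warning pandas gives when str.contains receives a capturing group.

```python
import re

import pandas as pd

# Frames rebuilt from the question (note the duplicate index label 0).
new1 = pd.DataFrame(
    {
        "Name": ["sri won", "pechi won", "Ram won", "pec won"],
        "city": ["chn", "pune", "mum", "kerala"],
    },
    index=[0, 1, 2, 0],
)
new3 = pd.DataFrame({"req": ["pec", "mut"]})

# Assumption: terms should be matched literally, so escape characters
# such as "." or "+" before building the pattern.
terms = [re.escape(term) for term in new3["req"]]

# \b(?:pec|mut)\b matches any one term bounded by word boundaries.
pattern = r"\b(?:" + "|".join(terms) + r")\b"

mask = new1["Name"].str.contains(pattern)
result = new1[mask]

print(pattern)
print(result)
```

With the sample data, only the "pec won" row is kept, because "pechi" has no word boundary right after "pec".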

2 Comments

Wow, it worked perfectly! Could you please explain what the first line of code does?
You can check this for an explanation of word boundaries (my English is horrible, especially for deep explanations).
0

You need spaces in the separator:

In [1350]: new1
Out[1350]:
        Name    city
0    sri won     chn
1  pechi won    pune
2    Ram won     mum
0    pec won  kerala

In [1351]: new3
Out[1351]:
   req
0  pec
1  mut

In [1352]: ' | '.join(new3.req)
Out[1352]: 'pec | mut'

In [1353]: new1.Name.str.contains(' | '.join(new3.req))
Out[1353]:
0    False
1    False
2    False
0     True
Name: Name, dtype: bool

In [1354]: new1[new1.Name.str.contains(' | '.join(new3.req))]
Out[1354]:
      Name    city
0  pec won  kerala
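
One thing worth checking with this approach: ' | '.join(...) produces the alternatives 'pec ' and ' mut', so each term is only tied to a space on one side. The sketch below compares it with a \b pattern on a few hypothetical names that are not in the question, chosen only to probe the edges (a term at the end of the string, a term embedded in a longer word, and a term on its own).

```python
import pandas as pd

# Hypothetical names, not from the question, used only to probe edge cases.
names = pd.Series(["pec won", "pechi won", "won pec", "xpec won", "mut"])
req = ["pec", "mut"]

# Space-separated alternatives: matches 'pec ' or ' mut'.
spaced = " | ".join(req)

# Word-boundary alternatives: matches 'pec' or 'mut' as whole words.
bounded = r"\b(?:" + "|".join(req) + r")\b"

spaced_hits = names[names.str.contains(spaced)].tolist()
bounded_hits = names[names.str.contains(bounded)].tolist()

print(spaced_hits)   # ['pec won', 'xpec won']
print(bounded_hits)  # ['pec won', 'won pec', 'mut']
```

On the question's data both patterns return the same single row; they differ only on inputs like the hypothetical ones above.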

2 Comments

It gives the same result :(
You can check the example flow now.
