I am using Python and pandas. I have a df that looks similar to this:

 +--------+--------+-------+
 |  Col1  |  Col2  | Col3  |
 +--------+--------+-------+
 | Team 1 | High   | Pizza |
 | Team 1 | Medium | Sauce |
 | Team 1 | Low    | Crust |
 +--------+--------+-------+

I would like to filter the df so that it keeps only the rows where Col2 is High or Medium.

This is what I have tried, with no luck:

 df = df.loc[df['Col2'] == 'High' | (df['Col2'] == 'Medium')]

This is the error I am getting:

 cannot compare a dtyped [bool] array with a scalar of type [bool]

Any ideas how to make this work and what that error means?

4 Answers

103

.isin() works as well and is more Pythonic:

country_list = ['brazil', 'poland', 'russia', 'countrydummy', 'usa']

filtered_df = df[df['Country Name'].isin(country_list)]
print(filtered_df)
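
Applied to the frame from the question, the same idea might look like the sketch below. The DataFrame is rebuilt from the table in the question, and `wanted`, `mask`, and `excluded_df` are illustrative names, not part of the original post.

```python
import pandas as pd

# Sample frame mirroring the table in the question.
df = pd.DataFrame({
    "Col1": ["Team 1", "Team 1", "Team 1"],
    "Col2": ["High", "Medium", "Low"],
    "Col3": ["Pizza", "Sauce", "Crust"],
})

# Hypothetical list of values to keep; it can have any length.
wanted = ["High", "Medium"]

# isin() returns a boolean Series that is True where Col2 is in the list.
mask = df["Col2"].isin(wanted)

filtered_df = df[mask]    # rows whose Col2 is High or Medium
excluded_df = df[~mask]   # ~ inverts the mask, leaving the other rows

print(filtered_df)
```

Keeping the mask in its own variable makes it easy to reuse or invert it.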

1 Comment

This solution is better because country_list can be of any length.
29

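You are missing a pair of parentheses around the first comparison. The | operator has higher precedence than ==, so without them Python evaluates 'High' | (df['Col2'] == 'Medium') first, and that attempt to OR a string with a boolean array is what raises the error. Wrap each comparison so that both sides of | are boolean Series: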

df = df.loc[(df['Col2'] == 'High') | (df['Col2'] == 'Medium')]
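
The precedence rule can be checked with plain integers, independent of pandas. The sketch below then applies the parenthesized form to a frame rebuilt from the question's table. The names `is_high`, `is_medium`, and `filtered` are illustrative.

```python
import pandas as pd

# | binds tighter than ==, so the first line is parsed as 1 == (1 | 2).
unparenthesized = 1 == 1 | 2    # 1 | 2 is 3, so this is 1 == 3
parenthesized = (1 == 1) | 2    # True | 2

# Sample frame mirroring the table in the question.
df = pd.DataFrame({
    "Col1": ["Team 1", "Team 1", "Team 1"],
    "Col2": ["High", "Medium", "Low"],
    "Col3": ["Pizza", "Sauce", "Crust"],
})

# Naming each condition sidesteps the precedence problem entirely.
is_high = df["Col2"] == "High"
is_medium = df["Col2"] == "Medium"

filtered = df.loc[is_high | is_medium]
print(filtered)
```

The same care is needed with &, which also binds tighter than the comparison operators.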


4

You can also use query() (available since pandas 0.13; quoting a column name that contains spaces in backticks requires pandas >= 0.25):

filtered_df = df.query('`Country Name` == ["brazil", "poland", "russia", "countrydummy", "usa"]')

print(filtered_df)


1

I think df.query is the best way to do this kind of thing:

df = df.query("Col2 == ['High','Medium']")
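
If the allowed values live in a Python variable rather than in the query string, query() can reference it with the @ prefix. This is a minimal sketch on a frame rebuilt from the question's table; `levels` is a hypothetical variable name.

```python
import pandas as pd

# Sample frame mirroring the table in the question.
df = pd.DataFrame({
    "Col1": ["Team 1", "Team 1", "Team 1"],
    "Col2": ["High", "Medium", "Low"],
    "Col3": ["Pizza", "Sauce", "Crust"],
})

# Hypothetical list of accepted values, defined outside the query string.
levels = ["High", "Medium"]

# @levels refers to the local variable; `in` tests membership per row.
by_in = df.query("Col2 in @levels")

# Comparing a column to a list with == behaves like `in` inside query().
by_eq = df.query("Col2 == ['High', 'Medium']")

print(by_in)
```

Both forms select the same rows here; the @ form avoids building the query string by hand when the list changes.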

