I have a pandas DataFrame like this:

In [34]: people = pandas.DataFrame({'name' : ['John', 'John', 'Mike', 'Sarah', 'Julie'], 'age' : [28, 18, 18, 2, 69]})
         people  = people[['name', 'age']]
         people

Out[34]:    
    name    age
0   John    28
1   John    18
2   Mike    18
3   Sarah   2
4   Julie   69

I want to filter this DataFrame using the following tuples:

In [35]: filter = [('John', 28), ('Mike', 18)]

The output should look like this:

Out[35]: 
    name    age
0   John    28
2   Mike    18

I've tried doing this:

In [34]: mask = people.isin({'name': ['John', 'Mike'], 'age': [28, 18]}).all(axis=1)
         people = people[mask]
         people

However, this returns both Johns, because isin checks each column independently: the age of each John (28 and 18) appears in the age list, so both rows pass.

Out[34]: 
    name    age
0   John    28
1   John    18
2   Mike    18
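
A minimal, self-contained reproduction of this behaviour, assuming pandas is imported as `pd` (the names `per_column` and `result` are only illustrative):

```python
import pandas as pd

people = pd.DataFrame({'name': ['John', 'John', 'Mike', 'Sarah', 'Julie'],
                       'age': [28, 18, 18, 2, 69]})

# With a dict, isin tests each column against its own list of values.
# It never checks whether a (name, age) pair occurs together.
per_column = people.isin({'name': ['John', 'Mike'], 'age': [28, 18]})

# A row passes when every column matched on its own,
# so ('John', 18) passes even though it is not a wanted pair.
mask = per_column.all(axis=1)
result = people[mask]
print(result)
```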

How do I filter rows based on multiple fields taken together?

1 Answer

This should work:

people.set_index(people.columns.tolist(), drop=False).loc[filter].reset_index(drop=True)

Cleaned up, with explanation:

# set_index with the columns you want to reference in tuples
cols = ['name', 'age']
people = people.set_index(cols, drop=False)
#                                   ^
#                                   |
#   ensure the cols stay in dataframe

#   does what you
#   want but now has
#   index that was
#   not there
# /--------------\
people.loc[filter].reset_index(drop=True)
#                 \---------------------/
#                  Gets rid of that index
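
As a possible alternative (a sketch, not part of the answer above), the same pairwise match can be written as a boolean mask or as an inner merge. The mask variant keeps the original row labels (0 and 2), which matches the output asked for in the question. The names `pairs`, `by_mask`, and `by_merge` are illustrative, and `pairs` is used instead of `filter` only to avoid shadowing the Python builtin.

```python
import pandas as pd

people = pd.DataFrame({'name': ['John', 'John', 'Mike', 'Sarah', 'Julie'],
                       'age': [28, 18, 18, 2, 69]})
pairs = [('John', 28), ('Mike', 18)]
cols = ['name', 'age']

# Option 1: build a MultiIndex from the key columns and test whole tuples.
# MultiIndex.isin compares each (name, age) pair as a unit.
mask = pd.MultiIndex.from_frame(people[cols]).isin(pairs)
by_mask = people[mask]          # keeps the original index labels

# Option 2: inner-join against a small frame holding the wanted pairs.
# The result gets a fresh 0..n-1 index.
wanted = pd.DataFrame(pairs, columns=cols)
by_merge = people.merge(wanted, on=cols, how='inner')

print(by_mask)
print(by_merge)
```

Which variant is faster on a large frame depends on the data, so it is worth timing both against the set_index approach.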

1 Comment

I gave a very simplistic example; I actually have a dataframe containing 5 million rows and 28 columns. This solution is fast and elegant :)
