Filter data iteratively in Python data frame

Question

I'm wondering whether pandas already has functionality for this that I just haven't been able to find so far.

Basically, I have a data frame with various columns. I'd like to select specific rows depending on the values of certain columns (FYI: I'm interested in the value of column D, which is described by several parameters in A-C).

E.g. I want to know which row(s) have A==1 & B==2 & C==5?

df
   A  B  C  D
0  1  2  4  a
1  1  2  5  b
2  1  3  4  c

df_result
   A  B  C  D
1  1  2  5  b

So far I have been able to basically reduce this:

import pandas as pd

df = pd.DataFrame({'A': [1,1,1],
                   'B': [2,2,3],
                   'C': [4,5,4],
                   'D': ['a', 'b', 'c']})
df_A = df[df['A'] == 1]
df_B = df_A[df_A['B'] == 2]
df_C = df_B[df_B['C'] == 5]

To this:

parameter = [['A', 1],
             ['B', 2],
             ['C', 5]]

df_filtered = df
for x, y in parameter:
    df_filtered = df_filtered[df_filtered[x] == y]

which yields the same result. But I wonder if there's another way, maybe a one-liner without a loop?

You can compound your conditions df[(df['A'] == 1) & (df['B'] == 2) & (df['C'] == 5)] without using a loop
– EdChum, Commented Mar 7, 2016 at 15:10
But what if I don't know beforehand what my columns are called and which values I want them to have?
– fukiburi, Commented Mar 7, 2016 at 15:11
What do you mean? You must have some idea at some point which columns and values to compare? You can construct the conditions easily
– EdChum, Commented Mar 7, 2016 at 15:14
My data frame is generated from a csv-file. So until I've actually loaded the file, I don't know how the columns are named. I do know what values I want them to have, but since I want to generate several sub-datasets, I also load the values from a different file where I've noted them. Right now I store a bunch of parameter combinations like the variable parameter, which I loop through.
– fukiburi, Commented Mar 7, 2016 at 15:17
I guess it would be easier to have conditions like A==1 and B==2 and C==5 instead of your parameter list and then just query the rows satisfying this condition with the df.query() function, like @John Galt showed...
– MaxU - stand with Ukraine, Commented Mar 7, 2016 at 16:01
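
For reference, the compound condition from the first comment can also be built at runtime from the parameter list. This is only a minimal sketch, assuming every entry is a [column, value] pair tested for equality; using functools.reduce with operator.and_ is one possible way to combine the masks and is not taken from the comments above.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1],
                   'B': [2, 2, 3],
                   'C': [4, 5, 4],
                   'D': ['a', 'b', 'c']})

# Column/value pairs that are only known at runtime (e.g. read from a file).
parameter = [['A', 1], ['B', 2], ['C', 5]]

# Build one boolean Series per condition and AND them together element-wise.
mask = reduce(operator.and_, (df[col] == val for col, val in parameter))
df_filtered = df[mask]
print(df_filtered)
```

Note that reduce raises a TypeError if parameter is empty, so an empty list would need separate handling.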

Zero · Accepted Answer · 2016-03-07 15:35:51Z

1

You could use the query() method to filter the data, constructing the filter expression from the parameters, like this:

In [288]: df.query(' and '.join(['{0}=={1}'.format(x[0], x[1]) for x in parameter]))
Out[288]:
   A  B  C  D
1  1  2  5  b

Details

In [296]: df
Out[296]:
   A  B  C  D
0  1  2  4  a
1  1  2  5  b
2  1  3  4  c

In [297]: query = ' and '.join(['{0}=={1}'.format(x[0], x[1]) for x in parameter])

In [298]: query
Out[298]: 'A==1 and B==2 and C==5'

In [299]: df.query(query)
Out[299]:
   A  B  C  D
1  1  2  5  b

edited Mar 7, 2016 at 15:35

answered Mar 7, 2016 at 15:30

Zero


2 Comments

fukiburi Over a year ago

Wow, thank you! I didn't know about query. How would I have to change the code if I'd like to compare strings instead of integers? If changing all the values to strings, df.query() returns an empty DataFrame...

fukiburi Over a year ago

Ah, I figured it out! Just replaced '{0}=={1}' by '{0}==\"{1}\"'.
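
A variation on the fix above, as a sketch only: formatting each value with !r (its repr) quotes strings and leaves integers bare, so one template could serve both cases. This assumes the values are plain ints or strings and that the column names are valid Python identifiers; the sample parameter list below is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1],
                   'B': [2, 2, 3],
                   'C': [4, 5, 4],
                   'D': ['a', 'b', 'c']})

# Hypothetical mix of an integer condition and a string condition.
parameter = [['A', 1], ['D', 'b']]

# {1!r} inserts repr(value): 1 stays 1, 'b' becomes the quoted literal 'b'.
query = ' and '.join('{0} == {1!r}'.format(col, val) for col, val in parameter)
print(query)
print(df.query(query))
```

Column names containing spaces or other non-identifier characters would additionally need backtick quoting inside the query string, which newer pandas versions support.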

Patrick the Cat · Answer · 2016-03-07 15:39:59Z

0

Just for information, in case others are interested, I would have done it this way:

import numpy as np

# One boolean Series per (column, value) pair; a row matches only if all are True.
matched = np.all([df[vn] == vv for vn, vv in parameter], axis=0)
df_filtered = df[matched]

But I like the query function better, now that I have seen it @John Galt.
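
A pandas-only variant of the same idea, as a rough sketch: put the wanted values into a Series indexed by column name, compare it against the matching columns, and keep the rows where every comparison is True. The name wanted is made up for this example, and the sketch assumes every column in parameter exists in df.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1],
                   'B': [2, 2, 3],
                   'C': [4, 5, 4],
                   'D': ['a', 'b', 'c']})

parameter = [['A', 1], ['B', 2], ['C', 5]]

# Series whose index holds the column names and whose values are the targets.
wanted = pd.Series(dict(parameter))

# DataFrame == Series aligns the Series index with the DataFrame columns.
matched = (df[wanted.index] == wanted).all(axis=1)
df_filtered = df[matched]
print(df_filtered)
```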

answered Mar 7, 2016 at 15:39

Patrick the Cat


1 Comment

fukiburi Over a year ago

Still, thank you for your input! I'll keep this method in mind, too. Could be useful in the future.
