I want to filter a dataframe according to whether any of several columns in a list match a test.

For example, this works:

import numpy as np
import pandas as pd
ddf = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
ddf[(ddf['A'] == 0) | (ddf['B'] == 0) | (ddf['C'] == 0) | (ddf['D'] == 0)]

...and one could build the mask in a loop if there are many more columns to process. But I wonder whether there's a more Pythonic way to proceed, starting from the result of

ddf[list('ABCD')]==0

which gives 4 columns of booleans, which I'd like to combine with a row-wise OR.

1 Answer

If the same test applies to every column, such as checking for zero, use any(axis=1) to reduce the boolean columns to a single True/False per row:

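import numpy as np
import pandas as pd

np.random.seed(999)  # fix the seed so the example is reproducible
ddf = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))

# keep the rows where at least one of the listed columns equals 0
ddf[(ddf[['A', 'B', 'C', 'D']] == 0).any(axis=1)]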

     A   B   C   D
71  52  13   0  50
93  51   0  60  71
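
As a rough check, the sketch below confirms that the any(axis=1) mask matches the explicit chain of | operators from the question. It also shows one possible way to handle the case where each column needs a different test. The seed, the `tests` dictionary, and the thresholds in it are illustrative assumptions, not part of the original question.

```python
import operator
from functools import reduce

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for repeatability
ddf = pd.DataFrame(rng.integers(0, 100, size=(100, 4)), columns=list("ABCD"))
cols = list("ABCD")

# Same test on every column: True for a row if any listed column equals 0.
mask_any = ddf[cols].eq(0).any(axis=1)

# The explicit OR chain from the question, for comparison.
mask_chain = (ddf["A"] == 0) | (ddf["B"] == 0) | (ddf["C"] == 0) | (ddf["D"] == 0)
assert mask_any.equals(mask_chain)

# Different test per column (hypothetical conditions, for illustration only):
# build one boolean Series per column and OR them together.
tests = {
    "A": lambda s: s == 0,
    "B": lambda s: s > 95,
    "C": lambda s: s.isin([1, 2, 3]),
}
mask_mixed = reduce(operator.or_, (test(ddf[col]) for col, test in tests.items()))
filtered = ddf[mask_mixed]
```

When the test is identical for all columns, any(axis=1) is the shorter option; the reduce variant is only worth it when the conditions differ from column to column.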