43

I have a pandas DataFrame df:

import pandas as pd

data = {"Name": ["AAAA", "BBBB"],
        "C1": [25, 12],
        "C2": [2, 1],
        "C3": [1, 10]}

df = pd.DataFrame(data)
df.set_index("Name")

which looks like this when printed (for reference):

      C1  C2  C3
Name            
AAAA  25   2   1
BBBB  12   1  10

I would like to choose rows for which C1, C2 and C3 have values between 0 and 20.

Can you suggest an elegant way to select those rows?

0

4 Answers 4

58

I think below should do it, but its elegance is up for debate.

new_df = old_df[((old_df['C1'] > 0) & (old_df['C1'] < 20)) & ((old_df['C2'] > 0) & (old_df['C2'] < 20)) & ((old_df['C3'] > 0) & (old_df['C3'] < 20))]
Sign up to request clarification or add additional context in comments.

5 Comments

Is there a way to use 'or' other than '&'
Love the elegance note :D
Use | in place of & for an "or" condition.
Note that even after nearly 9 years, it still won't work without the parentheses (even for two conditions), and one could argue, that it would be more intuitive.
This can be simplified using the 'between' function. Than you can drop half the conditions and the parentheses. df[df['C1'].between(0, 20) & df[df['C2'].between(0, 20) & df[df['C3'].between(0, 20)]
27

Shorter version:

In [65]:

df[(df>=0)&(df<=20)].dropna()
Out[65]:
   Name  C1  C2  C3
1  BBBB  12   1  10

Comments

23

I like to use df.query() for these kind of things

df.query('C1>=0 and C1<=20 and C2>=0 and C2<=20 and C3>=0 and C3<=20')

Comments

12

A more concise df.query:

df.query("0 <= C1 <= 20 and 0 <= C2 <= 20 and 0 <= C3 <= 20")

or

df.query("0 <= @df <= 20").dropna()

Using @foo in df.query refers to the variable foo in the environment.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.