0

I need to filter rows on certain conditions on some columns. Those columns are present in a list. Condition will be same for all columns or can be different. For my work, condition is same.

Not working

labels = ['one', 'two', 'three']

df = df [df [x] == 1 for x in labels]  

Below code works:

df_list = []

for x in labels:

  df_list.append(df[(df [x] == 1)])

df5 = pd.concat(df_list).drop_duplicates()

Creating different dataframes and concating them by avoiding duplicates works.

Expected: It should filter out those rows where value of those column is 1.

Actual: ValueError: too many values to unpack (expected 1)

I understand the reason for the error. Is there any way I can construct the condition by modifying the not working code ?

1

2 Answers 2

2

I think you are able to re-write this using the following.

labels = ['one','two','three']

df5 = df[(df[labels] == 1).any(1)]

Let's test with this MCVE:

#Create test data
df = pd.DataFrame(np.random.randint(1,5,(10,5)), columns=[*'ABCDE'])
labels = ['A','B','E']

#Your code
df_list = []
for x in labels:

  df_list.append(df[(df [x] == 1)])

df5 = pd.concat(df_list).drop_duplicates()


#Suggested modification
df6 = df[(df[labels] == 1).any(1)]

Are they equal?

df5.eq(df6)

Output:

      A     B     C     D     E
1  True  True  True  True  True
4  True  True  True  True  True
6  True  True  True  True  True
7  True  True  True  True  True
8  True  True  True  True  True
Sign up to request clarification or add additional context in comments.

Comments

0

Do you need this ?

new_df = df[(df['one'] == 1) & (df['two']== 1) & (df['three'] == 1)].drop_duplicates()

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.