How to count rows based on multiple column conditions using pandas?

Question

How can I count the rows of a CSV file with pandas using a combination of AND and OR conditions?

In the code below, I want to count all rows where True/False is FALSE, status is OK, and at least one of the columns openingSoon, underConstruction, comingSoon has the value '+'.

I've tried:

checkOne =  df['id'].loc[(df['True/False'] == 'FALSE') & (df['status'] == 'OK') & (df['comingSoon'] == '+') or (df['openingSoon'] == '+') or (df['underConstruction'] == '+')].count()

error:

Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "/usr/local/lib/python3.9/site-packages/pandas/core/generic.py", line 1329, in __nonzero__
raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

jezrael · Accepted Answer · 2020-12-21 10:44:46Z

3

Use | for element-wise (bitwise) OR instead of or. To count the True values, filtering is not necessary; use sum:

Also, if df['True/False'] holds real booleans, the test df['True/False'] == False can be simplified to ~df['True/False']:

# & binds more tightly than |, so the three '+' checks are grouped in their own parentheses
checkOne = (~df['True/False'] &
            (df['status'] == 'OK') &
            ((df['comingSoon'] == '+') |
             (df['openingSoon'] == '+') |
             (df['underConstruction'] == '+'))).sum()
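To see why the grouping of the OR conditions matters, here is a minimal sketch on made-up data. Only the column names come from the question; the rows and the variable names (any_plus, mask, ungrouped) are assumptions for illustration, and the True/False column is assumed to hold real booleans.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "True/False": [False, False, True, False, False],
    "status": ["OK", "OK", "OK", "ERR", "OK"],
    "comingSoon": ["+", "-", "+", "+", "-"],
    "openingSoon": ["-", "-", "-", "-", "+"],
    "underConstruction": ["-", "-", "+", "+", "-"],
})

# True where at least one of the three columns holds '+'.
any_plus = (
    (df["comingSoon"] == "+")
    | (df["openingSoon"] == "+")
    | (df["underConstruction"] == "+")
)

# All three requirements must hold; summing a boolean Series counts its True values.
mask = ~df["True/False"] & (df["status"] == "OK") & any_plus
check_one = int(mask.sum())

# Without the extra parentheses, & is evaluated before |, so any row with '+'
# in openingSoon or underConstruction is counted regardless of the other tests.
ungrouped = int((
    ~df["True/False"]
    & (df["status"] == "OK")
    & (df["comingSoon"] == "+")
    | (df["openingSoon"] == "+")
    | (df["underConstruction"] == "+")
).sum())

print(check_one, ungrouped)
```

On this sample the grouped mask counts only rows 1 and 5, while the ungrouped expression also counts rows 3 and 4, which fail the True/False or status test.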

If the True/False column holds the strings TRUE/FALSE, use:

# & binds more tightly than |, so the three '+' checks are grouped in their own parentheses
checkOne = ((df['True/False'] == 'FALSE') &
            (df['status'] == 'OK') &
            ((df['comingSoon'] == '+') |
             (df['openingSoon'] == '+') |
             (df['underConstruction'] == '+'))).sum()
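As an alternative way to write the "any of these columns" part, the three columns can be compared in one step with DataFrame.eq and reduced row-wise with any(axis=1). This is only a sketch under the assumption that the flag column holds the strings TRUE/FALSE; the sample rows and the names flag_cols and count_strings are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data with string flags; column names are from the question.
df = pd.DataFrame({
    "True/False": ["FALSE", "FALSE", "TRUE", "FALSE"],
    "status": ["OK", "OK", "OK", "FAILED"],
    "comingSoon": ["+", "", "+", "+"],
    "openingSoon": ["", "", "", ""],
    "underConstruction": ["", "", "+", ""],
})

flag_cols = ["comingSoon", "openingSoon", "underConstruction"]

# One boolean per row: True if any of the listed columns equals '+'.
any_plus = df[flag_cols].eq("+").any(axis=1)

# Combine with the other two conditions and count the matching rows.
mask = df["True/False"].eq("FALSE") & df["status"].eq("OK") & any_plus
count_strings = int(mask.sum())

print(count_strings)
```

Because any_plus is built as a single Series first, no extra parentheses are needed to keep the OR part together.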

edited Dec 21, 2020 at 10:44

answered Dec 21, 2020 at 10:17


7 Comments

Baobab1988 Over a year ago

Hi @jezrael, the above gives me this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/pandas/core/generic.py", line 1324, in __invert__
    new_data = self._mgr.apply(operator.invert)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 407, in apply
    applied = b.apply(f, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/internals/blocks.py", line 346, in apply
    result = func(self.values, **kwargs)
TypeError: bad operand type for unary ~: 'str'

Baobab1988 Over a year ago

and when I try without ~ then I get this error: TypeError: unsupported operand type(s) for &: 'str' and 'bool'

jezrael Over a year ago

@Baobab1988 - It means the FALSE values are not booleans but strings, so you need (df['True/False'] == 'FALSE')

Baobab1988 Over a year ago

I've tried as you suggested, but now it throws this error: TypeError: Cannot perform 'rand_' with a dtyped [bool] array and scalar of type [bool]

jezrael Over a year ago

@Baobab1988 - Then try changing (df['True/False'] == 'FALSE') to (df['True/False'] == 'False')
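Since the comments come down to what the True/False column actually contains, it may help to inspect it before choosing a comparison. The following is a rough sketch, not a diagnosis of the asker's file: the mixed sample values and the name is_false are assumptions for illustration.

```python
import pandas as pd

# Hypothetical column mixing real booleans and differently cased strings.
df = pd.DataFrame({"True/False": [False, "FALSE", "False", True, "true"]})

# Inspect the dtype and which Python types are actually stored.
print(df["True/False"].dtype)
print(df["True/False"].map(type).value_counts())
print(df["True/False"].unique())

# Normalize everything to upper-case text so that one comparison covers
# the boolean False as well as the strings 'FALSE' and 'False'.
is_false = df["True/False"].astype(str).str.strip().str.upper().eq("FALSE")
print(int(is_false.sum()))
```

The resulting is_false Series can then take the place of ~df['True/False'] or (df['True/False'] == 'FALSE') in the combined condition.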
