1

I have a dataframe:

              A       B       C

1/2/2007    np.Nan   6.3     np.Nan
1/3/2007    np.Nan   np.Nan   3.2
1/4/2007    np.Nan   np.Nan   np.Nan
1/5/2007     -999    np.Nan   7.3
1/6/2007    np.Nan    -999    np.Nan
1/7/2007    np.Nan   np.Nan   3.1
1/8/2007    np.Nan    -999    np.Nan

How can I get a count of columns with the value -999?

Not the number of occurrences in the dataframe or the number of occurrances per column but the number of the columns with the value?

The correct answer here would be 2 since columns A and B have at least one occurance of -999 which column C has 0 occurrances.

1 Answer 1

3

You can first compare your entire df with -999, then count the number of none zero columns.

df.eq(-999).any().sum()
Out[302]: 2

A bit explanation:

df.eq(-999) gives a True/False matrix, indicating if the element is equal to -999:

Out[308]: 
              A      B      C
1/2/2007  False  False  False
1/3/2007  False  False  False
1/4/2007  False  False  False
1/5/2007   True  False  False
1/6/2007  False   True  False
1/7/2007  False  False  False
1/8/2007  False   True  False

.any() checks if any element in a column is True, which returns:

Out[309]: 
A     True
B     True
C    False
dtype: bool

In this example, Column A and B are true since they have at least one Trues.

Finally, .sum() counts the number of Trues, (True is automatically cast to 1 and False to 0 in a sum operation)

Sign up to request clarification or add additional context in comments.

2 Comments

aha, Nice modify :-)
Interesting - and thanks. Could you explain the syntax a bit. It isn't intuitive to me. How does the syntax 'know' we are talking about columns in this case instead of rows?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.