
I have a pandas dataframe with 200+ columns. I'm trying to inspect all the columns with null data. How can I filter/display the columns which have null data? df.isnull().sum() lists count of all columns, but I want to see only columns with non-zero null data count as the number of columns is high.

  • Can we see an example dataframe and the expected output? Commented Nov 4, 2018 at 2:02
  • @user2305776, please see the detailed answer below; don't forget to accept it if it helps. Commented Nov 4, 2018 at 4:22

2 Answers


Newer pandas versions have the methods DataFrame.isna() and DataFrame.notna(), which are aliases of DataFrame.isnull() and DataFrame.notnull().
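The snippets below all work on the small frame printed next. The original post only shows its output, so the construction here is a sketch (the mixed NaT/None values in column D are an assumption based on how it prints):

```python
import pandas as pd
import numpy as np

# Sketch of the example frame; column D is an object column
# mixing NaT and None, which both count as missing values.
df = pd.DataFrame({
    "A": [0, 3, 8, 11],
    "B": [1.0, 5.0, np.nan, 12.0],
    "C": [2.0, np.nan, 10.0, 13.0],
    "D": [3, pd.NaT, None, pd.NaT],
    "E": [4, 5, 6, 7],
    "F": ["one", "two", "three", "four"],
})
```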

1) Using the DataFrame.isna() method

>>> df
    A     B     C     D  E      F
0   0   1.0   2.0     3  4    one
1   3   5.0   NaN   NaT  5    two
2   8   NaN  10.0  None  6  three
3  11  12.0  13.0   NaT  7   four

To get just the list of columns that contain null values:

>>> df.columns[df.isna().any()].tolist()
['B', 'C', 'D']

To display all the columns that contain NaN values:

>>> df.loc[:, df.isna().any()]
      B     C     D
0   1.0   2.0     3
1   5.0   NaN   NaT
2   NaN  10.0  None
3  12.0  13.0   NaT

2) Using the DataFrame.isnull() method

To get a boolean Series indicating which columns contain null values:

>>> df.isnull().any()
A    False
B     True
C     True
D     True
E    False
F    False
dtype: bool

To get just the list of columns that contain null values:

>>> df.columns[df.isnull().any()].tolist()
['B', 'C', 'D']

To select a subset - all columns containing at least one NaN value:

>>> df.loc[:, df.isnull().any()]
      B     C     D
0   1.0   2.0     3
1   5.0   NaN   NaT
2   NaN  10.0  None
3  12.0  13.0   NaT

If you want to count the missing values in each column:

>>> df.isnull().sum()
A    0
B    1
C    1
D    3
E    0
F    0
dtype: int64

OR

>>> df.isnull().sum(axis=0)  # axis=0: sum down the rows, i.e. one count per column (the default)
A    0
B    1
C    1
D    3
E    0
F    0

# >>> df.isnull().sum(axis=1)  # axis=1: one count per row
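A related variant, useful with 200+ columns: the fraction (rather than the count) of missing values per column. This is a sketch on a frame like the one above; it relies on the mean of a boolean mask being the share of True values:

```python
import pandas as pd
import numpy as np

# Frame mirroring the missing-value pattern of the example above.
df = pd.DataFrame({
    "B": [1.0, 5.0, np.nan, 12.0],
    "C": [2.0, np.nan, 10.0, 13.0],
    "D": [3, pd.NaT, None, pd.NaT],
    "E": [4, 5, 6, 7],
})

frac = df.isnull().mean()      # mean of booleans = fraction of missing values
frac_nonzero = frac[frac > 0]  # keep only columns that have missing data
print(frac_nonzero)
```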

Finally, to get the total number of NaN and non-NaN values in the DataFrame:

NaN value count:

>>> df.isnull().sum().sum()

Non-NaN value count:

>>> df.notnull().sum().sum()

2 Comments

Thx for showing the various options available. Since I wanted to see only those columns with null values and their count, I selected the other answer. Thx for your response.
@user2305776, no problem. By the way, I edited my post in case you are looking for total null counts.

Once you've got the counts, just filter on the entries greater than zero:

null_counts = df.isnull().sum()
null_counts[null_counts > 0]
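Put together as a self-contained sketch (the sample frame is an assumption; `sort_values` is optional but handy when many columns have nulls, since it surfaces the worst ones first):

```python
import pandas as pd
import numpy as np

# Small frame with a few missing values for demonstration.
df = pd.DataFrame({
    "A": [0, 3, 8, 11],
    "B": [1.0, 5.0, np.nan, 12.0],
    "C": [2.0, np.nan, 10.0, 13.0],
    "D": [3, pd.NaT, None, pd.NaT],
    "E": [4, 5, 6, 7],
})

null_counts = df.isnull().sum()
# Keep only columns with at least one null, most nulls first.
nonzero = null_counts[null_counts > 0].sort_values(ascending=False)
print(nonzero)
```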

1 Comment

Simple and nice! Thx
