
I have a pandas dataframe like:

import pandas as pd

df = pd.DataFrame({'Last_Name': ['Smith', None, 'Brown'],
                   'First_Name': ['John', None, 'Bill'],
                   'Age': [35, 45, None]})

I can manually filter it using:

df[df.Last_Name.isnull() & df.First_Name.isnull()]

but this is tedious: it requires duplicate code for each column/condition and is not maintainable with a large number of columns. Is it possible to write a function which generates this filter for me?

Some background: my pandas dataframe is based on an initial SQL-based multi-dimensional aggregation (grouping sets) https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-multi-dimensional-aggregation.html, so a different set of columns is NULL in each group. Now I want to efficiently select these different groups and analyze them separately in pandas.
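To illustrate what "different groups by NULL columns" means in pandas terms, here is a minimal sketch (not part of the question) that partitions the rows of `df` by their null pattern, i.e. by which columns are NULL in each row:

```python
import pandas as pd

df = pd.DataFrame({'Last_Name': ['Smith', None, 'Brown'],
                   'First_Name': ['John', None, 'Bill'],
                   'Age': [35, 45, None]})

# Each row's "null pattern" is a tuple of booleans, one per column.
# Grouping on that tuple separates the grouping-set levels.
for pattern, group in df.groupby(df.isna().apply(tuple, axis=1)):
    null_cols = [c for c, is_null in zip(df.columns, pattern) if is_null]
    print(null_cols, len(group))
```

Each iteration yields one aggregation level together with the columns that are NULL for it, so the groups can then be analyzed separately.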

1 Answer


Use filter:

df[df.filter(like='_Name').isna().all(axis=1)]

  Last_Name First_Name   Age
1      None       None  45.0

Or, for more flexibility, specify a list of column names:

cols = ['First_Name', 'Last_Name']
df[df[cols].isna().all(axis=1)]

  Last_Name First_Name   Age
1      None       None  45.0
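Since the question asks for a reusable function rather than hand-written filters, the second approach wraps naturally into a small helper; the name `rows_where_null` is my own, not from the question:

```python
import pandas as pd

df = pd.DataFrame({'Last_Name': ['Smith', None, 'Brown'],
                   'First_Name': ['John', None, 'Bill'],
                   'Age': [35, 45, None]})

def rows_where_null(frame, cols):
    """Return the rows where every column in `cols` is NULL."""
    return frame[frame[cols].isna().all(axis=1)]

subset = rows_where_null(df, ['First_Name', 'Last_Name'])
```

This avoids writing a separate boolean expression per column: the column list is the only thing that varies between groups.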