Pandas: Drop all columns with missing values except one column

Question

Suppose we have a DataFrame with the columns 'Age', 'Name', and 'Sex', where 'Age' and 'Sex' contain missing values. I want to drop all columns with missing values except 'Age', so that I end up with a DataFrame containing only the two columns 'Name' and 'Age'. How can I do this?

mrhd · Accepted Answer · 2020-04-19 09:18:07Z

3

This should do what you need:

import pandas as pd
import numpy as np

df = pd.DataFrame({
  'Age'  : [5,np.nan,12,43], 
  'Name' : ['Alice','Bob','Charly','Dan'],
  'Sex'  : ['F','M','M',np.nan]})

# Keep every column without missing values, plus 'Age' in any case
df_filt = df.loc[:, (~df.isnull().any()) | (df.columns.isin(['Age']))]

Explanation:

df.isnull().any() checks, for each column, whether any value is None or NaN. The ~ negates that boolean result, so only the columns that do not contain missing values are selected.

df.columns.isin(['Age']) checks for all columns if their name is 'Age', so that this column is selected in any case.

Both conditions are connected by an OR (|) so that if either condition applies the column is selected.
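As a rough sketch of how the same mask could be reused for more than one protected column, the snippet below wraps it in a small helper. The function name drop_na_columns_except and its keep parameter are hypothetical, introduced only for illustration; they are not part of pandas. The sketch assumes the same sample data as above and uses ~ for boolean negation.

```python
import numpy as np
import pandas as pd


def drop_na_columns_except(df, keep):
    """Drop columns that contain missing values, but always retain those in `keep`.

    Hypothetical helper for illustration; not a pandas API.
    """
    has_missing = df.isna().any()       # per column: True if any value is missing
    protected = df.columns.isin(keep)   # True for columns to retain regardless
    return df.loc[:, ~has_missing | protected]


df = pd.DataFrame({
    'Age': [5, np.nan, 12, 43],
    'Name': ['Alice', 'Bob', 'Charly', 'Dan'],
    'Sex': ['F', 'M', 'M', np.nan],
})

df_filt = drop_na_columns_except(df, ['Age'])
print(list(df_filt.columns))  # ['Age', 'Name']
```

The original column order is preserved, and the NaN in 'Age' stays in place; only the 'Sex' column is removed.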

edited Apr 19, 2020 at 9:18

answered Apr 19, 2020 at 8:54

