Working with NaN values in multiple columns in Pandas

Question

I have multiple datasets with different numbers of rows and the same number of columns. I would like to find NaN values in each column. For example, consider these two datasets:

dataset1 :            dataset2:
a  b                  a    b
1  10                 2    11
2  9                  3    12
3  8                  4    13
4  nan                nan  14
5  nan                nan  15
6  nan                nan  16

I want to find NaN values in columns a and b of both datasets. If a NaN occurs in column b, remove every row where b is NaN. If it occurs in column a, fill those values with 0.

This is my code snippet:

a=pd.notnull(data['a'].values.any())
b= pd.notnull((data['b'].values.any()))
if a:
     data = data.dropna(subset=['a'])
if b:
     data[['a']] = data[['a']].fillna(value=0)

This does not work properly.

FWIW: it's recommended to use pd.notna instead of notnull because notnull doesn't capture all variations of nan
– mithunpaul, Commented Jun 17, 2021 at 20:35

Vaishali · Accepted Answer · 2017-11-06 22:50:38Z

4

You just need fillna and dropna, with no control flow:

data = data.dropna(subset=['b']).fillna(0)
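
As a minimal sketch, here is that one-liner applied to the two example frames from the question. The helper name `clean` and the variable names are made up for illustration; the values are copied from the question.

```python
import numpy as np
import pandas as pd

# The two example frames from the question.
dataset1 = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6],
                         "b": [10, 9, 8, np.nan, np.nan, np.nan]})
dataset2 = pd.DataFrame({"a": [2, 3, 4, np.nan, np.nan, np.nan],
                         "b": [11, 12, 13, 14, 15, 16]})

def clean(data):
    # Hypothetical helper: drop rows where b is missing,
    # then fill any NaN that is left (only column a can have one) with 0.
    return data.dropna(subset=["b"]).fillna(0)

result1 = clean(dataset1)  # rows with NaN in b are removed
result2 = clean(dataset2)  # all rows kept, NaN in a becomes 0
print(result1)
print(result2)
```

The order matters: dropping on b first means the later fillna(0) can only touch column a, so no NaN in b is silently turned into 0.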


BENY · Accepted Answer · 2017-11-06 23:18:59Z

2

Pass your per-column fill values as a dict:

df=df.fillna({'a':0,'b':np.nan}).dropna()

You do not need the 'b' key here:

df=df.fillna({'a':0}).dropna()

EDIT:

df.fillna({'a':0}).dropna()
Out[1319]: 
     a   b
0  2.0  11
1  3.0  12
2  4.0  13
3  0.0  14
4  0.0  15
5  0.0  16
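
For comparison, a sketch of why the flags in the question never change, plus a variant that limits each operation to its own column. The frame reuses the second dataset from the question; the names `flag`, `a_has_nan`, `b_has_nan` and `out` are illustrative, and using `dropna(subset=["b"])` here is an assumption about the intent rather than part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [2, 3, 4, np.nan, np.nan, np.nan],
                   "b": [11, 12, 13, 14, 15, 16]})

# The question's check collapses the column to a single boolean first,
# so pd.notnull only ever sees True or False (never NaN) and returns True.
flag = pd.notnull(df["a"].values.any())

# A per-column check: does the column contain at least one missing value?
a_has_nan = df["a"].isna().any()
b_has_nan = df["b"].isna().any()

# Fill only column a, and drop rows only where column b is missing.
out = df.fillna({"a": 0}).dropna(subset=["b"])
print(flag, a_has_nan, b_has_nan)
print(out)
```

Because fillna and dropna do nothing when there is nothing to fill or drop, the per-column checks are informational only and no if statement is needed.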

edited Nov 6, 2017 at 23:18

answered Nov 6, 2017 at 22:56


6 Comments

Elham Over a year ago

Still have nan values in column b

piRSquared Over a year ago

@AlterNative you can only choose one as the accepted answer (-:

piRSquared Over a year ago

You don't need the 'b' key at all. df.fillna({'a': 0}).dropna()

Elham Over a year ago

It does not work on my second dataset. It removes all the rows; I need to keep them and set the NaN values in column a to 0.

BENY Over a year ago

@AlterNative have nice day :-)
