I have a DataFrame df with three columns: a, b and c. I want to replace the NaN values in column b. For example, for the value 123 in column a, column b contains both abc and NaN. I want both rows to have abc.
import numpy as np
import pandas as pd

raw_data = {'a': [123, 123, 456, 456],
            'b': [np.nan, 'abc', 'def', np.nan],
            'c': [np.nan, np.nan, 0, np.nan]}
df = pd.DataFrame(raw_data, columns=['a', 'b', 'c'])
a b c
0 123 NaN NaN
1 123 abc NaN
2 456 def 0
3 456 NaN NaN
My expected output:
df
a b c
1 123 abc NaN
0 123 abc NaN
2 456 def 0
3 456 def NaN
What I have tried:
df = df.sort_values(by=['a','b']).fillna(method='ffill')
But this also fills column c.
Output from above:
a b c
1 123 abc NaN
0 123 abc NaN
2 456 def 0
3 456 def 0
How do I use ffill on a particular column only, or is another approach recommended?
Sample Data 2:
raw_data = {'a': [123, 123, 456, 456,789,np.nan],
'b': [np.nan,'abc','def',np.nan,np.nan,'ghi'],
'c':[np.nan,np.nan,0,np.nan,np.nan,np.nan]}
df = pd.DataFrame(raw_data, columns = ['a', 'b','c'])
a b c
0 123.0 NaN NaN
1 123.0 abc NaN
2 456.0 def 0
3 456.0 NaN NaN
4 789.0 NaN NaN
5 NaN ghi abc
Expected Output
a b c
0 123.0 abc NaN
1 123.0 abc NaN
2 456.0 def 0
3 456.0 def NaN
4 789.0 NaN NaN
5 NaN ghi abc
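To fill only b, forward-fill just that column and assign it back. The assignment aligns on the index, so column c and the original row order are left untouched:
df['b'] = df.sort_values(by=['a','b'])['b'].ffill()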
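
One caveat with a plain forward fill on the sorted frame: it does not stop at group boundaries, so on Sample Data 2 the row with a = 789 would pick up def from the 456 group, whereas the expected output keeps it as NaN. A minimal sketch of a group-wise alternative follows. The helper name fill_b_within_a is hypothetical, and it assumes that each value of a has at most one distinct non-null b (when there are several, the first one encountered is used).

```python
import numpy as np
import pandas as pd

# Sample Data 2 from the question
df = pd.DataFrame({
    'a': [123, 123, 456, 456, 789, np.nan],
    'b': [np.nan, 'abc', 'def', np.nan, np.nan, 'ghi'],
    'c': [np.nan, np.nan, 0, np.nan, np.nan, np.nan],
})

def fill_b_within_a(frame):
    """Hypothetical helper: fill NaN in 'b' using the first non-null 'b'
    found among the rows that share the same value of 'a'."""
    out = frame.copy()
    # 'first' returns the first non-null value per group, broadcast back to
    # every row; groups with no non-null 'b' (e.g. a == 789) yield a null.
    first_b = out.groupby('a')['b'].transform('first')
    # Only the missing entries of 'b' are replaced; 'c' is never touched.
    out['b'] = out['b'].fillna(first_b)
    return out

result = fill_b_within_a(df)
print(result)
```

Because nothing is sorted, the rows stay in their original order, and rows whose a is NaN keep whatever b they already had.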