I am working with a large DataFrame that has hundreds of columns, any of which may contain missing values. Here is a sample:

import pandas as pd
import numpy as np

data = {'a':  [1,1,0,1,1],
        'b': ["a", "b", np.nan, 'c', np.nan],
        'c': ['b1','b2',np.nan, 'c1', np.nan],
        'd': [1,1,1, 2, np.nan],
        'e': [4,4,4, 3, np.nan]
       }
df = pd.DataFrame(data)
print(df)

   a    b    c    d    e
0  1    a   b1  1.0  4.0
1  1    b   b2  1.0  4.0
2  0  NaN  NaN  1.0  4.0
3  1    c   c1  2.0  3.0
4  1  NaN  NaN  NaN  NaN

To handle the missing values in one pass, I currently do the following: if a missing value is in column a, b, or c, I replace it with a value specific to that column.

df = df.fillna({'a': 0, 'b': 'other', 'c': -1})
print(df)
   a      b   c    d    e
0  1      a  b1  1.0  4.0
1  1      b  b2  1.0  4.0
2  0  other  -1  1.0  4.0
3  1      c  c1  2.0  3.0
4  1  other  -1  NaN  NaN

What I would like to do next: if the missing values are in any column other than those three, replace them with the value that appears most often in that column. For example, in column d the value 1 occurs most often, so the missing value in d should become 1.0.

  • What happens if you have a tie, e.g. with [1, 2, 1, 2, NaN]? Commented Jan 5, 2023 at 18:00

1 Answer

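Assuming each column has a single mode, or you are fine with taking the first one when there is a tie:

d = {'a': 0, 'b': 'other', 'c': -1}
# mode of every remaining column; row 0 holds the first mode of each
d2 = df.drop(columns=list(d)).mode().loc[0].to_dict()

# merge both mappings and fill all columns in one call
out = df.fillna(d | d2)  # requires Python 3.9+

# for 3.5 <= Python < 3.9
# out = df.fillna({**d, **d2})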

Output:

   a      b   c    d    e
0  1      a  b1  1.0  4.0
1  1      b  b2  1.0  4.0
2  0  other  -1  1.0  4.0
3  1      c  c1  2.0  3.0
4  1  other  -1  1.0  4.0
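
Regarding the tie raised in the comment: DataFrame.mode() returns every mode of a column in sorted order, so taking row 0 picks the smallest of the tied values. The sketch below wraps the same approach in a helper to make that visible. The name fill_with_mode and the sample frame are assumptions made for illustration, not part of the original answer, and the helper also skips columns that have no mode at all (entirely NaN).

```python
import numpy as np
import pandas as pd


def fill_with_mode(df, fixed):
    """Hypothetical helper: fill the columns named in `fixed` with the
    given values and every other column with its first mode."""
    others = df.drop(columns=list(fixed))
    modes = others.mode()  # modes of each column, sorted ascending
    # Row 0 is the first (smallest) mode per column. Drop entries that
    # are NaN, which happens for columns containing no values at all.
    mode_fill = modes.iloc[0].dropna().to_dict() if len(modes) else {}
    return df.fillna({**fixed, **mode_fill})


# Column d has a tie between 1 and 2, as in the comment.
tied = pd.DataFrame({
    'b': ['x', np.nan, 'y', 'x', np.nan],
    'd': [1, 2, 1, 2, np.nan],
})
result = fill_with_mode(tied, {'b': 'other'})
print(result)
```

With this input the missing value in d is filled with 1.0, the smaller of the two tied modes. If a different tie-breaking rule is needed (for example the largest mode), the row selected from modes would have to be chosen per column instead.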