I have a dataframe df1 like this.
I want to fill the nan and the number 0 in column score with mutiple values in another dataframe df2 according to the different names.
How could I do this?
I have a dataframe df1 like this.
I want to fill the nan and the number 0 in column score with mutiple values in another dataframe df2 according to the different names.
How could I do this?
Option 1
Short version
df1.score = df1.score.mask(df1.score.eq(0)).fillna(
df1.name.map(df2.set_index('name').score)
)
df1
name score
0 A 10.0
1 B 32.0
2 A 10.0
3 C 30.0
4 B 20.0
5 A 45.0
6 A 10.0
7 A 10.0
Option 2
Interesting version using searchsorted. df2 must be sorted by 'name'.
i = np.where(np.isnan(df1.score.mask(df1.score.values == 0).values))[0]
j = df2.name.values.searchsorted(df1.name.values[i])
df1.score.values[i] = df2.score.values[j]
df1
name score
0 A 10.0
1 B 32.0
2 A 10.0
3 C 30.0
4 B 20.0
5 A 45.0
6 A 10.0
7 A 10.0
fillna can be in this way , thank you :)+1If df1 and df2 are your dataframes, you can create a mapping and then call pd.Series.replace:
df1 = pd.DataFrame({'name' : ['A', 'B', 'A', 'C', 'B', 'A', 'A', 'A'],
'score': [0, 32, 0, np.nan, np.nan, 45, np.nan, np.nan]})
df2 = pd.DataFrame({'name' : ['A', 'B', 'C'], 'score' : [10, 20, 30]})
print(df1)
name score
0 A 0.0
1 B 32.0
2 A 0.0
3 C NaN
4 B NaN
5 A 45.0
6 A NaN
7 A NaN
print(df2)
name score
0 A 10
1 B 20
2 C 30
mapping = dict(df2.values)
df1.loc[(df1.score.isnull()) | (df1.score == 0), 'score'] =\
df1[(df1.score.isnull()) | (df1.score == 0)].name.replace(mapping)
print(df1)
name score
0 A 10.0
1 B 32.0
2 A 10.0
3 C 30.0
4 B 20.0
5 A 45.0
6 A 10.0
7 A 10.0
dict(df2.values) is pretty slick. I'll be stealing... borrowing that.