1

I have a dataframe df1 like this:

[image: df1]

I want to fill the NaN values and the zeros in the score column with values from another dataframe, df2, matched by name.

[image: df2]

How can I do this?

Thanks for accepting! Remember you can also upvote answers, so please consider upvoting the other answers. Commented Aug 25, 2017 at 15:07

4 Answers

4

Option 1
Short version

df1.score = df1.score.mask(df1.score.eq(0)).fillna(
    df1.name.map(df2.set_index('name').score)
)
df1

  name  score
0    A   10.0
1    B   32.0
2    A   10.0
3    C   30.0
4    B   20.0
5    A   45.0
6    A   10.0
7    A   10.0
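
For readers who want to see what each piece of the one-liner does, here is a minimal step-by-step sketch. It assumes the sample data posted in another answer in this thread; the intermediate names masked, lookup and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from another answer in this thread.
df1 = pd.DataFrame({'name': ['A', 'B', 'A', 'C', 'B', 'A', 'A', 'A'],
                    'score': [0, 32, 0, np.nan, np.nan, 45, np.nan, np.nan]})
df2 = pd.DataFrame({'name': ['A', 'B', 'C'], 'score': [10, 20, 30]})

# Step 1: turn zeros into NaN so there is only one kind of "missing".
masked = df1['score'].mask(df1['score'].eq(0))

# Step 2: look up a candidate replacement for every row by its name.
# The result is a Series aligned with df1's index.
lookup = df1['name'].map(df2.set_index('name')['score'])

# Step 3: fill only the missing positions; existing scores are untouched.
result = masked.fillna(lookup)
print(result.tolist())
```

Because fillna only touches positions that are NaN, the lookup values for rows that already have a score (32 and 45 here) are simply ignored.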

Option 2
Interesting version using searchsorted. df2 must be sorted by 'name'.

i = np.where(np.isnan(df1.score.mask(df1.score.values == 0).values))[0]
j = df2.name.values.searchsorted(df1.name.values[i])
df1.score.values[i] = df2.score.values[j]
df1

  name  score
0    A   10.0
1    B   32.0
2    A   10.0
3    C   30.0
4    B   20.0
5    A   45.0
6    A   10.0
7    A   10.0
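
One caveat with searchsorted is that it returns an insertion position even for a name that is not in df2, so a missing name would silently pick up a neighbour's score. The following is a rough sketch of the same idea with an explicit sort and a membership check added. It assumes the sample data from another answer in this thread, and it works on a copy of the scores rather than writing through .values; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from another answer in this thread.
df1 = pd.DataFrame({'name': ['A', 'B', 'A', 'C', 'B', 'A', 'A', 'A'],
                    'score': [0, 32, 0, np.nan, np.nan, 45, np.nan, np.nan]})
df2 = pd.DataFrame({'name': ['A', 'B', 'C'], 'score': [10, 20, 30]})

# searchsorted needs sorted keys, so sort the lookup table explicitly.
table = df2.sort_values('name')
keys = table['name'].to_numpy()
vals = table['score'].to_numpy()

names = df1['name'].to_numpy()
scores = df1['score'].to_numpy(dtype=float, copy=True)

# Rows that need a value: NaN or exactly 0.
needs_fill = np.isnan(scores) | (scores == 0)

# Binary-search each name in the sorted keys.
pos = np.searchsorted(keys, names[needs_fill])
pos = np.clip(pos, 0, len(keys) - 1)

# Keep only genuine matches; unmatched names are left as they were.
found = keys[pos] == names[needs_fill]
idx = np.flatnonzero(needs_fill)[found]
scores[idx] = vals[pos[found]]

result = df1.assign(score=scores)
print(result)
```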

1 Comment

First time I've noticed that fillna can be used this way, thank you :) +1
2

If df1 and df2 are your dataframes, you can create a mapping and then call pd.Series.replace:

df1 = pd.DataFrame({'name' : ['A', 'B', 'A', 'C', 'B', 'A', 'A', 'A'], 
                     'score': [0, 32, 0, np.nan, np.nan, 45, np.nan, np.nan]})
df2 = pd.DataFrame({'name' : ['A', 'B', 'C'], 'score' : [10, 20, 30]})

print(df1)

  name  score
0    A    0.0
1    B   32.0
2    A    0.0
3    C    NaN
4    B    NaN
5    A   45.0
6    A    NaN
7    A    NaN

print(df2) 

  name  score
0    A     10
1    B     20
2    C     30

mapping = dict(df2.values)

df1.loc[(df1.score.isnull()) | (df1.score == 0), 'score'] =\
               df1[(df1.score.isnull()) | (df1.score == 0)].name.replace(mapping)

print(df1)

  name  score
0    A   10.0
1    B   32.0
2    A   10.0
3    C   30.0
4    B   20.0
5    A   45.0
6    A   10.0
7    A   10.0
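
As a possible variation on the mapping idea, here is a small sketch that builds the dictionary with zip and applies it with Series.map instead of replace. This is a swapped-in technique, not the code above: with map, a name missing from the dictionary becomes NaN, whereas replace would leave the name string in the score column. The names needs_fill and out are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'name': ['A', 'B', 'A', 'C', 'B', 'A', 'A', 'A'],
                    'score': [0, 32, 0, np.nan, np.nan, 45, np.nan, np.nan]})
df2 = pd.DataFrame({'name': ['A', 'B', 'C'], 'score': [10, 20, 30]})

# Build {name: score}; zip names the two columns explicitly, so it does
# not depend on df2 having exactly two columns in this order.
mapping = dict(zip(df2['name'], df2['score']))

# Compute the "needs filling" mask once and reuse it on both sides.
needs_fill = df1['score'].isna() | df1['score'].eq(0)

out = df1.copy()  # keep the original frame untouched
out.loc[needs_fill, 'score'] = out.loc[needs_fill, 'name'].map(mapping)
print(out)
```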

2 Comments

dude! dict(df2.values) is pretty slick. I'll be stealing... borrowing that.
@piRSquared By all means!
1

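Or using merge and a backward fill across columns:

import pandas as pd
import numpy as np

# Treat zeros as missing so they are filled too
df1.loc[df1.score == 0, 'score'] = np.nan
# After the left merge, score_x comes from df1 and score_y from df2;
# bfill(axis=1) pulls score_y into score_x wherever score_x is NaN
df1.merge(df2, on='name', how='left').bfill(axis=1)[['name', 'score_x']]\
    .rename(columns={'score_x': 'score'})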
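
A self-contained sketch of the merge approach is given below. It assumes the sample data from another answer in this thread and fills the score column directly from the merged column rather than filling across columns, which keeps the score column numeric. The '_fill' suffix and the variable names are illustrative choices.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'name': ['A', 'B', 'A', 'C', 'B', 'A', 'A', 'A'],
                    'score': [0, 32, 0, np.nan, np.nan, 45, np.nan, np.nan]})
df2 = pd.DataFrame({'name': ['A', 'B', 'C'], 'score': [10, 20, 30]})

# A left merge keeps every row of df1 in its original order and adds
# df2's score next to it as 'score_fill'.
merged = df1.merge(df2, on='name', how='left', suffixes=('', '_fill'))

# Treat zeros as missing, then take the merged value where needed.
merged['score'] = merged['score'].replace(0, np.nan).fillna(merged['score_fill'])

result = merged.drop(columns='score_fill')
print(result)
```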


1

This method changes the order (the result will be sorted by name).

df1.set_index('name').replace(0, np.nan).combine_first(df2.set_index('name')).reset_index()

  name  score
0    A     10
1    A     10
2    A     45
3    A     10
4    A     10
5    B     32
6    B     20
7    C     30

