I have a dataframe with missing values for some columns:

a = pd.DataFrame(data={"name": ['bob', 'sue', 'dave'], 'status': [np.nan, np.nan, 'A'], 'team': ['red', 'blue', np.nan]}, index=[100, 101, 105])

[Image: dataframe a]

I have another dataframe with the same index where some of the missing values have been replaced:

b = pd.DataFrame(data={"name": ['bob', 'sue', 'dave'], 'status': ['I', 'O', 'A'], 'team': ['red', 'blue', np.nan]}, index=[100, 101, 105])

[Image: dataframe b]

Is there a way to map dataframe b onto a so that the values in specific columns of a are replaced? There are many other rows in a that aren't in b, so I only want to replace the rows that share an index.

I tried this, but it sets the values to NaN:

a['status'] = a['status'].map(b['status'])
a['team'] = a['team'].map(b['team'])

[Image: dataframe a after b has been mapped]

Comment: a.combine_first(b)? (Commented Aug 13, 2020 at 15:41)

2 Answers

This can be done with label-based indexing. Use the index of the second dataframe to select the matching rows of the first dataframe with .loc.

Then assign the second dataframe to those rows.

a.loc[b.index] = b

Output:

     name status  team
100   bob      I   red
101   sue      O  blue
105  dave      A   NaN
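
The assignment above replaces every column of the matched rows. Since the question asks about specific columns only, here is a minimal sketch of a column-restricted variant. The extra row (index 110, "amy") and the names cols and rows are assumptions added for illustration, and the index intersection is only a precaution in case b contains labels that a lacks.

```python
import numpy as np
import pandas as pd

# Hypothetical data: same as the question, plus one row (110) that exists only in a.
a = pd.DataFrame(
    {"name": ["bob", "sue", "dave", "amy"],
     "status": [np.nan, np.nan, "A", np.nan],
     "team": ["red", "blue", np.nan, "green"]},
    index=[100, 101, 105, 110],
)
b = pd.DataFrame(
    {"name": ["bob", "sue", "dave"],
     "status": ["I", "O", "A"],
     "team": ["red", "blue", np.nan]},
    index=[100, 101, 105],
)

cols = ["status", "team"]             # columns to overwrite (illustrative choice)
rows = b.index.intersection(a.index)  # only labels present in both frames

# Assigning a DataFrame through .loc aligns on index and column labels.
a.loc[rows, cols] = b.loc[rows, cols]
```

Rows of a that are absent from b (here 110) are left untouched, and the name column is never written.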

All credit to @Sushanth

>>> a.combine_first(b)
     name status  team
100   bob      I   red
101   sue      O  blue
105  dave      A   NaN
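
combine_first keeps the non-null values already in a and only fills its missing entries from b, matching on index and column labels. Below is a sketch restricted to chosen columns, assuming a hypothetical extra row (index 110) that exists only in a; the names cols and filled are also illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data: same as the question, plus one row (110) that exists only in a.
a = pd.DataFrame(
    {"name": ["bob", "sue", "dave", "amy"],
     "status": [np.nan, np.nan, "A", np.nan],
     "team": ["red", "blue", np.nan, "green"]},
    index=[100, 101, 105, 110],
)
b = pd.DataFrame(
    {"name": ["bob", "sue", "dave"],
     "status": ["I", "O", "A"],
     "team": ["red", "blue", np.nan]},
    index=[100, 101, 105],
)

cols = ["status", "team"]  # columns to fill (illustrative choice)

# Work on a copy so the original a stays available for comparison.
filled = a.copy()
# Fill only the missing entries of the chosen columns from b.
filled[cols] = a[cols].combine_first(b[cols])
```

Because only nulls are filled, a value that is present in a but different in b stays as it is in a, which is the main behavioral difference from the .loc assignment.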
