1

Need to merge a DataFrame with another DataFrame without affecting existing data

df1:

Name Subject mark
a    Ta      52
b    En
c    Ma
d    Ss      60

df2:

Name mark
b    57
c    58

Expected Output:

Name Subject mark
a    Ta      52
b    En      57
c    Ma      58
d    Ss      60

3 Answers

4

Use combine_first after setting Name as index:

df1.set_index('Name').combine_first(df2.set_index('Name')).reset_index()

output:

  Name Subject  mark
0    a      Ta  52.0
1    b      En  57.0
2    c      Ma  58.0
3    d      Ss  60.0
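
For a reproducible version, here is a minimal sketch. It assumes the blank marks in df1 are NaN, which the question does not state explicitly, and the name `out` is only illustrative. Because `mark` holds NaN in df1, the combined column comes back as float; the last line casts it to int to match the expected output, which is safe here only because no missing values remain.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frames; blank marks are taken to be NaN.
df1 = pd.DataFrame({
    "Name": ["a", "b", "c", "d"],
    "Subject": ["Ta", "En", "Ma", "Ss"],
    "mark": [52, np.nan, np.nan, 60],
})
df2 = pd.DataFrame({"Name": ["b", "c"], "mark": [57, 58]})

# combine_first aligns on the index, keeps df1's non-null values,
# and fills the remaining holes from df2.
out = (
    df1.set_index("Name")
       .combine_first(df2.set_index("Name"))
       .reset_index()
)

# Cast back to integers; this would raise if any NaN were left in the column.
out["mark"] = out["mark"].astype(int)
print(out)
```

Existing marks in df1 (52 and 60) are never overwritten, since `combine_first` only fills null positions.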

1

Try using merge and combine_first:

>>> df = df1.merge(df2, on='Name', how='outer')
>>> df['mark'] = df.pop('mark_x').combine_first(df.pop('mark_y'))
>>> df
  Name Subject  mark
0    a      Ta  52.0
1    b      En  57.0
2    c      Ma  58.0
3    d      Ss  60.0

2 Comments

Looks a bit complicated for a simple combine ;) combine_first uses the index as keys (see my answer)
@mozway Ah, you nailed it... nice one +1
0

One way to achieve this is with the following steps:

  1. Left-join the two tables with pandas.merge() (how='left'), so that rows of df1 with no match in df2 are kept. An inner join would drop a and d.
  2. Create a new column that takes the mark from df1 where it is not null, and otherwise takes the mark from df2.
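
A minimal sketch of these two steps, assuming the blank marks in df1 are NaN and using a left join (how='left') so that every row of df1 survives the merge. The `_df2` suffix and the names `merged` and `result` are illustrative choices, not part of the answer.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frames; blank marks are taken to be NaN.
df1 = pd.DataFrame({
    "Name": ["a", "b", "c", "d"],
    "Subject": ["Ta", "En", "Ma", "Ss"],
    "mark": [52, np.nan, np.nan, 60],
})
df2 = pd.DataFrame({"Name": ["b", "c"], "mark": [57, 58]})

# Step 1: left join on Name; df2's mark column gets the "_df2" suffix.
merged = df1.merge(df2, on="Name", how="left", suffixes=("", "_df2"))

# Step 2: keep df1's mark where present, otherwise fall back to df2's mark.
merged["mark"] = merged["mark"].fillna(merged["mark_df2"])
result = merged.drop(columns="mark_df2")
print(result)
```

The `mark` column stays float here because it held NaN before filling; cast it with `astype(int)` if integer marks are needed.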
