Replace data from one pandas dataframe to another

Question

I have two dataframes, df1 and df2. Both contain time-series data, so some dates in df1 and df2 may overlap while others do not. I need an operation that replaces the values in df1 with the values in df2 for the shared dates, leaves untouched the rows of df1 whose index is not present in df2, and adds the rows whose index is present in df2 but not in df1. Consider the following example:

df1:
    A   B   C   D
0   A0  B0  C0  D0
1   A1  B1  C1  D1
2   A2  B2  C2  D2
3   A3  B3  C3  D3

df2:
    A   B   C   E
1   A4  B4  C4  E4
2   A5  B5  C5  E5
3   A6  B6  C6  E6
4   A7  B7  C7  E7

result df:
    A   B   C   D   E
0   A0  B0  C0  D0  NaN
1   A4  B4  C4  D4  E4
2   A5  B5  C5  D5  E5
3   A6  B6  C6  D6  E6
4   A7  B7  C7  D7  E7

I tried starting by concatenating the two dataframes, but that produces rows with duplicate indexes, and I am not sure how to handle them. How can this be achieved? Any suggestions would help.

Edit: A simpler case would be when the column names are the same in the two dataframes. So consider that df2 has column D instead of E, with values D4, D5, D6, D7.

A concatenation yields the following result:

pd.concat([df1, df2], axis=1)
    A    B    C    D    A    B    C    D
0   A0   B0   C0   D0  NaN  NaN  NaN  NaN  
1   A1   B1   C1   D1   A4   B4   C4   D4
2   A2   B2   C2   D2   A5   B5   C5   D5
3   A3   B3   C3   D3   A6   B6   C6   D6
4  NaN  NaN  NaN  NaN   A7   B7   C7   D7

Now this introduces duplicate columns. A conventional solution would be to loop through each column but I am looking for a more elegant solution. Any ideas would be appreciated.

The problem with this setup is that the DataFrames won't align on columns D and E.
– Alexander, Commented May 24, 2015 at 1:11
For simplicity's sake we can ignore column E and assume both dataframes have the same columns. How would this operation be achieved if df2 had column D instead of E, with values D4-D7?
– john smith, Commented May 24, 2015 at 1:16

Alexander · Accepted Answer · 2015-05-24 01:41:14Z

8

update will align on the indices of both DataFrames:

df1.update(df2)

df1:
    A   B   C   D
0   A0  B0  C0  D0
1   A1  B1  C1  D1
2   A2  B2  C2  D2
3   A3  B3  C3  D3

df2:
    A   B   C   D
1   A4  B4  C4  D4
2   A5  B5  C5  D5
3   A6  B6  C6  D6
4   A7  B7  C7  D7

>>> df1.update(df2)  # modifies df1 in place and returns None
>>> df1
    A   B   C   D
0  A0  B0  C0  D0
1  A4  B4  C4  D4
2  A5  B5  C5  D5
3  A6  B6  C6  D6

You then need to add the rows of df2 whose index is not present in df1 (DataFrame.append was removed in pandas 2.0, so pd.concat is used here):

>>> pd.concat([df1, df2.loc[[i for i in df2.index if i not in df1.index], :]])
    A   B   C   D
0  A0  B0  C0  D0
1  A4  B4  C4  D4
2  A5  B5  C5  D5
3  A6  B6  C6  D6
4  A7  B7  C7  D7
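
The two steps above can be put together as a runnable sketch. It rebuilds the example frames from the question (simplified case, both with columns A to D), works on a copy so that df1 itself is left unchanged, and uses `Index.difference` in place of the list comprehension. The variable names `result` and `new_rows` are illustrative only and not part of the original answer.

```python
import pandas as pd

# Example data from the question (simplified case: same columns in both frames).
df1 = pd.DataFrame(
    {
        "A": ["A0", "A1", "A2", "A3"],
        "B": ["B0", "B1", "B2", "B3"],
        "C": ["C0", "C1", "C2", "C3"],
        "D": ["D0", "D1", "D2", "D3"],
    },
    index=[0, 1, 2, 3],
)
df2 = pd.DataFrame(
    {
        "A": ["A4", "A5", "A6", "A7"],
        "B": ["B4", "B5", "B6", "B7"],
        "C": ["C4", "C5", "C6", "C7"],
        "D": ["D4", "D5", "D6", "D7"],
    },
    index=[1, 2, 3, 4],
)

# Step 1: overwrite the overlapping rows. update() works in place and
# returns None, so operate on a copy to keep df1 intact.
result = df1.copy()
result.update(df2)

# Step 2: append the rows of df2 whose index labels are missing from df1.
new_rows = df2.loc[df2.index.difference(result.index)]
result = pd.concat([result, new_rows])

print(result)
```

Note that `update` skips NaN values in df2, so a missing value in df2 will not blank out an existing value in df1.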

edited May 24, 2015 at 1:41

answered May 24, 2015 at 0:57

Alexander


4 Comments

john smith Over a year ago

That definitely solves one part of the problem, Alexander: replacing values in df1 with values in df2 for the same indexes. However, it does not pick up indexes that exist only in df2, such as 4 in the example above. The final result should contain those as well.

Alexander Over a year ago

Amended response to include that as well.

john smith Over a year ago

Yes, that works perfectly, thanks. I figured there was no other way but to loop over the second dataframe to achieve the desired output. Your help is greatly appreciated!

Alexander Over a year ago

You may not have noticed, but you now have enough reputation to upvote the response (-;
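
For the original form of the question, where df2 has column E instead of D, one possible alternative that avoids looping is `DataFrame.combine_first`. This is not what the answer above uses; it is a sketch under the assumption that non-null values in df2 should win and that the union of both indexes and both column sets is wanted. The variable name `combined` is illustrative.

```python
import pandas as pd

# Example data from the original question: df2 has column E, not D.
df1 = pd.DataFrame(
    {
        "A": ["A0", "A1", "A2", "A3"],
        "B": ["B0", "B1", "B2", "B3"],
        "C": ["C0", "C1", "C2", "C3"],
        "D": ["D0", "D1", "D2", "D3"],
    },
    index=[0, 1, 2, 3],
)
df2 = pd.DataFrame(
    {
        "A": ["A4", "A5", "A6", "A7"],
        "B": ["B4", "B5", "B6", "B7"],
        "C": ["C4", "C5", "C6", "C7"],
        "E": ["E4", "E5", "E6", "E7"],
    },
    index=[1, 2, 3, 4],
)

# Values from df2 take priority; gaps are filled from df1. The result
# covers the union of both indexes and both sets of columns.
combined = df2.combine_first(df1)

print(combined)
```

Because df2 has no column D, rows 1 to 3 keep D1 to D3 from df1 and row 4 gets NaN in D; row 0 gets NaN in E.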

Community · Accepted Answer · 2017-05-23 12:22:23Z

2

I just saw this question and realized it is almost identical to one that I just asked today and that @Alexander (the poster of the answer above) answered very nicely:

pd.concat([df1[~df1.index.isin(df2.index)], df2])

See pandas DataFrame concat / update ("upsert")? for the discussion.
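
A minimal runnable sketch of this one-liner, wrapped in a helper for reuse. The function name `upsert` and the trailing `sort_index()` are additions for illustration (sorting is convenient for time-series data) and are not part of the linked answer.

```python
import pandas as pd


def upsert(old: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Return `old` with rows replaced or added from `new`, matched on index.

    Hypothetical helper: keeps the rows of `old` whose index label is absent
    from `new`, then stacks all rows of `new` underneath.
    """
    kept = old[~old.index.isin(new.index)]
    # sort_index() is an extra step, not in the original one-liner.
    return pd.concat([kept, new]).sort_index()


df1 = pd.DataFrame(
    {"A": ["A0", "A1", "A2", "A3"], "D": ["D0", "D1", "D2", "D3"]},
    index=[0, 1, 2, 3],
)
df2 = pd.DataFrame(
    {"A": ["A4", "A5", "A6", "A7"], "D": ["D4", "D5", "D6", "D7"]},
    index=[1, 2, 3, 4],
)

merged = upsert(df1, df2)
print(merged)
```

Unlike `update`, this replaces whole rows: if df2 lacks a column that df1 has, the replaced rows end up with NaN in that column.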

edited May 23, 2017 at 12:22


answered Oct 8, 2015 at 3:38

embeepea

