I need to be able to add the values of two dataframes with the same structure together and form a new dataframe as a result.

e.g. DF1 + DF2 = DF3

DF1
+------------+----+----+----+
|    date    |  A |  B |  C |
+------------+----+----+----+
| 2017-01-01 | 24 | 15 |  4 |
| 2017-01-02 | 31 | 10 | 12 |
| 2017-01-03 |  9 | 47 |  3 |
+------------+----+----+----+

DF2
+------------+----+----+----+
|    date    |  A |  B |  C |
+------------+----+----+----+
| 2017-01-01 |  4 | 12 | 63 |
| 2017-01-02 | 23 |  0 | 31 |
| 2017-01-03 | 61 | 22 | 90 |
+------------+----+----+----+

DF3
+------------+----+----+----+
|    date    |  A |  B |  C |
+------------+----+----+----+
| 2017-01-01 | 28 | 27 | 67 |
| 2017-01-02 | 54 | 10 | 43 |
| 2017-01-03 | 70 | 69 | 93 |
+------------+----+----+----+

I've been trying to work out how to do this, but I'm getting a TypeError:

TypeError: unsupported operand type(s) for +: 'datetime.date' and 'datetime.date'

when trying to do:

df3 = df1.add(df2, fill_value=0)

I'm sure I'm missing something simple. It appears to be trying to add the first columns together, but that column is the date, which I want to use only for matching rows so that the values in all the other columns are added. Any help would be greatly appreciated.

3 Comments
  • If you take out the date column and add it back later on, it should work. Is this a possibility given the structure of your data? It will work in the above case. Commented Apr 11, 2017 at 16:21
  • The dates need to remain linked to the data, otherwise the data isn't any good. Basically, what I need is to add up the corresponding columns based on the date matching between the dataframes. Commented Apr 11, 2017 at 16:26
  • Another option would be using concat, groupby and sum: df3 = pd.concat([df1, df2]).groupby('date').sum().reset_index() Commented Apr 11, 2017 at 16:35
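
The concat/groupby suggestion in the last comment can be sketched as follows. This is a minimal example, assuming the dates are stored as plain strings (the question does not show how the frames were built) and using the sample values from DF1 and DF2; the construction code is illustrative only.

```python
import pandas as pd

# Assumption: dates are ISO-formatted strings; the original frames may hold
# datetime.date objects instead.
df1 = pd.DataFrame({
    "date": ["2017-01-01", "2017-01-02", "2017-01-03"],
    "A": [24, 31, 9],
    "B": [15, 10, 47],
    "C": [4, 12, 3],
})
df2 = pd.DataFrame({
    "date": ["2017-01-01", "2017-01-02", "2017-01-03"],
    "A": [4, 23, 61],
    "B": [12, 0, 22],
    "C": [63, 31, 90],
})

# Stack both frames, then sum every value column within each date group.
df3 = pd.concat([df1, df2]).groupby("date").sum().reset_index()
print(df3)
```

With this approach, a date that appears in only one of the frames simply keeps its own values in the result.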

1 Answer

You want the date columns to be indices, not normal columns:

df3 = df1.set_index('date').add(df2.set_index('date'), fill_value=0).reset_index()

set_index() turns the date column of each dataframe into its index, so add() aligns the rows by date and sums only the value columns. If you don't want the final dataframe to be date-indexed, you can use reset_index() at the end, as @MaxU suggests.
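
For reference, here is a self-contained sketch of the same approach using the sample data from the question. It assumes the date column holds datetime.date objects, as the error message suggests; the way the frames are constructed here is illustrative only.

```python
import datetime as dt

import pandas as pd

# Assumption: the date column contains datetime.date objects.
dates = [dt.date(2017, 1, 1), dt.date(2017, 1, 2), dt.date(2017, 1, 3)]

df1 = pd.DataFrame({"date": dates, "A": [24, 31, 9], "B": [15, 10, 47], "C": [4, 12, 3]})
df2 = pd.DataFrame({"date": dates, "A": [4, 23, 61], "B": [12, 0, 22], "C": [63, 31, 90]})

# Move the dates into the index so add() aligns on them instead of
# trying to add the dates themselves, then restore the date column.
df3 = (
    df1.set_index("date")
    .add(df2.set_index("date"), fill_value=0)
    .reset_index()
)
print(df3)
```

With fill_value=0, a date present in only one dataframe is treated as if the other dataframe had zeros for that date, rather than producing NaN.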

2 Comments

I'd also add .reset_index() at the end ;-)
@ASGM, thanks very much for your help. Adding the indices was the solution I needed :)
