
I have two dataframes:

tickers                        dt  AAPL  AMC  AMZN  ...  TH  TSLA  VIAC  WKHS
0       2021-03-22 00:00:00+00:00     0    0     0  ...   0     1     0     0
1       2021-03-23 00:00:00+00:00     0    0     0  ...   0     1     0     0
2       2021-03-24 00:00:00+00:00     1    2     0  ...   0     0     0     0
3       2021-03-25 00:00:00+00:00     0    0     0  ...   0     0     0     0
4       2021-03-26 00:00:00+00:00     0    2     0  ...   0     4     0     0

tickers                        dt  AAPL  AMC  AMZN  ...  TH  TSLA  VIAC  WKHS
0       2021-03-19 00:00:00+00:00     0    0     0  ...   0     0     0     0
1       2021-03-20 00:00:00+00:00     0    0     0  ...   0     0     0     0
2       2021-03-21 00:00:00+00:00     0    0     0  ...   0     0     0     0
3       2021-03-22 00:00:00+00:00     0    0     0  ...   0     3     0     0
4       2021-03-23 00:00:00+00:00     0    0     0  ...   0     3     0     0

I want to add each row to the corresponding row (the one with the same dt) of the other dataframe. As you can see, some timestamps in one dataframe don't exist in the other. I want to keep those rows as well and include them in the new dataframe.

  • Try this: stackoverflow.com/a/23361783/2612429 Commented Aug 17, 2021 at 9:40
  • Yeah, but I also want to handle the timestamps that don't intersect. Commented Aug 17, 2021 at 9:47

2 Answers


Try this:

df = pd.concat([df1, df2]).groupby(['dt']).sum().reset_index()

print(df)

PS: This ensures that every datetime from either dataframe appears in the result.
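As a rough illustration of the concat-and-groupby approach, here is a minimal sketch on two small made-up frames. The data is hypothetical and only mimics the shape of the frames in the question (a `dt` column plus ticker columns); it is not the asker's data.

```python
import pandas as pd

# Hypothetical miniature versions of the two frames in the question
df1 = pd.DataFrame({
    "dt": pd.to_datetime(["2021-03-22", "2021-03-23", "2021-03-24"], utc=True),
    "AAPL": [0, 0, 1],
    "TSLA": [1, 1, 0],
})
df2 = pd.DataFrame({
    "dt": pd.to_datetime(["2021-03-21", "2021-03-22", "2021-03-23"], utc=True),
    "AAPL": [0, 0, 0],
    "TSLA": [0, 3, 3],
})

# Stack the rows, then add up all rows that share the same dt.
# A dt that appears in only one frame forms a group of one row and is kept.
summed = pd.concat([df1, df2]).groupby("dt").sum().reset_index()

print(summed)
```

In this sketch 2021-03-21 exists only in df2 and 2021-03-24 only in df1, yet both show up in the result, while the shared dates are summed.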


2 Comments

Also, how would I divide one dataframe by the other while keeping all the time rows?
Please ask that as a separate question, since division can produce invalid results when a value is missing from either dataframe.
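To make the concern about division concrete, here is a hedged sketch of one possible way to do it, not a definitive answer. It assumes both frames use `dt` as the key and share the same ticker columns; the data is made up. Index-aligned division keeps the union of timestamps, and any cell that cannot be computed (missing on one side, or 0/0) ends up as NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data shaped like the frames in the question
df1 = pd.DataFrame({
    "dt": pd.to_datetime(["2021-03-22", "2021-03-23", "2021-03-24"], utc=True),
    "AAPL": [0, 0, 1],
    "TSLA": [1, 1, 0],
})
df2 = pd.DataFrame({
    "dt": pd.to_datetime(["2021-03-21", "2021-03-22", "2021-03-23"], utc=True),
    "AAPL": [0, 0, 0],
    "TSLA": [0, 3, 3],
})

a = df1.set_index("dt")
b = df2.set_index("dt")

# div aligns on the union of both indexes; a row missing on either side gives NaN
ratio = a.div(b).sort_index()

# Division of a nonzero value by zero yields +/-inf; mark those as missing too
ratio = ratio.replace([np.inf, -np.inf], np.nan)

print(ratio)
```

Whether NaN is the right placeholder for the undefined cells depends on what the ratio is used for, which is why this is only a sketch.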

One way would be to first merge the dataframes, and then carry out the sum.

df1 = df1.merge(df2, left_on='dt', right_on='dt', how='outer')

Columns that appear in both dataframes get an '_x' or '_y' suffix so you can tell them apart. You can then add the matching columns together as a normal dataframe sum.
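A minimal sketch of the merge-then-sum idea follows. It assumes both frames have the same ticker columns and uses made-up data; the `tickers` list and the `result` frame are names introduced here for illustration only. After an outer merge, cells from the frame that lacks a given dt are NaN, so they are filled with 0 before adding.

```python
import pandas as pd

# Hypothetical data shaped like the frames in the question
df1 = pd.DataFrame({
    "dt": pd.to_datetime(["2021-03-22", "2021-03-23", "2021-03-24"], utc=True),
    "AAPL": [0, 0, 1],
    "TSLA": [1, 1, 0],
})
df2 = pd.DataFrame({
    "dt": pd.to_datetime(["2021-03-21", "2021-03-22", "2021-03-23"], utc=True),
    "AAPL": [0, 0, 0],
    "TSLA": [0, 3, 3],
})

# Outer merge keeps every dt from both frames; shared columns get _x / _y suffixes
merged = df1.merge(df2, on="dt", how="outer", suffixes=("_x", "_y"))

# Assumption: every non-dt column of df1 also exists in df2
tickers = [c for c in df1.columns if c != "dt"]

result = merged[["dt"]].copy()
for t in tickers:
    # Treat a value missing on one side as 0 so the other side passes through
    result[t] = merged[f"{t}_x"].fillna(0) + merged[f"{t}_y"].fillna(0)

result = result.sort_values("dt").reset_index(drop=True)
result[tickers] = result[tickers].astype(int)  # NaN handling made the columns float

print(result)
```

Compared with concatenating and grouping, this route needs the extra fill and suffix handling, but it keeps the two sources side by side in `merged` if you want to inspect them first.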

3 Comments

I tried this, but it removes the row for 2021-03-19, and I want to keep it as well.
In that case, I think you can just set how='outer'. This will take the union of both keys, so they won't be removed.
The dataset doesn't have the rows for 2021-03-19 00:00:00+00:00, 2021-03-20 00:00:00+00:00 and 2021-03-21 00:00:00+00:00.
