2

I would like to add two dataframes together element-wise, so that column 1 is added to column 1, column 2 to column 2, and so on (as in matrix addition over i, j). If a column exists in only one of the dataframes, it should still appear in the result, taken from the dataframe that has it.

The output should be a dataframe that shows the index ['Sun', 'Wind', 'Water', 'Flow'], with columns spanning range(1, 22), i.e. 1 through 21.

All values are currently 0, but if cell 3 of column "2" in dt1 is 200, it should be added to cell 3 of column "2" in dt2, which is 10, for a total of 210.

import pandas as pd

idx = ['Sun', 'Wind', 'Water', 'Flow']

# dt1: columns 1..19, all zeros, one row per index label
cols = range(1, 20)
rows = [[0] * len(cols) for _ in idx]
dt1 = pd.DataFrame(rows, index=idx, columns=cols)
dt1 = dt1.reset_index()

# dt2: columns 3..21, all zeros, one row per index label
cols = range(3, 22)
rows = [[0] * len(cols) for _ in idx]
dt2 = pd.DataFrame(rows, index=idx, columns=cols)
dt2 = dt2.reset_index()


TRIED: 
df = dt1[dt1.columns[1:]].add(dt2[dt2.columns[1:]]).fillna(0)

Perhaps explicit matrix addition with two for loops is the way forward, but I'm not sure how to match up the columns so that the right values end up in the right columns.

1
  • There's no need whatsoever to call dt.reset_index(). pandas can add dataframes that have an index, and you then wouldn't need to slice [1:] to skip the index column, so keep the index as-is. Commented Sep 20, 2021 at 19:34

4 Answers

1

You can get the union of the columns with Index.union(), then reindex both dataframes to that union with .reindex() and a fill value of 0. After that, .add() the two dataframes and call .reset_index(), as follows:

dt1a = dt1.set_index('index')
dt2a = dt2.set_index('index')
all_cols = dt1a.columns.union(dt2a.columns)

dt1b = dt1a.reindex(all_cols, axis=1, fill_value=0)
dt2b = dt2a.reindex(all_cols, axis=1, fill_value=0)

df_out = dt1b.add(dt2b).reset_index()

Data Input

dt1.at[2, 3] = 200

print(dt1)

   index  1  2    3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19
0    Sun  0  0    0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
1   Wind  0  0    0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
2  Water  0  0  200  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
3   Flow  0  0    0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0

dt2.at[2, 3] = 10

print(dt2)

   index   3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21
0    Sun   0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0
1   Wind   0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0
2  Water  10  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0
3   Flow   0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0

Output

print(df_out)


   index  1  2    3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21
0    Sun  0  0    0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0
1   Wind  0  0    0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0
2  Water  0  0  210  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0
3   Flow  0  0    0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0
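
The same union-and-reindex steps can be sketched as a small reusable function. This is only an illustration under stated assumptions: the helper name `add_on_union` and the small frames `small1` and `small2` are made up here, and the frames keep the labels as their index instead of going through reset_index().

```python
import pandas as pd

def add_on_union(left, right):
    """Add two frames over the union of their columns (hypothetical helper)."""
    all_cols = left.columns.union(right.columns)
    # Columns missing from a frame are created and filled with 0
    left_full = left.reindex(columns=all_cols, fill_value=0)
    right_full = right.reindex(columns=all_cols, fill_value=0)
    return left_full.add(right_full)

idx = ['Sun', 'Wind', 'Water', 'Flow']

# Illustrative frames: columns 1..3 and 3..5, overlapping only in column 3
small1 = pd.DataFrame(0, index=idx, columns=range(1, 4))
small2 = pd.DataFrame(0, index=idx, columns=range(3, 6))
small1.at['Water', 3] = 200
small2.at['Water', 3] = 10

result = add_on_union(small1, small2)
print(result)
```

Because both frames are filled with 0 before the addition, no NaN is introduced and the integer dtype is kept.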
1 Comment

Thank you. This was exactly what I was looking for.
0

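I think you could put the labels back as the index and reindex the columns of both dataframes to the full range, filling the missing columns with 0, like this:

dt1 = dt1.set_index('index').reindex(columns=range(1, 22), fill_value=0)

dt2 = dt2.set_index('index').reindex(columns=range(1, 22), fill_value=0)

dt3 = dt1 + dt2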

0

If the two dataframes have the same shape and you want to add them by position, ignoring the column labels:

>>> dt1.iloc[:, 1:].add(dt2.iloc[:, 1:].values)

Or skip reset_index and let pandas align on the index and column labels:

>>> dt1 + dt2
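
One caveat with plain `+`: pandas aligns on both row and column labels, so a column that exists in only one frame comes back as NaN. A minimal sketch of one way around that, assuming the labels are kept as the index, is `DataFrame.add` with `fill_value=0`. The small frames and the names `a`, `b`, `plain` and `total` are made up for illustration.

```python
import pandas as pd

idx = ['Sun', 'Wind', 'Water', 'Flow']

# Illustrative frames: `a` has columns 1..3, `b` has columns 3..5
a = pd.DataFrame(0, index=idx, columns=range(1, 4))
b = pd.DataFrame(0, index=idx, columns=range(3, 6))
a.at['Water', 3] = 200
b.at['Water', 3] = 10

# Plain addition: columns found in only one frame become NaN
plain = a + b

# fill_value=0 substitutes 0 for a value missing from one frame before adding.
# The result can come back as float, so cast to int once no NaN remains.
total = a.add(b, fill_value=0).astype(int)
print(total)
```

Note that `fill_value` only helps where at least one frame has the value; a cell missing from both frames would still be NaN, which does not happen here because every column in the union exists in one of the two frames.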

0

Using difference and intersection, you could copy the columns that exist only in dt2 into dt1, and then add the columns the two have in common. The assumption here is that you want row-wise addition (that is, both datasets share the same row labels), so reset_index is not needed.

import pandas as pd

idx = ['Sun', 'Wind', 'Water', 'Flow']

# dt1: columns 1..19, all zeros, one row per index label
cols = range(1, 20)
rows = [[0] * len(cols) for _ in idx]
dt1 = pd.DataFrame(rows, index=idx, columns=cols)

# dt2: columns 3..21, all zeros, one row per index label
cols = range(3, 22)
rows = [[0] * len(cols) for _ in idx]
dt2 = pd.DataFrame(rows, index=idx, columns=cols)

# Insert new columns from dt2 into dt1 then add common columns
common_columns = dt1.columns.intersection(dt2.columns)
new_columns = dt2.columns.difference(dt1.columns)
dt1[new_columns] = dt2[new_columns]
dt1[common_columns] += dt2[common_columns]
del dt2
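
Two side effects of this approach are worth keeping in mind: it modifies dt1 in place, and newly inserted columns land at the end, which only happens to be the right position here because 20 and 21 sort last. A hedged variant that works on a copy and re-sorts the columns might look like the sketch below; the frames `left` and `right` and their values are invented for illustration.

```python
import pandas as pd

idx = ['Sun', 'Wind', 'Water', 'Flow']

# Illustrative frames: `right` has a column (1) that sorts before `left`'s
left = pd.DataFrame(1, index=idx, columns=[2, 3])
right = pd.DataFrame(5, index=idx, columns=[1, 3])

out = left.copy()                      # leave `left` untouched
common = left.columns.intersection(right.columns)
new = right.columns.difference(left.columns)
out[new] = right[new]                  # column 1 is appended after 2 and 3
out[common] += right[common]           # only column 3 is shared
out = out.sort_index(axis=1)           # restore ascending column order
print(out)
```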
