0

I have a LONG list of dataframes (with many columns ...) like so:

DF1
    nameOne     total_one       total_two       nameTwo    nameThree    ...
0       nn1             2               3           nn4          nn7
1       nn2             2               3           nn5          nn8
2       nn3             2               3           nn6          nn9

DF2
    nameOne     total_one       total_two      nameTwo    nameThree.    ....
0       nn1             4               6          nn4          nn7
1       nn2             4               6          nn5          nn8
2       nn3             4               6          nn6          nn9

What I want the result to look like:

RESULT_DF
    nameOne     total_one       total_two   nameTwo    nameThree.   ....
0       nn1             6               9       nn4          nn7
1       nn2             6               9       nn5          nn8
2       nn3             6               9       nn6          nn9

In other words, I just want to add the number columns.

I tried doing this: DF1.add(DF2, fill_value=0)

But I get this error: TypeError: unsupported operand type(s) for +: 'int' and 'str'

For brevity, here I only showed one Str column. But I have many more STR columns, and many more int/float columns.

Explanation: Please note this is not the same as THIS question because I mentioned that I have multiple string fields.

5
  • Does this answer your question? Pandas sum multiple dataframes Commented Jul 16, 2020 at 19:28
  • That's what I tried above. It does not handle the case where you have strings since it does not know how to add them. Commented Jul 16, 2020 at 19:29
  • That's precisely what the second option does, you need to set nameOne as the index Commented Jul 16, 2020 at 19:31
  • I am not following. Commented Jul 16, 2020 at 19:33
  • cols = ['total_one', 'total_two']; df = df1.copy(); df[cols] = df1[cols] + df2[cols] Commented Jul 16, 2020 at 19:33

2 Answers 2

2

You can try using select_dtypes and pd.concat:

pd.concat([df1.select_dtypes(include='number').add(df2.select_dtypes(include='number')),
           df1.select_dtypes(exclude='number'), 
           df2.select_dtypes(exclude='number')], axis=1)

Output:

   total_one  total_two nameOne nameOne
0          6          9     nn1     nn1
1          6          9     nn2     nn2
2          6          9     nn3     nn3

Adding list of dataframes

df3 = df1.copy()
df4 = df2.copy()

lofdfs = [df1, df2, df3, df4]

from functools import reduce
df_add = reduce(lambda x,y: x.select_dtypes(include='number').add(y.select_dtypes('number')), lofdfs)
pd.concat([i.select_dtypes(exclude='number') for i in lofdfs]+[df_add], axis=1, keys=['df1','df2','df3','df4','df_add'])

Output:

      df1     df2     df3     df4    df_add          
  nameOne nameOne nameOne nameOne total_one total_two
0     nn1     nn1     nn1     nn1        12        18
1     nn2     nn2     nn2     nn2        12        18
2     nn3     nn3     nn3     nn3        12        18

Moving totals to the front:

df_add = reduce(lambda x,y: x.select_dtypes(include='number').add(y.select_dtypes('number')), lofdfs)
print(pd.concat([df_add]+[i.select_dtypes(exclude='number') for i in lofdfs], axis=1, keys=['df1','df2','df3','df4','df_add']))

Output:

        df1               df2     df3     df4  df_add
  total_one total_two nameOne nameOne nameOne nameOne
0        12        18     nn1     nn1     nn1     nn1
1        12        18     nn2     nn2     nn2     nn2
2        12        18     nn3     nn3     nn3     nn3
Sign up to request clarification or add additional context in comments.

6 Comments

Scott - how would i do this for a list of dataframes?
@DonCode Not sure if this is what you needed, but see update.
Ålmost -- i am mostly interested in the totals - not so much on the names. I would like to have them in the end result but not necessary.
@DonCode Just move the df_add in the pd.concat first.
Can we do it somehow without specifying the keys=['df1','df2','df3','df4','df_add'] ? My list has many elements so this would not work.
|
0

Another way use df.filter to filter out DF2's numeric columns. Add them to DF1. What results is aggregated results of numeric columns and NaNs for the str. Chain combine_first and this updates the NaNS

Option1; 2 dataframes

If have only two datframes

RESULT_DF=(DF2.filter(like='total').add(DF1)).combine_first(DF1)
RESULT_DF

   nameOne nameThree nameTwo  total_one  total_two
0     nn1       nn7     nn4          6          9
1     nn2       nn8     nn5          6          9
2     nn3       nn9     nn6          6          9

Option2 list of datframes

If you have lists; select non numeric dtypes and join to summation after using reduce in from functools

DF3 = DF2.copy()
DF4 = DF1.copy()
LST=[DF1,DF2,DF3]

from functools import reduce
df = DF1.select_dtypes(exclude='int64').join(reduce(lambda x, y:\
(x.filter(like='total')).add(y.filter(like='total'), fill_value=0), LST))
print(df)



  nameOne nameTwo nameThree  total_one  total_two
0     nn1     nn4       nn7         10         15
1     nn2     nn5       nn8         10         15
2     nn3     nn6       nn9         10         15

1 Comment

Does this help?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.