4

Problem

df1 = pd.DataFrame({"A":[12, 4, 5, 44, 1], 
                   "B":[5, 2, 54, 3, 2] }) 

df2 = pd.DataFrame({"A":[20, 16, 7, 3, 8], 
                   "B":[14, 3, 17, 2, 6]})

Given two dataframes, df1 and df2, I want to create a third dataframe, df3 which contains the maximum values of the two dataframes.

df3 =pd.DataFrame({"A":[20, 16, 7, 44, 8], 
                   "B":[14, 3, 54, 3, 6]}) 

Attempt

I created two temp dataframes with both columns of A from df1 and df2 into a numpy array, then found each maximum value. The same process was repeated for B. I then combined both arrays for A and B to get df3. However, I feel this is not elegant and I want a more efficient method for accomplishing this task.

Thank you @beny @Romero_91 @cameron-riddell ... your solutions are very elegant and simpler than mine! I knew I was missing something!!

3 Answers 3

3

You can use the dataframe method combine to perform an elementwise maximum:

import numpy as np

df3 = df1.combine(df2, np.maximum)
print(df3)
    A   B
0  20  14
1  16   3
2   7  54
3  44   3
4   8   6

As pointed out by @anky, np.maximum on its own performs this elemnt-wise comparison. It's always good to remember those pure numpy solutions especially when they lead to such clean & simple code.

df3 = np.maximum(df1, df2)
print(df3)
    A   B
0  20  14
1  16   3
2   7  54
3  44   3
Sign up to request clarification or add additional context in comments.

Comments

2

Let us do

out = df2.mask(df1>df2,df1)
Out[141]: 
    A   B
0  20  14
1  16   3
2   7  54
3  44   3
4   8   6

Comments

1

You can use the where method of numpy as:

import numpy as np

df3 = pd.DataFrame(np.where(df1>df2, df1, df2), columns=['A', 'B'])

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.