Imagine I have two pandas DataFrames:

import pandas as pd

df1 = {'y1': [1, 2, 3, 4]}
df2 = {'y2': [3, 1, 2, 6]}

If a value in y2 is greater than the corresponding value in y1, I want to set that value in df2['y2'] to the corresponding df1['y1']. When I try selecting the corresponding rows like:

df2[df2['y2'] > df1['y1']]

This returns True rather than the index. I was hoping to do something like:

df2[df2['y2'] > df1['y1']]['y2'] = df1['y1'] 

3 Answers

2

Use numpy.where:

In [233]: import numpy as np

In [234]: df1 = pd.DataFrame({'y1': [1, 2, 3, 4]})
In [236]: df2 = pd.DataFrame({'y2': [3, 1, 2, 6]})

In [242]: df2['y2'] = np.where(df2.y2.gt(df1.y1), df1.y1, df2.y2)

In [243]: df2
Out[243]: 
   y2
0   1
1   1
2   2
3   4
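
A self-contained sketch of the same approach, for running outside IPython. It also illustrates why wrapping the data in pd.DataFrame matters: with the plain dicts from the question, `>` compares two Python lists lexicographically and yields a single bool, which would explain the `True` reported there. The names `raw1`, `raw2`, `list_comparison`, and `mask` are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

raw1 = {'y1': [1, 2, 3, 4]}
raw2 = {'y2': [3, 1, 2, 6]}

# Plain lists compare lexicographically, giving one bool (3 > 1 decides it).
list_comparison = raw2['y2'] > raw1['y1']

df1 = pd.DataFrame(raw1)
df2 = pd.DataFrame(raw2)

# On Series, the comparison is elementwise and yields a boolean mask.
mask = df2['y2'].gt(df1['y1'])

# Take y1 where the mask is True, otherwise keep y2.
df2['y2'] = np.where(mask, df1['y1'], df2['y2'])
print(df2)
```

Computing the mask once and passing it to np.where keeps the condition readable and easy to inspect on its own.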

2

If both DataFrames have the same index:

Use DataFrame.loc:

df2.loc[df2['y2'] > df1['y1'], 'y2'] = df1['y1']
print(df2)
   y2
0   1
1   1
2   2
3   4
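
The "same index" condition matters because pandas matches rows by label. As a hedged sketch of one way to handle frames whose rows correspond by position but carry different labels, the comparison can be done on the underlying NumPy arrays. The index values 10 to 13 and the name `other` are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'y1': [1, 2, 3, 4]})
# Hypothetical: same row order as df1, but different index labels.
other = pd.DataFrame({'y2': [3, 1, 2, 6]}, index=[10, 11, 12, 13])

# Compare by position: to_numpy() drops the labels.
y1 = df1['y1'].to_numpy()
mask = other['y2'].to_numpy() > y1

# A boolean array works as a row selector in .loc; assign the matching y1 values.
other.loc[mask, 'y2'] = y1[mask]
print(other)
```

This only makes sense when both frames have the same length and row order; otherwise the rows should be matched explicitly first.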

Or Series.where or Series.mask (either line alone is enough, both give the same result):

df2['y2'] = df1['y1'].where(df2['y2'].gt(df1['y1']), df2['y2'])
df2['y2'] = df2['y2'].mask(df2['y2'].gt(df1['y1']), df1['y1'])
print(df2)
   y2
0   1
1   1
2   2
3   4
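
The two methods are mirror images, which a small sketch can make concrete: `where` keeps the caller's values where the condition is True and fills from the second argument elsewhere, while `mask` replaces the caller's values where the condition is True. The names `cond`, `via_where`, and `via_mask` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'y1': [1, 2, 3, 4]})
df2 = pd.DataFrame({'y2': [3, 1, 2, 6]})

cond = df2['y2'].gt(df1['y1'])  # True where y2 exceeds y1

# where: keep y1 where cond is True, otherwise take y2.
via_where = df1['y1'].where(cond, df2['y2'])

# mask: replace y2 with y1 where cond is True, otherwise keep y2.
via_mask = df2['y2'].mask(cond, df1['y1'])

print(via_where.tolist(), via_mask.tolist())
```

Both results hold the same values; they differ only in the Series name they carry (y1 for the `where` version, y2 for the `mask` version), which does not matter once the result is assigned back to df2['y2'].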


2

np.minimum

Keep everything in the existing df2, but with updated values in column 'y2' (assign returns a new DataFrame):

df2.assign(y2=np.minimum(df1.y1, df2.y2))

   y2
0   1
1   1
2   2
3   4

Or build a new DataFrame with just that one column:

pd.DataFrame({'y2': np.minimum(df1.y1, df2.y2)})

   y2
0   1
1   1
2   2
3   4
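
As a sketch of how this behaves: `np.minimum` takes the elementwise smaller value, and `assign` returns a new DataFrame rather than changing df2, so the result has to be bound to a name to keep it. The names `smaller` and `capped` are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'y1': [1, 2, 3, 4]})
df2 = pd.DataFrame({'y2': [3, 1, 2, 6]})

# Elementwise minimum of the two columns.
smaller = np.minimum(df1['y1'], df2['y2'])

# assign returns a copy with y2 replaced; df2 itself is left unchanged.
capped = df2.assign(y2=smaller)
print(capped)
```

To update df2 itself, the result can be rebound with `df2 = df2.assign(...)`.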
