I have a pandas dataframe as below:

import pandas as pd
import numpy as np
df = pd.DataFrame({'ORDER':["A", "A", "A", "A", "B","B"], 'A':[80, 23, np.nan, 60, 1,22], 'B': [80, 55, 5, 76, 67,np.nan]})
df

  ORDER     A     B
0     A  80.0  80.0
1     A  23.0  55.0
2     A   NaN   5.0
3     A  60.0  76.0
4     B   1.0  67.0
5     B  22.0   NaN

I want to create a column "new" as follows: if ORDER == 'A', then new = df['A']; if ORDER == 'B', then new = df['B'].

This can be achieved using the below code:

df['new'] = np.where(df['ORDER'] == 'A', df['A'],  np.nan)
df['new'] = np.where(df['ORDER'] == 'B', df['B'],  df['new'])

The catch is that if ORDER never has the value "B", then column B will not be present in the dataframe, so the dataframe might look like the one below. If we run the above code on this dataframe, it raises a KeyError because column "B" is missing.

  ORDER     A
0     A  80.0
1     A  23.0
2     A   NaN
3     A  60.0
4     A   1.0
5     A  22.0
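
For reference, here is a minimal reproduction of the failure, assuming the frame is built without column "B" at all. The variable `error` is only there for illustration.

```python
import numpy as np
import pandas as pd

# Frame where ORDER never takes the value "B", so column "B" was never created.
df = pd.DataFrame({
    'ORDER': ["A", "A", "A", "A", "A", "A"],
    'A': [80, 23, np.nan, 60, 1, 22],
})

# The first step works because column "A" exists.
df['new'] = np.where(df['ORDER'] == 'A', df['A'], np.nan)

try:
    # df['B'] is evaluated eagerly, before np.where inspects the condition,
    # so this fails even though no row has ORDER == 'B'.
    df['new'] = np.where(df['ORDER'] == 'B', df['B'], df['new'])
    error = None
except KeyError as exc:
    error = exc

print(repr(error))
```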

1 Answer

Use DataFrame.lookup, so you don't need to hardcode df['B']; it looks up each row's value in the column named by ORDER (available only in pandas versions before 2.0):

df['new'] = df.lookup(df.index, df['ORDER'])

  ORDER     A     B   new
0     A  80.0  80.0  80.0
1     A  23.0  55.0  23.0
2     A   NaN   5.0   NaN
3     A  60.0  76.0  60.0
4     B   1.0  67.0  67.0
5     B  22.0   NaN   NaN
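
Since DataFrame.lookup no longer exists from pandas 2.0 onward, a rough equivalent can be sketched with pd.factorize and NumPy indexing. This is only a sketch: the helper name `lookup_by_label` is made up for illustration, and it assumes ORDER contains no missing values. Reindexing the columns also means a label with no matching column yields NaN instead of raising an error.

```python
import numpy as np
import pandas as pd

def lookup_by_label(frame, label_col):
    """Hypothetical helper: for each row, take the value from the column named in label_col."""
    # idx holds an integer code per row; cols holds the distinct labels.
    idx, cols = pd.factorize(frame[label_col])
    # reindex keeps only the labelled columns and adds an all-NaN column
    # for any label that has no matching column in the frame.
    values = frame.reindex(cols, axis=1).to_numpy()
    # Pick one value per row: row i, column idx[i].
    return values[np.arange(len(frame)), idx]

df = pd.DataFrame({
    'ORDER': ["A", "A", "A", "A", "B", "B"],
    'A': [80, 23, np.nan, 60, 1, 22],
    'B': [80, 55, 5, 76, 67, np.nan],
})
df['new'] = lookup_by_label(df, 'ORDER')

# Same call on a frame where column "B" is absent and ORDER is always "A".
df_no_b = pd.DataFrame({
    'ORDER': ["A"] * 6,
    'A': [80, 23, np.nan, 60, 1, 22],
})
df_no_b['new'] = lookup_by_label(df_no_b, 'ORDER')

print(df)
print(df_no_b)
```

On the first frame this should reproduce the "new" column shown above; on the second frame "new" is simply a copy of column A.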