1

I want to map one column of dataframe to another dataframe by using multiple columns. The sample dataframes are as follow:

df1 = pd.DataFrame()
df1['Date'] = ['2018-08-10','2018-08-10','2018-08-10','2018-08-10','2018-08-10',
               '2018-08-11','2018-08-11','2018-08-11','2018-08-12','2018-08-12',
               '2018-08-13','2018-08-12','2018-08-12','2018-08-12','2018-08-12']
df1['Name'] = ['A','A','A','A','A',
              'B','B','B','C','C',
              'D','D','E','F','F']
df1 ['Orgz'] = ['Alpha','Alpha','Alpha','Alpha','Alpha',
               'Beta','Beta','Beta','Gamma','Gamma',
               'Theta','Theta','Theta','Theta','Theta']
df2 = pd.DataFrame()
df2['Date'] = ['2018-08-10','2018-08-11','2018-08-12','2018-08-12','2018-08-13','2018-08-12','2018-08-12']
df2['Name'] = ['A','B','C','D','D','E','F']
df2 ['Orgz']= ['Alpha','Beta','Gamma','Theta','Theta','Theta','Theta']
df2 ['min'] = [60,85,90,77,25,20,30]

I want to map the column df2.min to df1 by creating a new column. However, I have three common columns which are Date, Name and Orgz to Map. In this case, I cannot just use Name to map it as there will be the same name at different date in my actual database. So I decided to use three columns (Date,Orgz and Name) to map with df2. The result will be assinged to a new column in df1. My actual result will look like as follow:

enter image description here

From the results, you will notice that some of the name are multiple at the same date. in that case, I just want to map the min value on the first row, the rest of the row are set to 0.

How can I do it?

Thanks, Zep

1 Answer 1

1

I believe you need merge with left join and then set 0 by numpy.where with boolean mask created by DataFrame.duplicated:

df3 = df1.merge(df2, on=['Date','Name','Orgz'], how='left')
df3['min'] = np.where(df3.duplicated(subset=['Date','Name','Orgz']), 0, df3['min'])
print (df3)
          Date Name   Orgz  min
0   2018-08-10    A  Alpha   60
1   2018-08-10    A  Alpha    0
2   2018-08-10    A  Alpha    0
3   2018-08-10    A  Alpha    0
4   2018-08-10    A  Alpha    0
5   2018-08-11    B   Beta   85
6   2018-08-11    B   Beta    0
7   2018-08-11    B   Beta    0
8   2018-08-12    C  Gamma   90
9   2018-08-12    C  Gamma    0
10  2018-08-13    D  Theta   77
11  2018-08-13    D  Theta    0
12  2018-08-12    E  Theta   25
13  2018-08-12    F  Theta   30
14  2018-08-12    F  Theta    0
Sign up to request clarification or add additional context in comments.

4 Comments

Hi Jez, Thanks a lot. Appreciated your help.
When I applied the same concept to my actual data, it showed key error
@Zephyr - keyerror means there are different columns names, check it.
Yeah checking now

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.