2

I tried to map these two dataframes but I failed. Maybe because the column names and it's value is a little bit different.

I wanted to create a new dataframe like in the dfNew.

df

Employee ID Employee Name   Activity Month
A0001       John Smith      Apr-19
A0002       Will Cornor     Apr-19
A0001       John Smith      May-19
A0003       David Teo       May-19
A0001       John Smith      May-19
A0002       Will Cornor     Jun-19
A0001       John Smith      Jun-19

df2

Month       Bonus
2019-04-01  5000
2019-05-01  4000
2019-06-01  6000

dfNew

Employee ID Employee Name   Activity Month  Bonus
A0001       John Smith      Apr-19          5000
A0002       Will Cornor     Apr-19          5000
A0001       John Smith      May-19          4000
A0003       David Teo       May-19          4000
A0001       John Smith      May-19          4000
A0002       Will Cornor     Jun-19          6000
A0001       John Smith      Jun-19          6000

2 Answers 2

4

Use Series.dt.strftime fr change format of datetimes, so possible Series.map:

s = df2.set_index(df2['Month'].dt.strftime('%b-%y'))['Bonus']
df1['Bonus'] = df1['Activity Month'].map(s)
print (df1)
  Employee     ID Employee Name Activity Month  Bonus
0    A0001   John         Smith         Apr-19   5000
1    A0002   Will        Cornor         Apr-19   5000
2    A0001   John         Smith         May-19   4000
3    A0003  David           Teo         May-19   4000
4    A0001   John         Smith         May-19   4000
5    A0002   Will        Cornor         Jun-19   6000
6    A0001   John         Smith         Jun-19   6000

Or use DataFrame.merge with DataFrame.pop for new column with remove original:

df2['Activity Month'] = df2.pop('Month').dt.strftime('%b-%y')
df1 = df1.merge(df2, on='Activity Month', how='left')
Sign up to request clarification or add additional context in comments.

4 Comments

I got an error about the datatype. Can u help me fix it? ... AttributeError: Can only use .dt accessor with datetimelike values
I tried to convert it to datetime but then this one came out ---- ValueError: time data 'Apr-19' does not match format '%d%b%Y:%H:%M:%S.%f' (match)
@FirdhausSaleh - so there is same format Apr-19 in both DataFrames? Then remove .dt.strftime('%b-%y') in both solutions.
Or convert both datetimes to same format like df1['Activity Month'] = pd.to_datetime(df1['Activity Month'], format='%b-%y').dt.strftime('%b-%y') and df2['Month'] = pd.to_datetime(df2['Month'], format='%b-%y').dt.strftime('%b-%y')
0

This is the answer for now based on @jezrael suggestion

df1['Activity Month'] = pd.to_datetime(df1['Activity Month'], format='%b-%y').dt.strftime('%b-%y')

df2['Month'] = pd.to_datetime(df2['Month'], format='%Y-%m-%d').dt.strftime('%b-%y')

df2['Activity Month'] = df2.pop('Month')
df1 = df1.merge(df2, on='Activity Month', how='left')

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.