I have two data frames: one with a userId and gender for each user, and another that holds the online activity of these users.
First data frame (df1)
userId, gender
001, F
002, M
003, F
004, M
005, M
006, M
Second data frame (df2)
userId, itemClicked, ItemBought, date
001, 123182, 123212, 02/02/2016
003, 234256, 123182, 05/02/2016
005, 986834, 234256, 04/19/2016
004, 787663, 787663, 05/12/2016
020, 465738, 465738, 03/20/2016
004, 787223, 787663, 07/12/2016
I want to add a gender column to the second data frame by looking up each userId in the first data frame. df2 may have multiple rows per user, since it is click data and the same user may have clicked multiple items.
This is very easy to do in MySQL, but I am trying to figure out how to do it in pandas. Here is my attempt so far:
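import pdb

for index, row in df2.iterrows():
    user_id = row['userId']
    # `in` on a Series tests the index, so compare against the values
    if user_id in df1['userId'].values:
        # rows of df1 that match this user
        t = df1.loc[df1['userId'] == user_id]
        pdb.set_trace()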
Is this the pandas way to do such a task?
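For comparison, the lookup described above resembles a SQL LEFT JOIN, which pandas exposes through DataFrame.merge. The following is only a minimal sketch of that idea, not a confirmed answer: it rebuilds the sample tables by hand, assumes userId is stored as a string so that leading zeros survive, and the name `merged` is just an illustration.

```python
import pandas as pd

# Sample data copied from the tables above.
# Assumption: userId is kept as a string so "001" does not become 1.
df1 = pd.DataFrame({
    "userId": ["001", "002", "003", "004", "005", "006"],
    "gender": ["F", "M", "F", "M", "M", "M"],
})
df2 = pd.DataFrame({
    "userId": ["001", "003", "005", "004", "020", "004"],
    "itemClicked": [123182, 234256, 986834, 787663, 465738, 787223],
    "ItemBought": [123212, 123182, 234256, 787663, 465738, 787663],
    "date": ["02/02/2016", "05/02/2016", "04/19/2016",
             "05/12/2016", "03/20/2016", "07/12/2016"],
})

# Left join: keep every row of df2 and look up gender in df1 by userId.
# Users missing from df1 (here "020") get NaN in the gender column.
merged = df2.merge(df1, on="userId", how="left")

# Alternative: map each userId through a userId -> gender Series.
# This relies on userId being unique in df1.
df2["gender"] = df2["userId"].map(df1.set_index("userId")["gender"])

print(merged)
```

Both variants keep all six rows of df2, including the repeated user 004. With how="inner" instead, the row for user 020 would be dropped because that user has no entry in df1.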