I am trying to join two pandas data frames with an inner join.
my_df = pd.merge(df1, df2, how = 'inner', left_on = ['date'], right_on = ['myDate'])
However, I am getting the following errors:
KeyError: 'myDate'
TypeError: an integer is required
I believe joining on dates is valid, but I cannot make this simple join work.
df2 was created using the following:
df2 = idf.groupby(lambda x: (x.year,x.month,x.day)).mean()
Can someone please advise? Thanks a lot.
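For context, the groupby call above keys each group by a (year, month, day) tuple, which is why myDate ends up holding tuples rather than dates. A minimal sketch of one alternative is to group on the index truncated to midnight, so the keys stay as timestamps. This assumes idf has a DatetimeIndex, which the question does not show; the idf contents below are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for idf: assumed to be indexed by timestamps.
idf = pd.DataFrame(
    {"val1": [1.0, 3.0, 10.0]},
    index=pd.to_datetime(
        ["2015-07-15 08:00", "2015-07-15 20:00", "2015-07-16 09:00"]
    ),
)

# Group on the index truncated to midnight, so each key is a Timestamp
# instead of a (year, month, day) tuple.
by_day = idf.groupby(idf.index.normalize()).mean()

# Name the index and move it into an ordinary column to merge on.
by_day.index.name = "myDate"
df2 = by_day.reset_index()

print(df2)
```

With this approach myDate is a datetime column, so it can be compared directly with a parsed date column in another frame.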
df1
type      object
id        object
date      object
value    float64

   type        id      date  value
0   CAR  PSTAT001  15/07/15     42
1  BIKE  PSTAT001  16/07/15     42
2  BIKE  PSTAT001  17/07/15     42
3  BIKE  PSTAT004  18/07/15     42
4  BIKE  PSTAT001  19/07/15     32
df2
myDate     object
val1      float64
val2      float64
val3      float64

        myDate         val1         val2         val3
0  (2015,7,13)         1074  1871.666667  2800.777778
1  (2015,7,14)   347.958333   809.416667  1308.458333
2  (2015,7,15)      202.625      597.375  1008.666667
3  (2015,7,16)   494.958333         1192  1886.916667
df1.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3040 entries, 0 to 3039
Data columns (total 4 columns):
type 3040 non-null object
id 3040 non-null object
date 3040 non-null object
value 3040 non-null float64
dtypes: float64(1), object(3)
memory usage: 118.8+ KB
df2.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 16 entries, 0 to 15
Data columns (total 4 columns):
myDate 16 non-null object
val1 16 non-null float64
val2 16 non-null float64
val3 16 non-null float64
dtypes: float64(3), object(1)
memory usage: 640.0+ bytes
df2['myDate'] looks like a tuple of ints. Can you post the output from df1.info() and df2.info()?
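Following up on the tuple observation, one possible fix is to convert both key columns to real datetimes before merging. This is a sketch under assumptions: the strings in df1['date'] are taken to be day/month/two-digit-year, and each myDate value is taken to be a (year, month, day) tuple of ints, as the sample rows suggest. The frames below are cut-down copies of the posted data.

```python
import pandas as pd

# Cut-down copies of the frames shown in the question.
df1 = pd.DataFrame({
    "type": ["CAR", "BIKE", "BIKE"],
    "id": ["PSTAT001", "PSTAT001", "PSTAT001"],
    "date": ["15/07/15", "16/07/15", "17/07/15"],
    "value": [42.0, 42.0, 42.0],
})
df2 = pd.DataFrame({
    "myDate": [(2015, 7, 13), (2015, 7, 14), (2015, 7, 15), (2015, 7, 16)],
    "val1": [1074.0, 347.958333, 202.625, 494.958333],
})

# Parse the dd/mm/yy strings (format is an assumption based on the sample).
df1["date"] = pd.to_datetime(df1["date"], format="%d/%m/%y")

# Turn each (year, month, day) tuple into a Timestamp.
df2["myDate"] = df2["myDate"].apply(
    lambda t: pd.Timestamp(year=t[0], month=t[1], day=t[2])
)

# Both keys are now datetimes, so the inner join can match rows.
my_df = pd.merge(df1, df2, how="inner", left_on="date", right_on="myDate")
print(my_df)
```

Only the dates present in both frames survive the inner join. If the KeyError persists on the real data, it may be worth checking that myDate is an actual column of df2 and not its index.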