Hello, I am trying to convert the dates in my dataframe into a format I can use to extract useful information. The dataset comes with a 'week' feature in the form DD/MM/YY, as follows:
In [128]: df_train[['week', 'units_sold']]
Out[128]:
week units_sold
0 17/01/11 20
1 17/01/11 28
2 17/01/11 19
3 17/01/11 44
4 17/01/11 52
I have changed the dates as follows:
df_train['new_date'] = pd.to_datetime(df_train['week'])
new_date units_sold
0 2011-01-17 20.0
1 2011-01-17 28.0
2 2011-01-17 19.0
3 2011-01-17 44.0
4 2011-01-17 52.0
Using the 'new_date' feature I created, I did the following for some information extraction:
df_train['weekday'] = df_train['new_date'].dt.weekofyear #week number of the year
df_train['QTR'] = df_train['new_date'].apply(lambda x: x.quarter) #current quarter of the year
df_train['month'] = df_train['new_date'].apply(lambda x: x.month) #current month
df_train['year'] = df_train['new_date'].dt.year #current year
However, when reviewing my data I found some errors. For example, one date in my dataset is 07/02/11, which should give a month of 2, but my parsing gives a month of 7, which I know is incorrect (see entry 3483):
Out[127]:
week month
18 17/01/11 1
1173 24/01/11 1
2328 31/01/11 1
3483 07/02/11 7
4638 14/02/11 2
Can anyone tell me where I went wrong? Any help is appreciated!
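A possible explanation, offered as a sketch rather than a confirmed diagnosis: when no format is given, pd.to_datetime may read ambiguous strings month-first (the exact behaviour depends on the pandas version). A string like 17/01/11 only makes sense day-first, so it parses as intended, but 07/02/11 can be read as 2 July. Assuming every value in 'week' really is DD/MM/YY, passing the format explicitly removes the guesswork. The sample frame below is a stand-in built from the rows shown above, not the full dataset.

```python
import pandas as pd

# Stand-in sample using the 'week' values shown in the question (DD/MM/YY strings).
df_train = pd.DataFrame(
    {"week": ["17/01/11", "24/01/11", "31/01/11", "07/02/11", "14/02/11"]}
)

# State the layout explicitly so ambiguous strings such as 07/02/11
# are read as day/month/year instead of being guessed.
df_train["new_date"] = pd.to_datetime(df_train["week"], format="%d/%m/%y")

# Extract the month with the vectorised .dt accessor.
df_train["month"] = df_train["new_date"].dt.month

print(df_train[["week", "new_date", "month"]])
```

If the column turns out to mix layouts, pd.to_datetime(df_train['week'], dayfirst=True) is a looser alternative, though an explicit format is stricter and fails loudly on values that do not match.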