I have a column in a pandas DataFrame called 'Excel_Date'. The column data looks like this:
Excel_Date
Before Q1 2018
Before Q1 2014
Before Q4 2018
42457
42457
42520
nan
nan
The column's dtype is dtype('O'): it mixes text labels with what appear to be Excel serial date numbers.
I don't know how to convert this column into proper dates.
Desired Output
Excel_Date
Before Q1 2018 #Or even better: the first month and day of Q1 (1/1/2018)
Before Q1 2014 #Or even better: the first month and day of Q1 (1/1/2014)
Before Q4 2018 #Or even better: the first month and day of Q4 (10/1/2018)
3/28/2016
3/28/2016
5/30/2016
nan
nan
The '#Or even better: ...' variant in the example would be amazing, but I understand that it could be a bit more difficult.
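To make the '#Or even better' variant concrete, here is a minimal sketch of mapping a 'Before Qn YYYY' label to the first day of that quarter. The helper name `quarter_start`, the pattern `QUARTER_RE`, and the sample Series are hypothetical and only illustrate the idea; rows without a quarter label simply come back as NaT.

```python
import re

import pandas as pd

# Hypothetical pattern: a quarter digit 1-4, whitespace, then a four-digit year.
QUARTER_RE = re.compile(r"Q([1-4])\s+(\d{4})")


def quarter_start(value):
    """Return the first day of the quarter named in value, or NaT if none is found."""
    match = QUARTER_RE.search(str(value))
    if match is None:
        return pd.NaT
    quarter, year = int(match.group(1)), int(match.group(2))
    # Quarters start in months 1, 4, 7 and 10.
    return pd.Timestamp(year=year, month=3 * (quarter - 1) + 1, day=1)


# Sample values taken from the column shown above.
sample = pd.Series(["Before Q1 2018", "Before Q1 2014", "Before Q4 2018", "42457", None])
starts = sample.map(quarter_start)
```

The result holds timestamps rather than formatted text; something like `starts.dt.strftime('%m/%d/%Y')` could be used if a string form is needed.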
What have I tried?
I tried to divide the problem into smaller subproblems:
1. Create a column with only the numeric values
> df['Excel_Date2'] = df['Excel_Date'].str.extract(r"(\d*\.?\d+)", expand=True)
2. After that, I tried to convert the numbers into dates, but this failed.
> import datetime as dt
> import pandas as pd
> pd.TimedeltaIndex(df['Excel_Date2'], unit='d') + dt.datetime(1899, 12, 30)
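Two things probably get in the way of the attempt above: the extract pattern also matches the "1" in "Before Q1 2018", and the extracted values are still strings rather than numbers. A minimal sketch of the serial-number part, assuming the serials are stored as strings in the object column (the 'Converted' column name is made up for illustration), might look like this:

```python
import pandas as pd

# Sample values from the question; storing the serials as strings is an assumption.
df = pd.DataFrame({"Excel_Date": ["Before Q1 2018", "42457", "42520", None]})

# Entries that are not purely numeric, such as 'Before Q1 2018', become NaN.
serials = pd.to_numeric(df["Excel_Date"], errors="coerce")

# Same 1899-12-30 origin as in the attempt above; NaN serials turn into NaT.
df["Converted"] = pd.Timestamp("1899-12-30") + pd.to_timedelta(serials, unit="D")
```

Because the whole cell is coerced rather than a regex fragment, the text rows stay missing in 'Converted' instead of being turned into dates near 1900.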
Many, many thanks in advance!