I have CSV output from thermal simulations that I would like to analyse with pandas.
Having imported the CSV into a pandas DataFrame, I would like to clean up the timestamp column and parse it as a datetime.
The starting format is the following (it starts with a space, uses the US date format, and the year is missing).
' 05/01 01:00:00'
' 05/01 02:00:00'
' 05/01 03:00:00'
' 05/01 04:00:00'
' 05/01 05:00:00'
I was advised to address it with a loop, which I wrote as follows:
timestamp = []
for ns in raw_datetime:
    #timestamp.append(ns[5:7] + '.' + ns[2:4] + '_' + ns[9:11] + '00h')
    timestamp.append('2016' + '/' + ns[2:4] + '/' + ns[5:7] + '_' + ns[9:11] + ':00')
where
raw_datetime = df[' Date/Time'] #original data column
This works fine and returns the datetime format I want.
['2016/05/01_01:00', '2016/05/01_02:00', '2016/05/01_03:00', '2016/05/01_04:00']
However, this does not appear to be usable by the pd.to_datetime function, which seems to require a list rather than a Series(?).
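As a side note, pd.to_datetime takes list-likes and Series alike, so the container type is probably not what blocks the conversion. A minimal sketch, assuming strings shaped like the loop output above (the names from_list and from_series are only illustrative):

```python
import pandas as pd

# Strings in the same shape as the loop output above
timestamp = ['2016/05/01_01:00', '2016/05/01_02:00', '2016/05/01_03:00']

# An explicit format tells pandas how to read the '_' separator
fmt = '%Y/%m/%d_%H:%M'

# A plain list gives back a DatetimeIndex
from_list = pd.to_datetime(timestamp, format=fmt)

# A Series gives back a Series of datetimes
from_series = pd.to_datetime(pd.Series(timestamp), format=fmt)
```

The explicit format matters here because of the underscore between date and time, which pandas is unlikely to guess on its own.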
I came across the concept of parsing and functions such as:
raw_datetime.str.extract('string', expand=True)
However, I am not sure how I could do that while flipping month and day AND adding the year 2016, which is not present in the raw data.
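One way the str.extract idea might look, as a rough sketch. It assumes the raw strings have two spaces before the date and before the time, and that every row belongs to 2016; the sample Series and the names parts, rebuilt and parsed are made up for illustration:

```python
import pandas as pd

# Sample values standing in for df[' Date/Time'] (assumed layout)
raw_datetime = pd.Series(['  05/01  01:00:00', '  05/01  02:00:00'])

# Named groups become the column names of the extracted DataFrame
parts = raw_datetime.str.extract(
    r'(?P<month>\d{2})/(?P<day>\d{2})\s+(?P<hour>\d{2}):(?P<minute>\d{2})',
    expand=True,
)

# Reassemble with the assumed year; swap the pieces here to reorder them
rebuilt = ('2016/' + parts['month'] + '/' + parts['day']
           + '_' + parts['hour'] + ':' + parts['minute'])

# Parse the rebuilt strings with a matching format
parsed = pd.to_datetime(rebuilt, format='%Y/%m/%d_%H:%M')
```

Because each piece ends up in its own column, changing the order of month and day or adding the year is just string concatenation.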
Thanks!
Edit: code added below. N.B. the native format is '  05/01  01:00:00' (i.e. double space, month/day, double space, hh:mm:ss).
First attempt:
df = pd.read_csv('./SimResults.csv')
a = pd.to_datetime(df[' Date/Time'], format=' %m/%d %H:%M:%s')
Second attempt:
df = pd.read_csv('./SimResults.csv')
raw_datetime = df[' Date/Time'].str.lstrip(' ')
raw_datetime = ('2016/') + raw_datetime
b = pd.to_datetime(raw_datetime, format='%Y/%m/%d %H:%M:%S')
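A possible variant of the second attempt that normalises the spacing first, sketched under the assumption that the column holds strings like '  05/01  01:00:00' and that all rows are from 2016. The Series raw stands in for df[' Date/Time'], and cleaned and stamps are illustrative names. Note that the seconds directive is the uppercase %S; the lowercase %s used in the first attempt is not one of the standard strptime directives.

```python
import pandas as pd

# Stand-in for df[' Date/Time']; the real column would come from read_csv
raw = pd.Series(['  05/01  01:00:00', '  05/01  02:00:00'])

# Trim outer whitespace and collapse the inner run of spaces to one
cleaned = raw.str.strip().str.replace(r'\s+', ' ', regex=True)

# Prepend the assumed year so the strings match the format below
stamps = pd.to_datetime('2016/' + cleaned, format='%Y/%m/%d %H:%M:%S')
```

Collapsing the whitespace up front means the format string only has to describe a single, predictable layout.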
pd.to_datetime(df['your_new_col']) should work
parse_dates=['Date/Time'], then it will automatically append 2016 to the date.