
I have a csv output of thermal simulations, which I would like to perform data analysis on, using pandas.

Having imported the csv into a pandas dataframe, I would like to clean up the timestamp column and have it parsed as a datetime.

The starting format is the following (it starts with a space, uses the US date format, and the year is missing).

    ' 05/01  01:00:00'
    ' 05/01  02:00:00'
    ' 05/01  03:00:00'
    ' 05/01  04:00:00'
    ' 05/01  05:00:00'

I was advised to address it with a loop, which I wrote as follows:

timestamp = []
for ns in raw_datetime:
    #timestamp.append(ns[5:7] + '.' + ns[2:4] + '_' + ns[9:11] + '00h')
    timestamp.append('2016' + '/' + ns[2:4] + '/' + ns[5:7] + '_' + ns[9:11] + ':00')

where

raw_datetime = df[' Date/Time']  #original data column

This works fine and returns the datetime format I want.

['2016/05/01_01:00', '2016/05/01_02:00', '2016/05/01_03:00', '2016/05/01_04:00']

However, this does not appear to be usable by the pd.to_datetime function, as that seems to require a list rather than a series(?).

I came across the concept of parsing and functions such as:

 raw_datetime.str.extract('string', expand=True)

However, I am not sure how I could do that while flipping month and day AND adding the year 2016, which is not present in the raw data.

Thanks!

Edit: code added below. N.B. the native format is ' 05/01 01:00:00' (i.e. double space, month, day, double space, hh, mm, ss).

First attempt:

df = pd.read_csv('./SimResults.csv')
a = pd.to_datetime(df[' Date/Time'], format='  %m/%d  %H:%M:%s')

Second attempt:

df = pd.read_csv('./SimResults.csv')
raw_datetime = df[' Date/Time'].str.lstrip('  ')
raw_datetime = ('2016/') + raw_datetime   
b = pd.to_datetime(raw_datetime, format='%Y/%m/%d  %H:%M:%S')
  • It should work on a series as well; pd.to_datetime(df['your_new_col']) should work. Commented Jun 20, 2016 at 12:40
  • When reading the csv file, use parse_dates=['Date/Time']; then it will automatically append 2016 to the date. Commented Jun 20, 2016 at 13:13
  • Sorry guys, but none of these seem to work! Commented Jun 20, 2016 at 13:55

1 Answer

You should pass your format explicitly to the to_datetime function, because it isn't a default format:

pd.to_datetime(x, format='%Y/%m/%d_%H:%M')
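
As a minimal sketch of that call, assuming the strings are already in the reformatted shape produced by the loop in the question (the variable names here are only illustrative), the same format string can be used whether the input is a plain list or a Series:

```python
import pandas as pd

# Strings in the shape produced by the loop in the question
timestamp = ['2016/05/01_01:00', '2016/05/01_02:00',
             '2016/05/01_03:00', '2016/05/01_04:00']

# to_datetime accepts a plain list (the result is a DatetimeIndex) ...
from_list = pd.to_datetime(timestamp, format='%Y/%m/%d_%H:%M')

# ... and a Series (the result is a Series of datetimes)
from_series = pd.to_datetime(pd.Series(timestamp), format='%Y/%m/%d_%H:%M')

print(from_series)
```

Note that this format only matches the reformatted strings, not the raw column values, which contain no year and no underscore.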

4 Comments

I have specified the format in my 'to_datetime' call, however I receive the error: "time data ' 05/01 01:00:00' does not match format '%Y/%m/%d_%H:%M'". The string needs to be manipulated before it can be read by the 'to_datetime' command, doesn't it?
No, there is no need for any additional manipulation. Maybe there is a typo in the format? You can check the options here: docs.python.org/2/library/….
Eugene, it does not seem to work, I am afraid. I have tried so many options and I don't know what else to do. Would you be happy to check my code?
I have added two bits of code to the question. Thanks
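
For the raw values quoted in the error message, one possible approach is to normalise the whitespace and prepend the year before parsing. This is only a sketch: it assumes every row looks like the samples in the question and that the year is 2016. The sample Series below stands in for df[' Date/Time'].

```python
import pandas as pd

# Sample values copied from the question; the real data would be df[' Date/Time']
raw_datetime = pd.Series([' 05/01  01:00:00',
                          ' 05/01  02:00:00',
                          ' 05/01  03:00:00'])

# split() with no argument splits on any run of whitespace and ignores
# leading blanks, so joining with one space gives 'MM/DD HH:MM:SS'
normalised = raw_datetime.str.split().str.join(' ')

# Prepend the year, which is missing from the raw data (2016 is an assumption)
with_year = '2016/' + normalised

# The format now describes the cleaned string exactly
parsed = pd.to_datetime(with_year, format='%Y/%m/%d %H:%M:%S')

print(parsed)
```

If a mismatch error still appears, printing repr() of one raw value shows the exact whitespace that the format has to account for.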
