How can I combine the existing columns of this pandas DataFrame into a new column that holds one datetime object per row?

I would like to avoid a for loop (and, if possible, a lambda as well) for performance reasons.

import pandas as pd
df = pd.DataFrame({"century": [20, 20, 20], "my_date": [180105, 180106, 180107],
                   "my_time": ["17:01", "17:02", "17:03"]})

       century  my_date my_time
    0       20   180105   17:01
    1       20   180106   17:02
    2       20   180107   17:03

2 Answers

Use to_datetime on the concatenation of all three columns, and pass an explicit format (see http://strftime.org/ for the directives) to improve performance:

df['date'] = pd.to_datetime(df['century'].astype(str) + 
                            df['my_date'].astype(str) + 
                            df['my_time'], format='%Y%m%d%H:%M')

print(df)
   century  my_date my_time                date
0       20   180105   17:01 2018-01-05 17:01:00
1       20   180106   17:02 2018-01-06 17:02:00
2       20   180107   17:03 2018-01-07 17:03:00
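
One caveat with plain string concatenation: because my_date is stored as an integer, a date in the first decade of a century loses its leading zero (2005-01-05 becomes 50105), and the joined string can then fail to parse or parse to a wrong date. A minimal sketch of one way to guard against that, assuming my_date is always meant to be six digits (YYMMDD); the sample row with 50105 is hypothetical and not part of the question's data:

```python
import pandas as pd

# Hypothetical data: the first row stands for 2005-01-05, stored as the
# integer 50105, so the leading zero of the two-digit year is gone.
df = pd.DataFrame({"century": [20, 20],
                   "my_date": [50105, 180106],
                   "my_time": ["17:01", "17:02"]})

# Pad my_date back to six digits (YYMMDD) before joining the pieces.
date_str = (df["century"].astype(str)
            + df["my_date"].astype(str).str.zfill(6)
            + df["my_time"])

# %Y%m%d consumes the eight date digits, %H:%M the time part.
df["date"] = pd.to_datetime(date_str, format="%Y%m%d%H:%M")
print(df)
```

If every value in my_date already has six digits, as in the question, the zfill call changes nothing and can be dropped.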

You can build a Series of strings in a format that pandas can parse, then pass it to pd.to_datetime.

df = pd.DataFrame({"century": [20, 20, 20], "my_date": [180105, 180106, 180107],
                   "my_time": ["17:01", "17:02", "17:03"]})

date_str = (df['century']*10**6 + df['my_date']).astype(str) + ' ' + df['my_time']

df['dt'] = pd.to_datetime(date_str)

print(df)

   century  my_date my_time                  dt
0       20   180105   17:01 2018-01-05 17:01:00
1       20   180106   17:02 2018-01-06 17:02:00
2       20   180107   17:03 2018-01-07 17:03:00
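
To make the arithmetic step explicit: multiplying century by 10**6 shifts it six digits to the left, so adding my_date yields an integer of the form YYYYMMDD. The sketch below is a variant of the code above that also passes an explicit format to pd.to_datetime. That argument is an addition of mine, not part of the original answer; it avoids relying on format inference, which may behave differently across pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"century": [20, 20, 20],
                   "my_date": [180105, 180106, 180107],
                   "my_time": ["17:01", "17:02", "17:03"]})

# 20 * 10**6 + 180105 == 20180105, i.e. an integer shaped like YYYYMMDD.
ymd = df["century"] * 10**6 + df["my_date"]

# Join date and time with a space: "20180105 17:01"
date_str = ymd.astype(str) + " " + df["my_time"]

# Explicit format (an assumption added here; the answer lets pandas infer it).
df["dt"] = pd.to_datetime(date_str, format="%Y%m%d %H:%M")
print(df)
```

Because the date part is assembled numerically, this variant does not depend on my_date keeping its leading zeros.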
