Parsing dates from two different rows in Pandas

Question

I have dates and timestamps in two different rows of a Pandas DataFrame. Any pointers on how to parse the date and time together into a single row that can then be used for time-series analysis?

e.g.

Row 1 Date            2017-12-11 00:00:00   2017-12-11 00:00:00     2017-12-11 00:00:00     2017-12-11 00:00:00 

Row 2 Timestamp             01:00:00              02:00:00                03:00:00              04:00:00

and then some more rows having more data

Can Row 1 and Row 2 be combined so that each entry holds the complete date and time information?

I was thinking of transposing the DataFrame and then using parse_dates on the columns. Is there a more direct way of doing that in Python?

jezrael · Accepted Answer · 2018-08-02 10:28:52Z

1

I think it is best to transpose the DataFrame first, so that the rows become columns and each column holds values of a single dtype:

df = df.T

Then convert the Date column with to_datetime and add the Time column converted with to_timedelta:

df['dates'] = pd.to_datetime(df['Date']) + pd.to_timedelta(df['Time'])

answered Aug 2, 2018 at 10:28

jezrael


4 Comments

Matthias Over a year ago

the only problem is, that there is more data in other rows. Transpose is a good idea if the dataframe holds only dates and timestamps.

jezrael Over a year ago

@Matthias - Yes, it depends on the data.

Mandy Over a year ago

pd.to_timedelta(df['Time']) - when I apply this to my DataFrame after transposing, it throws the errors below. Any suggestions for getting rid of them?
TypeError: object of type 'datetime.time' has no len()
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)

jezrael Over a year ago

@Mandy - How does pd.to_datetime(df['Date']) + pd.to_timedelta(df['Time'].astype(str)) work?
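For the concern raised in the comments above, that other rows hold more data, here is a minimal sketch of one possible alternative that avoids transposing: combine the two label rows into timestamps and use them as column labels. The frame, the row names 'Date', 'Time', 'Data1', 'Data2', and the variable names stamps and wide are assumptions made for illustration, not taken from the asker's real data.

```python
import pandas as pd

# Hypothetical frame laid out like the question: dates and times in rows,
# with further data in the rows below.
df = pd.DataFrame(
    [['2017-12-11 00:00:00', '2017-12-11 00:00:00'],
     ['01:00:00', '02:00:00'],
     [1, 2],
     [5, 6]],
    index=['Date', 'Time', 'Data1', 'Data2'])

# Combine the two label rows element-wise into full timestamps.
stamps = pd.to_datetime(df.loc['Date']) + pd.to_timedelta(df.loc['Time'].astype(str))

# Keep only the data rows and use the timestamps as column labels.
wide = df.drop(index=['Date', 'Time'])
wide.columns = pd.DatetimeIndex(stamps)

print(wide)
```

This keeps the original orientation, but most pandas time-series tools expect the datetimes on the row index, so transposing is likely still the more convenient route for later analysis.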

jpp · Accepted Answer · 2018-08-02 21:30:26Z

0

If you are working with time series data, I strongly suggest you make the datetime component your index. You may find operations more efficient when your dataframe has an index containing non-duplicated values.

Your idea of transposing the dataframe first is good. Here's a minimal example:

import pandas as pd

df = pd.DataFrame([['2017-12-11 00:00:00', '2017-12-11 00:00:00',
                    '2017-12-11 00:00:00', '2017-12-11 00:00:00'],
                   ['01:00:00', '02:00:00', '03:00:00', '04:00:00'],
                   [1, 2, 3, 4], [5, 6, 7, 8]],
                  index=['Date', 'Timestamp', 'Data1', 'Data2'])

# Turn the rows into columns so that Date and Timestamp become columns.
df = df.T
# pop removes each column from df; their sum becomes the new index.
df.index = pd.to_datetime(df.pop('Date')) + pd.to_timedelta(df.pop('Timestamp'))

Resulting dataframe:

print(df)

                    Data1 Data2
2017-12-11 01:00:00     1     5
2017-12-11 02:00:00     2     6
2017-12-11 03:00:00     3     7
2017-12-11 04:00:00     4     8

You now have a DatetimeIndex:

print(df.index)

DatetimeIndex(['2017-12-11 01:00:00', '2017-12-11 02:00:00',
               '2017-12-11 03:00:00', '2017-12-11 04:00:00'],
              dtype='datetime64[ns]', freq=None)
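As a rough sketch of what the DatetimeIndex then allows, the example below rebuilds the same frame, converts the data columns to integers, selects a time window by label, and resamples into 2-hour bins. The names window and two_hourly and the choice of a 2-hour bin are assumptions for illustration. The dtype conversion is included because transposing rows of mixed types leaves the data columns as object dtype.

```python
import pandas as pd

df = pd.DataFrame([['2017-12-11 00:00:00'] * 4,
                   ['01:00:00', '02:00:00', '03:00:00', '04:00:00'],
                   [1, 2, 3, 4], [5, 6, 7, 8]],
                  index=['Date', 'Timestamp', 'Data1', 'Data2'])
df = df.T
df.index = pd.to_datetime(df.pop('Date')) + pd.to_timedelta(df.pop('Timestamp'))

# Transposing mixed rows leaves the data columns as object dtype,
# so convert them before doing arithmetic.
df = df.astype('int64')

# Select a time window by label (both endpoints are included).
window = df.loc['2017-12-11 02:00:00':'2017-12-11 03:00:00']

# Aggregate the values into 2-hour bins.
two_hourly = df.resample('2h').sum()

print(window)
print(two_hourly)
```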

edited Aug 2, 2018 at 21:30

answered Aug 2, 2018 at 10:36

jpp


2 Comments

Mandy Over a year ago

pd.to_timedelta works fine with your sample code, but when I use it on my own DataFrame after transposing, it throws: TypeError: object of type 'datetime.time' has no len() During handling of the above exception, another exception occurred: Any idea how to avoid this error?

Mandy Over a year ago

pd.to_timedelta(df['Time'].astype(str)) works for me
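The fix reported in the comments can be sketched as follows, assuming the Time column holds datetime.time objects rather than strings (which may happen, for example, when the data comes from a spreadsheet). The frame here is hypothetical and already in the transposed layout.

```python
import datetime

import pandas as pd

# Hypothetical transposed frame whose Time column holds datetime.time
# objects instead of strings.
df = pd.DataFrame({
    'Date': ['2017-12-11 00:00:00', '2017-12-11 00:00:00'],
    'Time': [datetime.time(1, 0), datetime.time(2, 0)],
})

# str(datetime.time(1, 0)) is '01:00:00', a format to_timedelta can parse,
# so casting the column to str first sidesteps the TypeError.
df['dates'] = pd.to_datetime(df['Date']) + pd.to_timedelta(df['Time'].astype(str))

print(df['dates'])
```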
