I have dates and timestamps in two different rows of a pandas DataFrame. Any pointers on how to combine the date and time into a single row that can then be used for time-series analysis?

e.g.

Row 1 Date            2017-12-11 00:00:00   2017-12-11 00:00:00     2017-12-11 00:00:00     2017-12-11 00:00:00 

Row 2 Timestamp             01:00:00              02:00:00                03:00:00              04:00:00 

and then some more rows having more data

Can Row 1 and Row 2 be combined so that the complete date/time information is held together?

I was thinking of transposing and then using parse_dates on the columns. Is there a more direct way of doing this in Python?

2 Answers

1

I think the best approach is to transpose the DataFrame first, so that the rows become columns and each column holds a single dtype:

df = df.T

Then convert the Date column with to_datetime and add the Time column converted with to_timedelta:

df['dates'] = pd.to_datetime(df['Date']) + pd.to_timedelta(df['Time'])
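If transposing the whole frame is undesirable because other rows hold data, one possible alternative is to combine the two rows directly with .loc and use the result as the column labels. This is only a sketch: the sample frame and the Data1 row are made up for illustration, and the row labels Date and Time are assumed to match the ones used above.

```python
import pandas as pd

# Hypothetical frame laid out like the question: one row of dates,
# one row of times, and a further row of data.
df = pd.DataFrame(
    [["2017-12-11 00:00:00", "2017-12-11 00:00:00"],
     ["01:00:00", "02:00:00"],
     [1, 2]],
    index=["Date", "Time", "Data1"],
)

# Combine the two rows element-wise without transposing the frame.
stamps = pd.to_datetime(df.loc["Date"]) + pd.to_timedelta(df.loc["Time"])

# Drop the two source rows and use the combined stamps as column labels.
df = df.drop(index=["Date", "Time"])
df.columns = pd.DatetimeIndex(stamps)
print(df)
```

The data stays in its original orientation here, so most time-series methods, which expect the datetimes on the index, would still need a transpose later.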

4 Comments

The only problem is that there is more data in other rows. Transposing is a good idea if the DataFrame holds only dates and timestamps.
@Matthias - Yes, it depends on the data.
When I apply pd.to_timedelta(df['Time']) to my DataFrame after transposing, it throws the errors below. Any suggestions for getting rid of them? TypeError: object of type 'datetime.time' has no len() During handling of the above exception, another exception occurred: ValueError Traceback (most recent call last)
@Mandy - How does pd.to_datetime(df['Date']) + pd.to_timedelta(df['Time'].astype(str)) work?
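The error in the comments above suggests that the Time values are datetime.time objects rather than strings. A minimal sketch of the astype(str) workaround follows; the sample frame is hypothetical, and the assumption that the times arrive as datetime.time objects (as can happen with spreadsheet data) is mine rather than something the asker confirmed.

```python
import datetime

import pandas as pd

# Hypothetical, already-transposed frame in which the time cells are
# datetime.time objects instead of strings.
df = pd.DataFrame({
    "Date": ["2017-12-11 00:00:00", "2017-12-11 00:00:00"],
    "Time": [datetime.time(1, 0), datetime.time(2, 0)],
})

# str(datetime.time(1, 0)) is "01:00:00", a form to_timedelta can parse.
time_offsets = pd.to_timedelta(df["Time"].astype(str))

df["dates"] = pd.to_datetime(df["Date"]) + time_offsets
print(df["dates"])
```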
0

If you are working with time series data, I strongly suggest you make the datetime component your index. You may find operations more efficient when your dataframe has an index containing non-duplicated values.

Your idea of transposing the dataframe first is good. Here's a minimal example:

import pandas as pd

df = pd.DataFrame([['2017-12-11 00:00:00', '2017-12-11 00:00:00',
                    '2017-12-11 00:00:00', '2017-12-11 00:00:00'],
                   ['01:00:00', '02:00:00', '03:00:00', '04:00:00'],
                   [1, 2, 3, 4], [5, 6, 7, 8]],
                  index=['Date', 'Timestamp', 'Data1', 'Data2'])

# Transpose so that Date, Timestamp, Data1 and Data2 become columns.
df = df.T
# pop() removes each column from df; their sum becomes the new index.
df.index = pd.to_datetime(df.pop('Date')) + pd.to_timedelta(df.pop('Timestamp'))

Resulting dataframe:

print(df)

                    Data1 Data2
2017-12-11 01:00:00     1     5
2017-12-11 02:00:00     2     6
2017-12-11 03:00:00     3     7
2017-12-11 04:00:00     4     8

You now have a DatetimeIndex:

print(df.index)

DatetimeIndex(['2017-12-11 01:00:00', '2017-12-11 02:00:00',
               '2017-12-11 03:00:00', '2017-12-11 04:00:00'],
              dtype='datetime64[ns]', freq=None)
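As a rough illustration of what the DatetimeIndex makes possible, the sketch below rebuilds the same frame, converts the data columns to numeric (after a transpose of mixed rows they have dtype object), and then slices and resamples by time. The 2-hour bin width and the variable names window and two_hourly are arbitrary choices for the example.

```python
import pandas as pd

# Rebuild the example frame so this snippet runs on its own.
df = pd.DataFrame(
    [["2017-12-11 00:00:00"] * 4,
     ["01:00:00", "02:00:00", "03:00:00", "04:00:00"],
     [1, 2, 3, 4],
     [5, 6, 7, 8]],
    index=["Date", "Timestamp", "Data1", "Data2"],
).T
df.index = pd.to_datetime(df.pop("Date")) + pd.to_timedelta(df.pop("Timestamp"))

# The transposed data columns are dtype object; convert them to numbers.
df = df.apply(pd.to_numeric)

# A unique index is what the answer recommends; this checks it.
print(df.index.is_unique)

# Label-based slicing with date strings; both endpoints are included.
window = df.loc["2017-12-11 02:00":"2017-12-11 03:00"]

# Downsample into 2-hour bins and sum the values within each bin.
two_hourly = df.resample("2h").sum()
print(window)
print(two_hourly)
```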

2 Comments

pd.to_timedelta works fine with your sample code, but when I use it on my own DataFrame after transposing, it throws a TypeError: object of type 'datetime.time' has no len() During handling of the above exception, another exception occurred: Any idea how to avoid this error?
pd.to_timedelta(df['Time'].astype(str)) works for me
