
I have a CSV file with the Timestamp column shown below. I want to change its format to something like 2013-08-12 10:29:19.673, or to a granularity of one second. Currently the Timestamp column has dtype object.

I could change the format in Excel manually, but the file is too big and some rows would be lost.

        Id          Timestamp Data  Group_Id
0       19929927    00:07.5   27.0  27
1       19929928    00:08.3   26.5  27
2       19929929    00:48.7   33.5  157
3       19929930    00:50.0   33.0  157
4       19929931    00:53.1   35.0  25

                 ...

1048570 20978497    10:11.9   34.5  152
1048571 20978498    10:13.3   34.0  152
1048572 20978499    10:41.2   42.0  138
1048573 20978500    10:42.5   45.0  138
1048574 20978501    10:43.9   44.0  138

  • @jezrael do you know how to do this? Thanks! Commented Oct 25, 2019 at 5:04

1 Answer


EDIT: If you convert times to datetimes without any date information, pandas fills in the current date.

If you need other dates, try this solution:

The idea is to create consecutive dates, moving on to the next day each time the times start again at hour 0:

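import pandas as pd

# keep only the Timestamp column; .copy() avoids SettingWithCopyWarning when columns are added later
df = df[['Timestamp']].copy()
print(df)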
   Timestamp
0    00:08.3 <- first day
1    00:48.7
2    00:50.0
3    00:53.1
4    10:11.9
5    10:13.3
6    10:41.2
7    00:50.0 <- second day
8    00:53.1
9    10:42.5
10   10:43.9
11   00:07.5 <- third day
12   00:08.3
13   10:11.9
14   10:13.3
15   10:43.9

# parse the times and extract the hour, used to detect where each day starts
df['h'] = pd.to_datetime(df['Timestamp']).dt.hour
# True on the first row of each run of hour 0, i.e. the start of a new day
df['mask'] = df['h'].shift().ne(0) & df['h'].eq(0)
# consecutive day groups: start at 1 if the first time has hour 0, otherwise at 0
df['g'] = df['mask'].cumsum()
# convert the group numbers to dates, counted in days from the origin parameter
df['days'] = pd.to_datetime(df['g'], origin='2016-01-01', unit='D')
# add the times to the dates if the format is HH:MM.S
# (regex=False replaces the literal dot instead of treating the pattern as a regex)
df['Timestamp1'] = df['days'] + pd.to_timedelta(df['Timestamp'].str.replace('.', ':', regex=False))
# add the times to the dates if the format has no hours (MM:SS.s)
df['Timestamp2'] = df['days'] + pd.to_timedelta('00:' + df['Timestamp'])

print (df)
   Timestamp   h   mask  g       days          Timestamp1  \
0    00:08.3   0   True  1 2016-01-02 2016-01-02 00:08:03   
1    00:48.7   0  False  1 2016-01-02 2016-01-02 00:48:07   
2    00:50.0   0  False  1 2016-01-02 2016-01-02 00:50:00   
3    00:53.1   0  False  1 2016-01-02 2016-01-02 00:53:01   
4    10:11.9  10  False  1 2016-01-02 2016-01-02 10:11:09   
5    10:13.3  10  False  1 2016-01-02 2016-01-02 10:13:03   
6    10:41.2  10  False  1 2016-01-02 2016-01-02 10:41:02   
7    00:50.0   0   True  2 2016-01-03 2016-01-03 00:50:00   
8    00:53.1   0  False  2 2016-01-03 2016-01-03 00:53:01   
9    10:42.5  10  False  2 2016-01-03 2016-01-03 10:42:05   
10   10:43.9  10  False  2 2016-01-03 2016-01-03 10:43:09   
11   00:07.5   0   True  3 2016-01-04 2016-01-04 00:07:05   
12   00:08.3   0  False  3 2016-01-04 2016-01-04 00:08:03   
13   10:11.9  10  False  3 2016-01-04 2016-01-04 10:11:09   
14   10:13.3  10  False  3 2016-01-04 2016-01-04 10:13:03   
15   10:43.9  10  False  3 2016-01-04 2016-01-04 10:43:09   

                Timestamp2  
0  2016-01-02 00:00:08.300  
1  2016-01-02 00:00:48.700  
2  2016-01-02 00:00:50.000  
3  2016-01-02 00:00:53.100  
4  2016-01-02 00:10:11.900  
5  2016-01-02 00:10:13.300  
6  2016-01-02 00:10:41.200  
7  2016-01-03 00:00:50.000  
8  2016-01-03 00:00:53.100  
9  2016-01-03 00:10:42.500  
10 2016-01-03 00:10:43.900  
11 2016-01-04 00:00:07.500  
12 2016-01-04 00:00:08.300  
13 2016-01-04 00:10:11.900  
14 2016-01-04 00:10:13.300  
15 2016-01-04 00:10:43.900  
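
The steps above can also be wrapped into one function. This is only a minimal sketch, assuming the strings are in MM:SS.s format (the Timestamp2 reading) and that a new day starts each time the leading field drops back to 0. The function name `attach_dates` and the sample values are made up for illustration, and unlike the output above, the first block is placed on the origin date itself rather than one day after it.

```python
import pandas as pd


def attach_dates(times: pd.Series, origin: str = "2016-01-01") -> pd.Series:
    """Combine MM:SS.s strings with a date that advances once per block."""
    # Leading field of each string ("00" in "00:08.3") as an integer.
    lead = times.str.split(":").str[0].astype(int)
    # True where a run of zeros begins, which is assumed to mark a new day.
    new_day = lead.eq(0) & lead.shift().ne(0)
    # Day index: 0 for the first block, 1 for the second, and so on.
    day = new_day.cumsum() - int(new_day.iloc[0])
    dates = pd.Timestamp(origin) + pd.to_timedelta(day, unit="D")
    # Prefix a zero hour so the strings parse as HH:MM:SS.s timedeltas.
    offsets = pd.to_timedelta("00:" + times)
    return dates + offsets


# Hypothetical sample taken from the values shown above.
sample = pd.Series(["00:08.3", "00:48.7", "10:11.9", "00:50.0", "10:42.5"])
full = attach_dates(sample)

# Round to the nearest second for one-second granularity.
per_second = full.dt.round("s")
print(pd.DataFrame({"Timestamp": sample, "full": full, "per_second": per_second}))
```

If the fractional part should be dropped rather than rounded, `full.dt.floor("s")` can be used in place of `full.dt.round("s")`.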

Comments

I just realised that the interpreted dates are wrong: it's not 2019, it might be 2016/2017.
@nilsinelabore - What is the logic for setting 2016 and 2017?
@nilsinelabore - The answer was edited, please check it.
Thank you. Sorry, I must have forgotten to mention that this CSV file was exported from SQL Server, and it looks like the dates in this link: stackoverflow.com/questions/18598075/… I am not sure exactly how it is done, but I think there is a particular way that SQL/Excel interprets the dates. For example, 00:07.5, 00:08.3, 00:48.7 = 1/12/2015 12:00:07 am, 1/12/2015 12:00:08 am, 1/12/2015 12:00:49 am
I got it sorted out in python. Thank you:)