I'm giving pandas an int like this: 01142021223007, and the format is '%m%d%Y%H%M%S'. This worked perfectly in 2020. For example:
12192020032906 -> 2020-12-19 03:29:06
Since 2021 it is giving the wrong date:
01142021223007 -> 2021-11-04 22:30:07
Should be 2021-01-14 22:30:07
Code:
self.df['time'] = pd.to_datetime(self.df['time'], format='%m%d%Y%H%M%S', errors='coerce')
I assume it just skips the 0 at the beginning of 01142021 and therefore reads it as 11 4 2021. Is there a way to explicitly say MMDDYYYY? format='%mm%dd%YYYY%HH%MM%SS' does not work.
The CSV file I am reading from:
hum,moist,temp,time
81.1,40,26.30,12192020032906
83.1,38,25.80,12192020033006
85.6,39,25.30,12192020033106
87.3,38,24.90,12192020033206
89.4,38,24.50,12192020033306
90.2,38,24.20,12192020033407
90.9,39,23.90,12192020033506
91.5,38,23.70,12192020033607
92.2,38,23.40,12192020033706
...
57.0,15,25.60,01142021095906
53.6,47,24.30,01142021222407
53.7,44,24.30,01142021222419
54.1,45,24.30,01142021222540
54.9,43,24.30,01142021222706
55.2,43,24.20,01142021222806
55.5,44,24.20,01142021222906
55.7,43,24.20,01142021223007
The resulting pandas df:
hum moist temp time
0 44.605 40 25.3 2020-12-19 03:29:06
1 45.705 38 24.8 2020-12-19 03:30:06
2 47.080 39 24.3 2020-12-19 03:31:06
3 48.015 38 23.9 2020-12-19 03:32:06
4 49.170 38 23.5 2020-12-19 03:33:06
... ... ... ... ...
22387 29.755 45 23.3 2021-11-04 22:25:40
22388 30.195 43 23.3 2021-11-04 22:27:06
22389 30.360 43 23.2 2021-11-04 22:28:06
22390 30.525 44 23.2 2021-11-04 22:29:06
22391 30.635 43 23.2 2021-11-04 22:30:07
Pass dtype={'time': 'str'} to pd.read_csv and that should solve your problem. Without that, pandas tries to be smart and casts that column to int because all of its values look numeric, which drops the leading zero.
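A minimal sketch of that fix, using two rows from the CSV in the question loaded from an in-memory string instead of a file (the names csv_text, df and df_int are only for illustration). The second half shows a possible alternative that is not part of the advice above: if the column has already been read as int, padding it back to 14 digits with str.zfill should restore the lost zero.

```python
import io

import pandas as pd

# Two rows taken from the CSV in the question; a real script would pass a file path.
csv_text = """hum,moist,temp,time
81.1,40,26.30,12192020032906
55.7,43,24.20,01142021223007
"""

# Read 'time' as a string so the leading zero of the month is preserved.
df = pd.read_csv(io.StringIO(csv_text), dtype={"time": "str"})
df["time"] = pd.to_datetime(df["time"], format="%m%d%Y%H%M%S", errors="coerce")
print(df["time"])

# Alternative (an assumption, not from the answer): the column was already
# parsed as int, so pad it back to 14 digits before converting.
df_int = pd.read_csv(io.StringIO(csv_text))
padded = df_int["time"].astype(str).str.zfill(14)
df_int["time"] = pd.to_datetime(padded, format="%m%d%Y%H%M%S", errors="coerce")
print(df_int["time"])
```

Reading the column as a string at load time is the simpler of the two, since the original text never gets converted to a number in the first place.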