Python datetime formatted string to datetime object

Question

I have a series of CSVs with a column containing a Python datetime-formatted string. Whilst parsing the CSV files (which could be tens of thousands of rows long), I want the date column to be converted from a string to an actual datetime object.

An example CSV row:

['0', '(2011, 12, 11, 15, 45, 20)', 'Arduino/libraries/dallas-temperature-control/'],

As you can see, the date is represented in the CSV in datetime format, but as a string.

I am looking for a fast way to build the datetime object without resorting to running it through datetime.strptime(row[1], "(%Y, %m, %d, %H, %M, %S)") - it seems counter-intuitive to have to interpret the date with strptime when it's ready to drop in as-is.

You could ast.literal_eval the string to a tuple of integers, then unpack that straight into the datetime constructor: datetime.datetime(*literal_eval(row[1]))
– jonrsharpe, Commented Sep 7, 2015 at 9:09
Is this pandas? Also, wouldn't it make sense to store the str in a format that can be more easily parsed?
– EdChum, Commented Sep 7, 2015 at 9:09
@EdChum - The CSV output comes straight from ZipFile.infolist(), so there's no intermediary interpreter at the moment.
– Karl M.W., Commented Sep 7, 2015 at 9:12
Perfect, thanks @jonrsharpe. Do you want to add that as an answer so I can accept it?
– Karl M.W., Commented Sep 7, 2015 at 9:15
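
Since the comments mention that the values originate from ZipFile.infolist(), one possible shortcut (a sketch, assuming the ZipInfo objects are still available rather than only the CSV) is to skip the string round trip altogether: ZipInfo.date_time is already a tuple of six integers. The in-memory archive, file name, and timestamp below are made up for illustration.

```python
import datetime
import io
import zipfile

# Build a tiny in-memory archive so the example is self-contained.
# "example.txt" and its timestamp are hypothetical illustration values.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    info = zipfile.ZipInfo("example.txt", date_time=(2011, 12, 11, 15, 45, 20))
    archive.writestr(info, "hello")

# ZipInfo.date_time is a (year, month, day, hour, minute, second) tuple,
# so it can be unpacked straight into the datetime constructor.
with zipfile.ZipFile(buffer) as archive:
    timestamps = [datetime.datetime(*entry.date_time) for entry in archive.infolist()]
```

If only the CSV is available, the string still has to be parsed, as the answers below discuss.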

Community · Accepted Answer · 2017-05-23 10:26:34Z

4

You can use ast.literal_eval to convert the string to a tuple of integers:

>>> import ast
>>> ast.literal_eval('(2011, 12, 11, 15, 45, 20)')
(2011, 12, 11, 15, 45, 20)

You can then unpack this (see e.g. What does ** (double star) and * (star) do for parameters?) straight into the datetime constructor:

>>> import datetime
>>> datetime.datetime(*ast.literal_eval('(2011, 12, 11, 15, 45, 20)'))
datetime.datetime(2011, 12, 11, 15, 45, 20)
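
Applied to a whole file, the same idea might look like the sketch below. It assumes the date sits in column 1 and that the tuple string is quoted in the CSV (it contains commas); SAMPLE_CSV and parse_rows are hypothetical names used only for this illustration.

```python
import ast
import csv
import datetime
import io

# Hypothetical CSV content; the tuple column is quoted because it contains commas.
SAMPLE_CSV = '0,"(2011, 12, 11, 15, 45, 20)",Arduino/libraries/dallas-temperature-control/\n'


def parse_rows(csv_file):
    """Yield rows with column 1 converted from a tuple string to a datetime."""
    for row in csv.reader(csv_file):
        # literal_eval only evaluates Python literals; it does not run arbitrary code.
        row[1] = datetime.datetime(*ast.literal_eval(row[1]))
        yield row


rows = list(parse_rows(io.StringIO(SAMPLE_CSV)))
```

Because parse_rows is a generator, a large file can be processed row by row without loading everything into memory first.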

edited May 23, 2017 at 10:26

CommunityBot


answered Sep 7, 2015 at 9:16

jonrsharpe


Anand S Kumar · Accepted Answer · 2015-09-07 09:42:09Z

4

As @jonrsharpe has said in his answer, you can use ast.literal_eval to convert the string to a tuple and then unpack it into the datetime constructor.

But based on the following tests, it seems like the faster method would still be to use datetime.datetime.strptime(). Example -

Code -

import datetime
import ast

def func1(datestring):
    return datetime.datetime(*ast.literal_eval(datestring))

def func2(datestring):
    return datetime.datetime.strptime(datestring, '(%Y, %m, %d, %H, %M, %S)')

Timing information -

In [39]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 30.1 µs per loop

In [40]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 26.9 µs per loop

In [41]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 38.6 µs per loop

In [42]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 28.8 µs per loop

In [43]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 31.2 µs per loop

In [44]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 29.5 µs per loop

In [45]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
The slowest run took 5.51 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 32.6 µs per loop

In [46]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
The slowest run took 15.42 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 27.5 µs per loop

In [47]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 49.2 µs per loop

In [48]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 24.4 µs per loop
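
The figures above come from IPython's %timeit. Outside IPython, a comparable measurement can be sketched with the standard-library timeit module. The function names, repeat count, and loop count below are arbitrary choices for this illustration, and absolute numbers will vary by machine and Python version, so treat the output as indicative only.

```python
import ast
import datetime
import timeit

SAMPLE = "(2011, 12, 11, 15, 45, 20)"


def with_literal_eval(datestring):
    # Parse the string as a tuple literal, then unpack it into datetime().
    return datetime.datetime(*ast.literal_eval(datestring))


def with_strptime(datestring):
    # Parse the string directly against a matching format.
    return datetime.datetime.strptime(datestring, "(%Y, %m, %d, %H, %M, %S)")


def best_time(func, repeat=3, number=1000):
    """Return the best observed time per call, in seconds."""
    timer = timeit.Timer(lambda: func(SAMPLE))
    return min(timer.repeat(repeat=repeat, number=number)) / number


results = {f.__name__: best_time(f) for f in (with_literal_eval, with_strptime)}
for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.1f} µs per call")
```

Taking the minimum over several repeats mirrors the "best of 3" reporting used by %timeit in the output above.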

I'm not sure where you got the idea that datetime.datetime.strptime() is counter-intuitive, but for parsing strings into datetime objects I would say you should use strptime().

edited Sep 7, 2015 at 9:42

answered Sep 7, 2015 at 9:24

Anand S Kumar


3 Comments

jonrsharpe Over a year ago

Good answer, but why not just time the actual string -> datetime bit, rather than complicating things with the CSV file?

Anand S Kumar Over a year ago

Good suggestion, would do that. Thank you.

Karl M.W. Over a year ago

Very interesting insight, Anand, thank you. Whilst the gains are marginal on small files, on larger lists this could make a real difference.
