I have a series of CSVs with a column containing a Python datetime-formatted string. Whilst parsing the CSV files (which could be tens of thousands of rows long), I want the date column to be converted from a string to an actual datetime object.

An example CSV row:

['0', '(2011, 12, 11, 15, 45, 20)', 'Arduino/libraries/dallas-temperature-control/'],

As you can see, the date is represented in the CSV in datetime format, but as a string.

I am looking for a fast way to build the datetime object without resorting to running it through datetime.strptime(row[1], "(%Y, %m, %d, %H, %M, %S)") - it seems counter-intuitive to have to interpret the date with strptime when it's ready to drop in as-is.

  • You could ast.literal_eval the string to a tuple of integers, then unpack that straight into the datetime constructor: datetime.datetime(*literal_eval(row[1])) Commented Sep 7, 2015 at 9:09
  • Is this pandas? Also, wouldn't it make sense to store the string in a format that can be parsed more easily? Commented Sep 7, 2015 at 9:09
  • @EdChum - The CSV output comes straight from ZipFile.infolist(), so there's no intermediary interpreter at the moment. Commented Sep 7, 2015 at 9:12
  • Perfect, thanks @jonrsharpe. Do you want to add that as an answer so I can accept it? Commented Sep 7, 2015 at 9:15

2 Answers

You can use ast.literal_eval to convert the string to a tuple of integers:

>>> import ast
>>> ast.literal_eval('(2011, 12, 11, 15, 45, 20)')
(2011, 12, 11, 15, 45, 20)

You can then unpack this (see e.g. What does ** (double star) and * (star) do for parameters?) straight into the datetime constructor:

>>> import datetime
>>> datetime.datetime(*ast.literal_eval('(2011, 12, 11, 15, 45, 20)'))
datetime.datetime(2011, 12, 11, 15, 45, 20)
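
To apply this while reading a file, the conversion can be done row by row as the csv module yields each record. A minimal sketch, assuming the date sits in the second column as in the example row; parse_rows and the in-memory sample are hypothetical names used only for illustration, not part of the original question:

```python
import ast
import csv
import datetime
import io


def parse_rows(lines, date_column=1):
    """Yield CSV rows with the date column converted to a datetime object."""
    for row in csv.reader(lines):
        # literal_eval turns "(2011, 12, 11, 15, 45, 20)" into a tuple of ints,
        # which is then unpacked as positional arguments to datetime().
        row[date_column] = datetime.datetime(*ast.literal_eval(row[date_column]))
        yield row


# Hypothetical in-memory stand-in for a real CSV file.
sample = io.StringIO(
    '0,"(2011, 12, 11, 15, 45, 20)",Arduino/libraries/dallas-temperature-control/\n'
)
rows = list(parse_rows(sample))
print(rows[0][1])
```

Because parse_rows is a generator, a large file can be processed one row at a time instead of being loaded in full.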

As @jonrsharpe said in his answer, you can use ast.literal_eval to convert the string to a tuple and then unpack it into the datetime constructor.

However, in the tests below, datetime.datetime.strptime() was still the faster method in every run (roughly 24 to 30 µs per call, against 30 to 49 µs for the literal_eval approach).

Code -

import datetime
import ast

def func1(datestring):
    return datetime.datetime(*ast.literal_eval(datestring))

def func2(datestring):
    return datetime.datetime.strptime(datestring, '(%Y, %m, %d, %H, %M, %S)')

Timing information -

In [39]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 30.1 µs per loop

In [40]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 26.9 µs per loop

In [41]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 38.6 µs per loop

In [42]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 28.8 µs per loop

In [43]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 31.2 µs per loop

In [44]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 29.5 µs per loop

In [45]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
The slowest run took 5.51 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 32.6 µs per loop

In [46]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
The slowest run took 15.42 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 27.5 µs per loop

In [47]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 49.2 µs per loop

In [48]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 24.4 µs per loop
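
The same comparison can be run outside IPython with the standard timeit module. This is only a sketch: best_time_us is a hypothetical helper, the lambda wrapper adds a small constant overhead to both measurements, and absolute numbers will vary by machine and Python version, so no expected output is shown.

```python
import ast
import datetime
import timeit

SAMPLE = "(2011, 12, 11, 15, 45, 20)"


def func1(datestring):
    # Parse the tuple literal, then unpack it into the constructor.
    return datetime.datetime(*ast.literal_eval(datestring))


def func2(datestring):
    # Parse the same string with an explicit format.
    return datetime.datetime.strptime(datestring, "(%Y, %m, %d, %H, %M, %S)")


def best_time_us(func, number=10000, repeat=3):
    """Return the best per-call time in microseconds over `repeat` runs."""
    timings = timeit.repeat(lambda: func(SAMPLE), number=number, repeat=repeat)
    return min(timings) / number * 1e6


results = {f.__name__: best_time_us(f) for f in (func1, func2)}
for name, micros in results.items():
    print(f"{name}: {micros:.1f} us per call")
```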

I'm not sure where you got the idea that datetime.datetime.strptime() is counter-intuitive, but for parsing strings into datetime objects I would say you should use strptime().

3 Comments

Good answer, but why not just time the actual string -> datetime bit, rather than complicating things with the CSV file?
Good suggestion, would do that. Thank you.
Very interesting insight, Anand, thank you. Whilst the gains are marginal on small files, on larger lists this could be a significant optimisation.
