29

I have a field that comes in as a string and represents a time. Sometimes it's in 12-hour format, sometimes in 24-hour format. Possible values:

  1. 8:26
  2. 08:26am
  3. 13:27

Is there a function that will convert these to a time object by being smart about the format? Option 1 doesn't have "am" because it's in 24-hour format, option 2 has a leading 0, and option 3 is obviously in 24-hour format. Is there a function in Python, or in a library, that does this:

time = func(str_time)
2
  • related: Converting string into datetime Commented Jun 26, 2015 at 13:17
  • +1 for the specific focus on "format not known" (i.e. confusion between e.g. dd/mm and mm/dd is not a concern). If it was known, dateutil would be an unreliable choice. Commented Jun 26, 2015 at 16:51

3 Answers

48

Super short answer:

from dateutil import parser
# The date fields default to the current date (June 26, 2015 when this was run).
parser.parse("8:36pm")
# datetime.datetime(2015, 6, 26, 20, 36)
parser.parse("18:36")
# datetime.datetime(2015, 6, 26, 18, 36)

dateutil is a third-party package (python-dateutil on PyPI) that should be installable for your Python installation; there is no need for something as large as pandas.

If you want to extract the time from the datetime object:

t = parser.parse("18:36").time()

This gives you a time object, if that's of more help to you. Or you can extract individual fields:

dt = parser.parse("18:36")
hour = dt.hour
minute = dt.minute

13 Comments

Can I do this without dateutil? Problem is I'm running on Google App Engine, and using libraries outside of Python STL is an issue.
@DebnathSinha: Python STL? What's that? Also, if you know there are only three types of input and don't want to use an external library (although your question specifically asked for one), write the parser yourself. It's really not hard with string.split(":") and the like.
You're right, my bad, I had mentioned a library was OK. STL isn't the right term for Python; I was borrowing from my C++ days and meant the standard library. The issue is that App Engine doesn't allow us to install any library, only to use the standard library. Is dateutil all Python code (no C)? If so, I might be able to include it from source in my source tree rather than having to install it. I think that might work. Thanks!
@DebnathSinha: I don't think dateutil is pure Python. BTW, STL is only nearly the right term from your C++ days :) There's a nice article: stackoverflow.com/questions/5205491/…
@DebnathSinha: It might be that dateutil is Python only: bazaar.launchpad.net/~dateutil/dateutil/trunk/files
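
For the standard-library-only route discussed in these comments, here is a minimal sketch. It swaps the suggested string.split(":") approach for datetime.strptime with a short list of candidate formats. The helper name parse_time and the format list are assumptions made for illustration, and %p matching depends on the locale (an English or C locale is assumed).

```python
from datetime import datetime, time

# Candidate formats, tried in order. They cover the three inputs from the
# question; the list is an assumption and may need extending for other data.
_FORMATS = ("%I:%M%p", "%H:%M")


def parse_time(text: str) -> time:
    """Parse strings such as '8:26', '08:26am' or '13:27' (hypothetical helper)."""
    # Normalize whitespace and case so '08:26 am' and '08:26AM' look the same.
    cleaned = text.strip().upper().replace(" ", "")
    for fmt in _FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).time()
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"unrecognized time string: {text!r}")
```

The 12-hour format is tried first so that a string with an am/pm suffix is never read as 24-hour time; strings without a suffix fall through to %H:%M.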
16

There is one such function in pandas:

import pandas as pd
d = pd.to_datetime('<date_string>')


0

Use a regex to cut the string into ['year', 'month', 'day', 'hour', 'minutes', 'seconds'], then unpack the pieces into the datetime constructor, datetime.datetime(year, month, day, hour=0, minute=0, second=0, microsecond=0, tzinfo=None, *, fold=0). This is the fastest way I have tested so far. Note that it expects a full date (and optionally a time) with zero-padded fields in year-first order.

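    import re
    import datetime
    import timeit

    import pandas as pd
    from dateutil import parser  # used by the dateutil benchmark

    def date2timestamp_anyformat(format_date):
        # Join all digit runs; assumes zero-padded fields in
        # year, month, day, hour, minute, second, fraction order.
        numbers = ''.join(re.findall(r'\d+', format_date))
        if len(numbers) == 8:
            d = datetime.datetime(int(numbers[:4]), int(numbers[4:6]), int(numbers[6:8]))
        elif len(numbers) == 14:
            d = datetime.datetime(int(numbers[:4]), int(numbers[4:6]), int(numbers[6:8]), int(numbers[8:10]), int(numbers[10:12]), int(numbers[12:14]))
        elif len(numbers) > 14:
            # Right-pad the fraction to microseconds ('234' -> 234000) and keep at most 6 digits.
            microsecond = int(numbers[14:20].ljust(6, '0'))
            d = datetime.datetime(int(numbers[:4]), int(numbers[4:6]), int(numbers[6:8]), int(numbers[8:10]), int(numbers[10:12]), int(numbers[12:14]), microsecond=microsecond)
        else:
            raise ValueError(f'length not match:{format_date}')
        # A naive datetime is interpreted as local time by timestamp().
        return d.timestamp()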

Speed test:

    # Note: the [:-1] slice drops the final '44.234' field, so this variant ignores the seconds.
    print('regex cut:\n',timeit.timeit(lambda: datetime.datetime(*map(int, re.split(r'-|:|\s', '2022-08-13 12:23:44.234')[:-1])).timestamp(), number=10000))
    print('pandas to_datetime:\n', timeit.timeit(lambda: pd.to_datetime('2022-08-13 12:23:44.234').timestamp(), number=10000))
    print('datetime with known format:\n',timeit.timeit(lambda: datetime.datetime.strptime('2022-08-13 12:23:44.234', '%Y-%m-%d %H:%M:%S.%f').timestamp(), number=10000))
    print('regex get number first:\n',timeit.timeit(lambda: date2timestamp_anyformat('2022-08-13 12:23:44.234'), number=10000))
    print('dateutil parse:\n', timeit.timeit(lambda: parser.parse('2022-08-13 12:23:44.234').timestamp(), number=10000))

Result (total seconds for 10,000 calls each):

regex cut:
 0.040550945326685905
pandas to_datetime:
 0.8012433210387826
datetime with known format:
 0.09105705469846725
regex get number first:
 0.04557646345347166
dateutil parse:
 0.6404162347316742
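
The same regex idea can be adapted to the time-only strings from the question. The following is a rough sketch, not a benchmarked implementation: the name regex_time and the pattern are assumptions for illustration, and only "H:MM" with an optional am/pm suffix is handled.

```python
import re
from datetime import time

# Hours, minutes, and an optional am/pm marker. The pattern is an assumption
# tailored to inputs like '8:26', '08:26am' and '13:27'.
_TIME_RE = re.compile(r"^\s*(\d{1,2}):(\d{2})\s*([ap]m)?\s*$", re.IGNORECASE)


def regex_time(text: str) -> time:
    """Convert a 12-hour or 24-hour time string to datetime.time (hypothetical helper)."""
    match = _TIME_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized time string: {text!r}")
    hour, minute = int(match.group(1)), int(match.group(2))
    suffix = (match.group(3) or "").lower()
    if suffix:
        # With an am/pm marker the hour must be on the 12-hour clock.
        if not 1 <= hour <= 12:
            raise ValueError(f"hour out of range for 12-hour time: {text!r}")
        # 12am maps to 0 and 12pm maps to 12; other hours shift by 12 for pm.
        hour = hour % 12 + (12 if suffix == "pm" else 0)
    # datetime.time itself rejects out-of-range values such as 25:00.
    return time(hour, minute)
```

A string without a suffix is treated as 24-hour time, which matches how the question describes "8:26" and "13:27".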

