119

The function to get a datetime from a string, datetime.strptime(date_string, format), requires a format string as the second argument. Is there a way to build a datetime from a string without knowing the exact format, and have Python best-guess it?

5
  • 2
    possible duplicate of Is there any python library for parsing dates and times from a natural language? Commented Feb 29, 2012 at 22:32
  • 7
    Differentiating between mm/dd/yyyy vs. dd/mm/yyyy is an interesting problem, with disastrous results if you get it wrong. Commented Feb 29, 2012 at 22:46
  • 1
    It depends how inexact you mean to be when you say, "without the exact format." Could you give examples of the types of inputs you want to be able to handle? Or, could you potentially have partial info about the format (such as whether the year is 2 or 4 digits, or whether the month precedes the day or vice versa)? Without at least some basic info, even a person can't do what you ask. Is 01/02/12 Feb 1st 2012, Jan 2nd 2012, Feb 12th 2001, Dec 2nd 2001, or something else? Commented Feb 29, 2012 at 23:46
  • 1
    github.com/jeffreystarr/dateinfer Commented Jul 26, 2017 at 17:54
  • @denfromufa I get the following error while importing dateinfer on Python3: from infer import infer ModuleNotFoundError: No module named 'infer' Commented Jun 21, 2018 at 6:47

5 Answers

183

Use the dateutil library.

I was already using dateutil as an indispensable lib for handling timezones
(See Convert UTC datetime string to local datetime and How do I convert local time to UTC in Python?)

And I've just realized it has date parsing support:

import dateutil.parser
yourdate = dateutil.parser.parse(datestring)

(See also How do I translate a ISO 8601 datetime string into a Python datetime object?)


6 Comments

Great suggestion. It can parse any formatted date/time from a string.
I know it's old, but it doesn't handle this date string: "Thursday, 21 May 2020 07:05:00 GMT", because the day name is written in full. Any suggestions for that one?
Good approach for a single string but not great for an array
@YoëlZerbib I just tested your string. Seems like it has been fixed. I am on Python 3.10.2
Can't believe this actually exists, the amount of hassle datetime objects have caused me over the years, sigh. Thank you.
36

You can get away with a simple function if you only need to handle dates (not times) in a few known formats.

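import datetime

def get_date(s_date):
    # Formats to try, in order; extend this list to cover your inputs.
    date_patterns = ["%d-%m-%Y", "%Y-%m-%d"]

    for pattern in date_patterns:
        try:
            return datetime.datetime.strptime(s_date, pattern).date()
        except ValueError:
            # The string does not match this pattern; try the next one.
            pass

    # No pattern matched; the function then returns None.
    print("Date is not in expected format: %s" % s_date)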
2 Comments

Much quicker than using dateutil provided your date format is covered.
I think this enumerative approach with silent fails on all attempted bad formats can be best used to handle edge cases in an error handler after the usual (standard, expected) date format conversion has already failed.
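The fallback idea in the comment above could be sketched roughly as follows, using only the standard library. The names parse_date, EXPECTED_FORMAT and FALLBACK_FORMATS are hypothetical, and the formats listed are assumptions to be replaced with whatever your data actually uses. Unlike the function in the answer, this sketch raises when nothing matches instead of printing.

```python
from datetime import date, datetime

# Hypothetical configuration: the format you normally receive, plus
# edge-case formats to try only after the expected one has failed.
EXPECTED_FORMAT = "%Y-%m-%d"
FALLBACK_FORMATS = ["%d-%m-%Y", "%d/%m/%Y"]


def parse_date(s_date: str) -> date:
    """Parse with the expected format first, then fall back to the others."""
    try:
        return datetime.strptime(s_date, EXPECTED_FORMAT).date()
    except ValueError:
        # Expected format failed; try the edge-case formats in order.
        for pattern in FALLBACK_FORMATS:
            try:
                return datetime.strptime(s_date, pattern).date()
            except ValueError:
                continue
    # Nothing matched: raise instead of failing silently.
    raise ValueError(f"Date is not in an expected format: {s_date!r}")
```

Calling parse_date("25-12-2018") would go through the fallback list, while parse_date("2018-12-25") returns on the first attempt.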
10

Back before I was a Python guy, I was a Perl guy. One thing I've always missed, and haven't seen anything close to, is Date::Manip. That module can extract a good timestamp from a smattering of nibbles. I almost suspect that its author struck a deal with the Devil.

I've run across a few things that take stabs at it in Python:

If you find anything better I'd love to hear about it though.

4 Comments

Thanks for the recommendations- See my answer though, think I found my own answer with the dateutil library.
what the heck is a smattering of nibbles?
@wordsforthewise A nibble is a half-byte, a smattering is a small, scattered amount.
hilarious, sounds like something out of hairy potter
9

You can use datefinder. It detects dates written in many natural-language styles.

import datefinder # Module used to find different style of date with time

string_value = " created 01/15/2005 by ACME inc.and associates.January 4th,2017 at 8pm"
matches = datefinder.find_dates(string_value)            
for match in matches:
    print("match found ",match)

Output

match found  2005-01-15 00:00:00
match found  2017-01-04 20:00:00

3 Comments

Unlike dateutil, datefinder can't parse a bare month, eg. "July" (without either a day or a year.) This is kind of a major limitation that would seem to be a trivial fix.
Can't find the date in 02-08-2021 - 10_789_0107987_1_165
Ah shame, wish it could output the string format to parse dates with
3

If pandas is already imported, it has a function that fits the bill: pd.to_datetime. In my experience this works with a wide range of date formats.

Be careful with ambiguity about day/month first: is 01/02/2000 the first of February, or the 2nd of January?

Demo:

import pandas as pd

dts = ['2018-09-30',
       '2020-9-8',
       '25-12-2018',
       '2018-12-25 23:50:55',
       '10:15:35.889 AM',
       '10:15:35.889 PM',
       '2018-12-25 23:50:55.999',
       '2018-12-25 23:50:55.999 +0530'
       ]

# Parse each string on its own so every row gets its own best guess.
pd.DataFrame([{'string': dt, 'datetime': pd.to_datetime(dt)} for dt in dts])


Note that the third value in the list triggers the warning UserWarning: Parsing dates in %d-%m-%Y format when dayfirst=False (the default) was specified. The warning appears because this value is clearly day first. If that were not clear, the value would be assumed to be month first, potentially giving the wrong datetime.
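The day/month ambiguity can also be checked with just the standard library before trusting any best-guess parser. This is only a minimal sketch: possible_readings and CANDIDATE_FORMATS are hypothetical names, and it assumes slash-separated dates with a four-digit year.

```python
from datetime import datetime

# Two plausible readings of the same numeric date (assumed for illustration).
CANDIDATE_FORMATS = {"day first": "%d/%m/%Y", "month first": "%m/%d/%Y"}


def possible_readings(s_date: str) -> dict:
    """Return every candidate reading under which s_date is a valid date."""
    readings = {}
    for label, pattern in CANDIDATE_FORMATS.items():
        try:
            readings[label] = datetime.strptime(s_date, pattern).date()
        except ValueError:
            pass  # not a valid date under this reading
    return readings


print(possible_readings("01/02/2000"))  # both readings succeed: ambiguous
print(possible_readings("25/12/2018"))  # only the day-first reading succeeds
```

If more than one reading comes back, the string is ambiguous and the day/month order has to come from somewhere other than the string itself.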
