3

I am attempting to parse the following date strings obtained from email headers:

from dateutil import parser
d1 = parser.parse('Tue, 28 Jun 2011 01:46:52 +0200')
d2 = parser.parse('Mon, 11 Jul 2011 10:01:56 +0200 (CEST)')
d3 = parser.parse('Wed, 13 Jul 2011 02:00:01 +0000 (GMT+00:00)')

The third one fails; am I missing something obvious?

3
  • 5
    Have you tried parser.parse('...', fuzzy=True)? Commented Jul 18, 2011 at 8:03
  • phimuemue, add that as an answer and I will accept it! Commented Jul 18, 2011 at 8:06
  • eryksun, that is a good suggestion. Commented Jul 18, 2011 at 8:15

2 Answers 2

4

Have you tried parser.parse('...', fuzzy=True)? (I suppose it works. :))


1 Comment

Yes, it works. The problem is the extra "+00:00" after "GMT", as pointed out below. The fuzzy option ignores it.
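For what it's worth, if the strings always come from email headers, the standard library may be enough on its own. Below is a minimal sketch that swaps dateutil for email.utils.parsedate_to_datetime, which is not what the answer above uses. It assumes RFC 2822-style dates and Python 3.3 or later. As far as I can tell, a trailing parenthesised comment such as "(GMT+00:00)" is simply ignored, because only the numeric offset is read:

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# The three header values from the question.
headers = [
    'Tue, 28 Jun 2011 01:46:52 +0200',
    'Mon, 11 Jul 2011 10:01:56 +0200 (CEST)',
    'Wed, 13 Jul 2011 02:00:01 +0000 (GMT+00:00)',
]

# Each result is a timezone-aware datetime built from the numeric offset.
d1, d2, d3 = (parsedate_to_datetime(h) for h in headers)

for d in (d1, d2, d3):
    print(d.isoformat(), d.utcoffset())
```

Unlike fuzzy=True, this does not skip arbitrary unknown tokens, so it is stricter about what it accepts; whether that is preferable depends on how messy the headers are.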
2

Give the parsedatetime library a try.

In [16]: import parsedatetime.parsedatetime as pdt

In [17]: p = pdt.Calendar()

In [18]: p.parse("Wed, 13 Jul 2011 02:00:01 +0000 (GMT+00:00)")
Out[18]: ((2011, 7, 20, 0, 0, 0, 2, 201, -1), 3)

3 Comments

But is it correct? I have difficulty interpreting the tuple. Where is the "13", for example?
It seems that this parser is confused and thinks the "Wed" refers to tomorrow July 20, which is the closest Wednesday.
It looks like parsedatetime always prefers future dates. There is a comment in its source code: # if that day and month have already passed in this year, then increment the year by 1
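To make the tuple easier to read: its first element has the same nine-field layout as time.struct_time (year, month, day, hour, minute, second, weekday, day of year, DST flag). So the "13" from the input is not in there at all; the day field is 20. Here is a small sketch using only the standard library, with the tuple from Out[18] copied in by hand rather than produced by parsedatetime. The second element, 3, is to my knowledge parsedatetime's flag for "parsed as both a date and a time", but treat that as an assumption:

```python
import time
from datetime import datetime

# Tuple copied from Out[18] above; parsedatetime is not called here.
fields, flag = ((2011, 7, 20, 0, 0, 0, 2, 201, -1), 3)

# Wrap the nine fields so they can be read by name.
st = time.struct_time(fields)
print(st.tm_year, st.tm_mon, st.tm_mday)  # year, month, day of month
print(st.tm_wday)                         # Monday is 0, so 2 means Wednesday
print(st.tm_yday)                         # day of the year

# The first six fields are enough to build a naive datetime.
parsed = datetime(*fields[:6])
print(parsed.isoformat())
```

Read this way, the result is midnight on Wednesday, July 20, 2011, which matches the observation above that the parser latched onto "Wed" and ignored the rest of the header.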
