I am using regular expressions in Python to find dates like 09/2010 or 8/1976, but not 11/12/2010. I am using the following pattern, but it does not work in some cases.

r'([^/](0?[1-9]|1[012])/(\d{4}))'
4
  • Is this enough \b(?:\d{1,2}\/)?\d{1,2}\/\d{2,4}\b ? Commented Jul 25, 2019 at 15:32
  • No, that does not work. It returns 24/1990 (which is drawn from 5/24/1990). Commented Jul 25, 2019 at 15:39
  • \b(?:\d{1,2}\/)?\d{1,2}\/\d{2}(?:\d{2})?\b With the \b it forces it to be the start of the word. Commented Jul 25, 2019 at 15:39
  • That uses month instead of day in a case like 11/1985 Commented Jul 25, 2019 at 15:45

3 Answers

1

This somewhat explicit code uses re.sub and datetime.strptime to parse and validate the input string:

import re
import datetime

s = '09/2010, 8/1976, 11/8/2010, 09/06/15, 12/1987, 13/2011, 09/13/2001'

r = re.compile(r'\b(\d{1,2})/(?:(\d{1,2})/)?(\d{2,4})\b')

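def validate_date(g, parsed_values):
    # Build a dd/mm/yyyy string; month/year matches get a placeholder day of 01.
    if g.group(2) is not None:
        s = '{:02d}/{:02d}/{:04d}'.format(*map(int, g.groups()))
    else:
        s = '01/{:02d}/{:04d}'.format(int(g.group(1)), int(g.group(3)))

    try:
        # strptime raises ValueError for impossible dates such as month 13.
        datetime.datetime.strptime(s, '%d/%m/%Y')
        parsed_values.append(g.group())
    except ValueError:
        pass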

parsed_values = []
r.sub(lambda g: validate_date(g, parsed_values), s)

print(parsed_values)

Prints:

['09/2010', '8/1976', '11/8/2010', '09/06/15', '12/1987']

EDIT: Shortened the code.
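
For comparison, here is a sketch of the same validation written with re.finditer instead of re.sub, so the matches are collected in a plain loop rather than through a callback. The function name find_valid_dates is made up for this example; the pattern and the dd/mm/yyyy check are taken from the code above.

```python
import re
from datetime import datetime

# Same pattern as in the answer: month/year or a three-part date.
DATE_RE = re.compile(r'\b(\d{1,2})/(?:(\d{1,2})/)?(\d{2,4})\b')

def find_valid_dates(text):
    """Return the matched strings that strptime accepts as dd/mm/yyyy."""
    found = []
    for m in DATE_RE.finditer(text):
        first, second, year = m.groups()
        if second is None:
            # month/year only: use a placeholder day of 01
            candidate = '01/{:02d}/{:04d}'.format(int(first), int(year))
        else:
            candidate = '{:02d}/{:02d}/{:04d}'.format(
                int(first), int(second), int(year))
        try:
            datetime.strptime(candidate, '%d/%m/%Y')
        except ValueError:
            continue  # impossible date, e.g. month 13
        found.append(m.group())
    return found

s = '09/2010, 8/1976, 11/8/2010, 09/06/15, 12/1987, 13/2011, 09/13/2001'
print(find_valid_dates(s))
```

Because the check still reads three-part dates as day/month/year, a string like 09/13/2001 is rejected here as well.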

1
import re

rgx = r"(?:\d{1,2}\/)?\d{1,2}\/\d{2}(?:\d{2})?"
dates = "09/2010, 8/1976, 11/12/2010, 09/06/15 .."

result = re.findall(rgx, dates)
print(result)
# ['09/2010', '8/1976', '11/12/2010', '09/06/15']
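
Note that this pattern also returns full dates such as 11/12/2010, which the question wanted to leave out. One possible way to keep only month/year strings is to add lookarounds that reject a digit or slash on either side. The pattern below is my own assumption, not something from the question or the comments, and it is only checked against the short sample string shown.

```python
import re

# Assumed pattern (not from the answer): month 1-12, a slash, a four-digit
# year, with lookarounds rejecting a neighbouring digit or slash.
MONTH_YEAR = re.compile(r'(?<![\d/])(?:0?[1-9]|1[0-2])/\d{4}(?![\d/])')

sample = '09/2010, 8/1976, 11/12/2010, 5/24/1990, 13/2011'
month_year_only = MONTH_YEAR.findall(sample)
print(month_year_only)
# ['09/2010', '8/1976']
```

The lookbehind is what stops 24/1990 from being pulled out of 5/24/1990, and it also keeps 3/2011 from being pulled out of 13/2011.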

1 Comment

You should have specified you were working with dataframes in your question. For your future questions try giving more context so we can help you better :)
0

After working on this problem, I came to this solution, which works very well:

df['text'].str.extractall(r'(?P<Date>(?P<month>\d{1,2})/?(?P<day>\d{1,2})?/(?P<year>\d{2,4}))')
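
To see what each named group in this pattern captures, here is a small sketch using the plain re module on a made-up sample string (the sample text is an assumption, not my real data). str.extractall returns the same named groups as DataFrame columns.

```python
import re

# Same pattern as passed to str.extractall above.
PATTERN = re.compile(
    r'(?P<Date>(?P<month>\d{1,2})/?(?P<day>\d{1,2})?/(?P<year>\d{2,4}))'
)

sample = 'seen 5/24/1990 and 11/1985'  # hypothetical input text
rows = [m.groupdict() for m in PATTERN.finditer(sample)]
for row in rows:
    print(row)
```

For a month/year string such as 11/1985 the day group is empty, so it comes back as None here and as NaN in the DataFrame.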
