I would like to extract only the date part of a dataframe column whose values look like this:

$83,00010/7/2016       10/7/2016  # is the date here
$721,0002/7/2015       2/7/2015  # is the date here
$3,00012/12/2015       12/12/2015  # is the date here

I tried patterns like

^(?!,\d{3}) and (\$[,\d{3}]*\,\d{3}) 

but I am not able to pick out exactly the date part.

  • Is the format month/day/year or day/month/year? Commented May 30, 2019 at 5:13
  • @Karthik Is it 02 or just 2? How would you know? Commented May 30, 2019 at 5:31
  • The month part has no fixed format. It can be 2 in some places and 02 in others. The only thing we know for sure is it starts after the comma followed by 3 digits which denotes the end of currency. Commented May 30, 2019 at 6:25
  • And the format is month/day/year. Commented May 30, 2019 at 6:26

2 Answers

This is slightly tricky, so let's start with a simple expression and expand it from there. Assuming the money part always contains a comma followed by three digits, we might try:

,\d{3}(\d{1,2}\/.+?\/\d{4})

or, without anchoring on the comma:

([1-3]?\d\/.+?\/\d{4})
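To see how the two expressions behave, here is a minimal sketch that runs both on the sample values from the question. The fourth value and the helper `first_group` are hypothetical, added only to illustrate a case where the amount ends in a digit from 1 to 3; they are not from the original post.

```python
import re

# The two patterns from this answer, unchanged.
anchored = re.compile(r",\d{3}(\d{1,2}\/.+?\/\d{4})")
unanchored = re.compile(r"([1-3]?\d\/.+?\/\d{4})")

samples = [
    "$83,00010/7/2016",
    "$721,0002/7/2015",
    "$3,00012/12/2015",
    "$83,0012/7/2016",  # hypothetical value: amount $83,001, date 2/7/2016
]


def first_group(pattern, text):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


# One (input, anchored result, unanchored result) tuple per sample.
results = [(s, first_group(anchored, s), first_group(unanchored, s)) for s in samples]

for text, with_comma, without_comma in results:
    print(f"{text!r}: anchored={with_comma!r}, unanchored={without_comma!r}")
```

On the three original samples both patterns return the same date. On the hypothetical fourth value the comma-anchored pattern returns 2/7/2016, while the unanchored one absorbs the last digit of the amount and returns 12/7/2016, so anchoring on the comma looks like the safer choice when amounts do not always end in 000.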

Test

import re

# Capture an optional leading 1-3, one more digit, then /.../ and a 4-digit year.
regex = r"([1-3]?\d\/.+?\/\d{4})"

test_str = ("$83,00010/7/2016 ----> 10/7/2016 is the date here,\n"
    "$721,0002/7/2015 ----> 2/7/2015 is the date here,\n"
    "$3,00012/12/2015  -----> 12/12/2015 is the date here")

# finditer yields every non-overlapping match, so each line produces two:
# the date glued to the amount and the date repeated after the arrow.
for match_num, match in enumerate(re.finditer(regex, test_str), start=1):

    print("Match {match_num} was found at {start}-{end}: {match}".format(match_num=match_num, start=match.start(), end=match.end(), match=match.group()))

    # Capture groups are numbered from 1; group 0 is the whole match.
    for group_num in range(1, len(match.groups()) + 1):

        print("Group {group_num} found at {start}-{end}: {group}".format(group_num=group_num, start=match.start(group_num), end=match.end(group_num), group=match.group(group_num)))

The following regex captures the date in group 1:

',\d{3}(\d{1,2}/\d{1,2}/\d{4})$'
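Since the question is about a dataframe column, one way to apply this pattern is pandas' `Series.str.extract`. This is a sketch under assumptions: the frame `df` and the column names `raw`, `date_text` and `date` are hypothetical, and the values are the three samples from the question.

```python
import pandas as pd

# Hypothetical frame and column name; the values are the samples from the question.
df = pd.DataFrame({"raw": ["$83,00010/7/2016", "$721,0002/7/2015", "$3,00012/12/2015"]})

# expand=False returns a Series because the pattern has a single capture group.
df["date_text"] = df["raw"].str.extract(r",\d{3}(\d{1,2}/\d{1,2}/\d{4})$", expand=False)

# Parse as month/day/year, the order confirmed in the comments on the question.
df["date"] = pd.to_datetime(df["date_text"], format="%m/%d/%Y")

print(df[["raw", "date_text", "date"]])
```

Rows where the pattern does not match get a missing value in `date_text`, so they are easy to find and inspect afterwards.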
