0

I have a string s that contains two dates. I am trying to extract both dates and subtract one from the other to count the number of days in between. In the end I am aiming for a string like this: s = "o4_24d_20170708_20170801"

At the company where I work we can't install additional packages, so I am looking for a solution using native Python. Below is what I have so far using the datetime module, which only extracts one date. How can I get both dates out of the string?

import re, datetime
s = "o4_20170708_20170801"
# re.search returns only the first match, so only one date is extracted
match = re.search(r'\d{4}\d{2}\d{2}', s)
date = datetime.datetime.strptime(match.group(), '%Y%m%d').date()
print(date)
  • How exactly do you arrive at "o4_24d_20170708_20170801"? In your input ("o4_20170708_20170801") 24 does not exist anywhere. Commented May 3, 2018 at 14:01
  • regex seems overly complicated for this. Why don't you use s.split('_')? Commented May 3, 2018 at 14:02
  • Could you post a list of possible inputs? If all of them are the same length, you can just access by index, or split by _ Commented May 3, 2018 at 14:03
  • @Ajax1234 sorry maybe it was not clear; I am planning to subtract these two dates and then indicate it in the string. Commented May 3, 2018 at 14:04
  • @Susenio: unfortunately the input is not always the same length; the strings are taken from filenames with various extensions, e.g. "o5_20170808_20160801_test.tv14.tif". Commented May 3, 2018 at 14:06

2 Answers

3
from datetime import datetime
import re

s = "o4_20170708_20170801"
pattern = re.compile(r'(\d{8})_(\d{8})')
dates = pattern.search(s)
# dates[0] is the full match; dates[1] and dates[2] are the captured groups
# (indexing a match object like this requires Python 3.6+)
start = datetime.strptime(dates[1], '%Y%m%d')
end = datetime.strptime(dates[2], '%Y%m%d')
difference = end - start

print(difference.days)

will print

24

then, you could do something like:

days = 'd{}_'.format(difference.days)
match_index = dates.start()
new_name = s[:match_index] + days + s[match_index:]
print(new_name)

to get

o4_d24_20170708_20170801
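
If you want the exact "24d" form from the question, and filenames with trailing parts such as "o5_20170808_20160801_test.tv14.tif", the same idea can be wrapped in a small function. This is only a sketch under a few assumptions: add_day_count is a made-up helper name, the two dates are always 8 digits joined by an underscore, and the day count is taken as an absolute value so the order of the dates does not matter. strptime will still raise ValueError for digit runs that are not valid dates.

```python
import re
from datetime import datetime

# Two 8-digit dates separated by an underscore, e.g. "20170708_20170801".
DATE_PAIR = re.compile(r'(\d{8})_(\d{8})')


def add_day_count(name):
    """Insert '<days>d_' before the first date pair in name (hypothetical helper)."""
    match = DATE_PAIR.search(name)
    if match is None:
        return name  # no date pair found, leave the name unchanged
    start = datetime.strptime(match.group(1), '%Y%m%d')
    end = datetime.strptime(match.group(2), '%Y%m%d')
    # Assumption: the sign does not matter, so use the absolute difference.
    days = abs((end - start).days)
    prefix = name[:match.start()]
    rest = name[match.start():]
    return '{}{}d_{}'.format(prefix, days, rest)


print(add_day_count("o4_20170708_20170801"))
print(add_day_count("o5_20170808_20160801_test.tv14.tif"))
```

Everything after the date pair (suffixes, extensions) is copied through untouched, so the input length does not matter.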

0
import re, datetime
s = "o4_20170708_20170801"
# re.findall returns every match as a list of strings
match = re.findall(r'\d{4}\d{2}\d{2}', s)
for a_date in match:
    date = datetime.datetime.strptime(a_date, '%Y%m%d').date()
    print(date)

This will print:

2017-07-08
2017-08-01

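Your regex was already working correctly (I checked it at regexpal). The issue is that re.search returns only the first match, whereas re.findall returns all of them.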
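
To go one step further and get the number of days between the two dates, the results can be unpacked into two separate variables and subtracted. A minimal sketch, assuming the string contains exactly two 8-digit dates (the names first, second, and days_between are just illustrative; the unpacking raises ValueError if a different number of dates is found):

```python
import re
from datetime import datetime

s = "o4_20170708_20170801"

# findall returns every 8-digit run; this assumes exactly two are present.
first, second = [
    datetime.strptime(d, '%Y%m%d').date()
    for d in re.findall(r'\d{8}', s)
]

# Subtracting two date objects gives a timedelta; .days is the whole-day count.
days_between = (second - first).days
print(days_between)  # 24
```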

4 Comments

The regex can just be simplified to \d{8} as well
Sure, I think the issue was with re.search not doing what he wanted rather than the regex itself
@NathanBlaine Thank you, this looks very promising. However, I would like to subtract the dates from each other in order to calculate the days in between. When I type 'date' into the console I get 'datetime.date(2017,7,8)'. Is it possible to write them into two different variables in order to calculate the days in between the two dates?
@kurdtc it sure is possible! You seem like a capable dude though I bet you can figure it out ;)
