Assume I have a string as follows:
2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00
Each date is followed by one or more times. Can a regular expression find all the times after each date, giving a result like this?
[('2021/12/23', '13:00','14:00'), ('2021/12/24', '13:00','14:00','15:00')]
I tried the following code in Python, but it returns only the last time after each date:
re.findall(r'(\d+/\d+/\d+)(\s\d+\:\d+)+','2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00')
>>>[('2021/12/23', ' 14:00'), ('2021/12/24', ' 15:00')]
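For context, a repeated capture group in `re` keeps only its last repetition, which is why only ' 14:00' and ' 15:00' come back. A minimal sketch of one workaround using only the standard library, assuming the times directly follow the date separated by single whitespace characters: capture the whole run of times as one group, then split it in a second step. The helper name `dates_with_times` is hypothetical.

```python
import re

def dates_with_times(text):
    # Hypothetical helper: group 1 is the date, group 2 is the whole run of times.
    # The inner (?:...) group is non-capturing, so the outer group keeps every repetition.
    pattern = re.compile(r'(\d+/\d+/\d+)((?:\s\d+:\d+)+)')
    # split() with no argument breaks the run on whitespace and drops the leading space.
    return [(date, *times.split()) for date, times in pattern.findall(text)]

result = dates_with_times('2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00')
print(result)
```

This produces one tuple per date, with the date first and its times after it, in the shape asked for above.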
Appendix: The original problem is actually more complicated: there is arbitrary text between the times, and it is hard to simply strip that text out by replacing it with '':
2021/12/31 14:00 start 15:00 end 17:00 pending 18:00 ok 2021/12/31 14:00 begin 15:00 end 17:00 start 18:00 suspend
So the robust approach here is the one from this answer, which relies on the third-party regex module and its capturesdict() method:
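import regex  # third-party module (pip install regex), not the built-in re
pattern = regex.compile(r'(?P<date>\d{4}/\d+/\d+)(?:\s?(?P<time>(\d+:\d+))\s+([^\d]+))+')
for m in pattern.finditer('2021/12/31 14:00 start 15:00 end 17:00 pending 18:00 ok 2021/12/30 14:00 begin 15:00 end 17:00 ggh 18:00 suspend'):
    # capturesdict() returns every capture of each named group, not just the last one
    print(m.capturesdict())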
>>> {'date': ['2021/12/31'], 'time': ['14:00', '15:00', '17:00', '18:00']}
{'date': ['2021/12/30'], 'time': ['14:00', '15:00', '17:00', '18:00']}
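If installing the regex module is not an option, something similar can probably be done with the built-in `re` alone. A rough sketch, assuming dates always look like YYYY/M/D and times like H:MM, and that the words between times never contain such patterns: split the text on the dates, then collect the times in each chunk. The helper name `times_by_date` is hypothetical.

```python
import re

def times_by_date(text):
    # Hypothetical helper. Because the split pattern has a capture group,
    # re.split keeps the dates: ['', date1, chunk1, date2, chunk2, ...]
    parts = re.split(r'(\d{4}/\d+/\d+)', text)
    dates, chunks = parts[1::2], parts[2::2]
    # Each chunk is the text following one date; pull every time out of it.
    return [(d, re.findall(r'\d+:\d+', c)) for d, c in zip(dates, chunks)]

text = ('2021/12/31 14:00 start 15:00 end 17:00 pending 18:00 ok '
        '2021/12/30 14:00 begin 15:00 end 17:00 ggh 18:00 suspend')
by_date = times_by_date(text)
print(by_date)
```

Unlike the capturesdict() version, this ignores the text between the times entirely, so it does not check that each time is followed by a word.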