1

I have these two kinds of string:

1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip

1635508858063-1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip

How can I extract the 111040 part of the string using a regex? It always has 6 digits.

My approach is: "Take the 6 digit code after the YYYY_MM_DD_HH_MM_SS_ part", but any other approach is also welcome.

EDIT: The last part, _0CM.csv.zip, is susceptible to change.

Thanks in advance.

4
  • 1
    Can't you split on underscore and take the second-to-last element? Commented Nov 2, 2021 at 8:02
  • Do you need to use regex? Commented Nov 2, 2021 at 8:05
  • 2
    Isn't 111040 six digits? Commented Nov 2, 2021 at 8:07
  • It's 6, yes. Sorry, my mistake. Commented Nov 2, 2021 at 8:12
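
The split suggestion from the comments can be sketched as follows. This is only an illustration: the helper name `code_by_split` is hypothetical, and it assumes the part after the code (here `0CM.csv.zip`) never contains an underscore, which the question's edit does not guarantee.

```python
def code_by_split(filename: str) -> str:
    """Return the second-to-last underscore-separated field.

    Assumption: the suffix after the 6-digit code contains no underscore.
    """
    return filename.split('_')[-2]


names = [
    "1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip",
    "1635508858063-1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip",
]
for name in names:
    print(code_by_split(name))
```

If the suffix can gain underscores, counting fields from the end stops working, and anchoring on the timestamp with a regex is the safer choice.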

3 Answers

2

You wanted a regex so here it is:

[0-9]{4}(?:_[0-9]{2}){5}_([0-9]{6})
  • [0-9]{4}: matches the 4 digits of the year; this is our starting anchor
  • (?:_[0-9]{2}){5}: the year is followed by 5 two-digit numbers (month, day, hour, minute, second), each preceded by an underscore, so we can put them in a non-capturing group and skip over them
  • _([0-9]{6}): captures the 6 digits that follow the timestamp.

The desired number is in capture group 1 of this regex:

import re
regex = '[0-9]{4}(?:_[0-9]{2}){5}_([0-9]{6})'
# group(1) holds the 6-digit code captured right after the timestamp
print(re.search(regex, '1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip').group(1))
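
As a minimal sketch of how this pattern might be wrapped for reuse, the snippet below applies it to both filename shapes from the question and to a string with no timestamp. The names `TIMESTAMP_CODE` and `extract_code` are hypothetical, and returning `None` on a failed match is an assumption about what the caller wants.

```python
import re
from typing import Optional

# Year, five two-digit fields, then the captured 6-digit code
TIMESTAMP_CODE = re.compile(r'[0-9]{4}(?:_[0-9]{2}){5}_([0-9]{6})')


def extract_code(filename: str) -> Optional[str]:
    """Return the 6-digit code after the timestamp, or None if absent."""
    match = TIMESTAMP_CODE.search(filename)
    return match.group(1) if match else None


names = [
    "1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip",
    "1635508858063-1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip",
    "Test",
]
for name in names:
    print(extract_code(name))
```

Checking the match object before calling `group(1)` avoids the `AttributeError` that a bare `re.search(...).group(1)` raises when nothing matches.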

1 Comment

My fault, it's 6 digits. Changing the last {5} to {6} works! Thank you
1

How about this pattern? It works if you match the input line by line:

import re
# Raw string so that \d reaches the regex engine instead of being read as a string escape
pattern = re.compile(r'\d{4}_\d{2}_\d{2}_\d{2}_\d{2}_\d{2}_(\d{6})')
print(pattern.findall("1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip"))
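
Since `findall` returns a list, matching line by line might look like the sketch below. The multi-line `text` value and the choice of `None` for lines without a match are assumptions made for illustration.

```python
import re

pattern = re.compile(r'\d{4}_\d{2}_\d{2}_\d{2}_\d{2}_\d{2}_(\d{6})')

# Hypothetical input: one filename per line, plus a line with no timestamp
text = """1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip
1635508858063-1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip
Test"""

codes = []
for line in text.splitlines():
    found = pattern.findall(line)  # list of captured groups, possibly empty
    codes.append(found[0] if found else None)

print(codes)  # ['111040', '111040', None]
```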


1

This will return '' if an appropriate match isn't found.

import re

strings = [
    "1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip",
    "1635508858063-1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip",
    'Test'
]

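# Raw string so that \d reaches the regex engine instead of being read as a string escape
pattern = re.compile(r'_(\d{6})_')

# Search each string once; fall back to '' when there is no match
matches = (pattern.search(string) for string in strings)
digits = [match.group(1) if match else '' for match in matches]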

print(digits)
