I have a string from which I want to extract the key information:

gbk_kings_common_20171201_20180131_66000.0k_2017-12-01_TO_2018-01-31_id12_1277904128.csv

Namely, I would like to find the following:

  1. File identifier, e.g. gbk_kings_common_20171201_20180131
  2. Size, e.g. 66000.0k
  3. Date, e.g. 2017-12-01_TO_2018-01-31
  4. Type of id, e.g. id12_1277904128

But I'm having difficulty writing the regex because the file identifier can vary in length, although the rest of the information is fairly fixed when delimited by underscores.

  • Have you written any code for this already? Commented Aug 29, 2018 at 7:41
  • Do you have any control over the string? If so, it would be much easier to change the character separating the different parts, since having _ both inside the identifier and between the parts of the string makes parsing much more difficult. Commented Aug 29, 2018 at 7:45
  • Could you specify how the file identifier varies? Does it simply add more dates? Commented Aug 29, 2018 at 7:47

1 Answer

You can use the pattern r'(.*)_(.*)_([\d-]+_TO_[\d-]+)_(id[\d_]*)' to search your string.

>>> import re
>>> s = "gbk_kings_common_20171201_20180131_66000.0k_2017-12-01_TO_2018-01-31_id12_1277904128.csv"
>>> sre = re.search(r'(.*)_(.*)_([\d-]+_TO_[\d-]+)_(id[\d_]*)', s)
>>> file_id, size, date, type_id = sre.groups()
>>> print(file_id, size, date, type_id)
gbk_kings_common_20171201_20180131 66000.0k 2017-12-01_TO_2018-01-31 id12_1277904128
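
If the layout of the name is as fixed as the question suggests, a stricter variant may be easier to maintain. The following is only a sketch under assumptions that are not confirmed by the question: that the size is always a number followed by k, that both dates are in YYYY-MM-DD form, that the id part is always id<digits>_<digits>, and that the name ends in .csv. The names FILENAME_RE and parse_filename are hypothetical.

```python
import re

# Hypothetical stricter pattern with named groups.
# Assumptions: size looks like "66000.0k", dates are YYYY-MM-DD,
# the id part is "id<digits>_<digits>", and the file ends in ".csv".
FILENAME_RE = re.compile(
    r'^(?P<file_id>.+)'
    r'_(?P<size>\d+(?:\.\d+)?k)'
    r'_(?P<date>\d{4}-\d{2}-\d{2}_TO_\d{4}-\d{2}-\d{2})'
    r'_(?P<type_id>id\d+_\d+)'
    r'\.csv$'
)


def parse_filename(name):
    """Return the four fields as a dict, or None if the name does not match."""
    match = FILENAME_RE.match(name)
    return match.groupdict() if match else None


s = "gbk_kings_common_20171201_20180131_66000.0k_2017-12-01_TO_2018-01-31_id12_1277904128.csv"
fields = parse_filename(s)
print(fields)
```

Because each part after the file identifier is spelled out explicitly, a name that does not follow the layout returns None instead of producing a surprising split.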

2 Comments

I can see that gbk_kings_common_20171201_20180131_66000.0k_ refers to the groups (.*)_(.*) but how do you know that the second (.*) refers to 66000.0k and not something like 20171201_20180131_66000.0k?
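@winnie99. The first (.*) is greedy, so it takes as much as it can while still letting the rest of the pattern match. The second (.*) must be followed by _ and the dates matched by ([\d-]+_TO_[\d-]+), so the first (.*) matches everything up to the last but one _ before the dates, which leaves only 66000.0k for the second group.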
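
To see that greediness is what decides the split, here is a small comparison, offered as an illustrative sketch rather than part of the original answer. It runs the answer's pattern next to a variant whose first group is lazy, (.*?); the variable names tail, greedy and lazy are made up for this example.

```python
import re

s = "gbk_kings_common_20171201_20180131_66000.0k_2017-12-01_TO_2018-01-31_id12_1277904128.csv"

# Shared end of the pattern: dates and id part, as in the answer.
tail = r'_([\d-]+_TO_[\d-]+)_(id[\d_]*)'

# Greedy first group: takes as much as it can, so the second group
# is left with only the size.
greedy = re.search(r'(.*)_(.*)' + tail, s).groups()

# Lazy first group (for comparison only): takes as little as it can,
# so the second group absorbs the rest of the identifier and the size.
lazy = re.search(r'(.*?)_(.*)' + tail, s).groups()

print(greedy[:2])
print(lazy[:2])
```

Both splits are valid matches for the string; the engine simply prefers the longest possible first group when it is written as (.*).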
