I want to use Python's regular expression module (re) to find files whose names follow this pattern:

000014_L_20111026T194932_1.txt
000014_L_20111026T194937_2.txt
...
000014_L_20111026T194928_12.txt

The file names I want end with an underscore '_', followed by a number (one or more digits), followed by the '.txt' extension. I used the following regular expression, but it didn't match the names above:

match = re.match('_(\d+)\.txt$', file)

What is the correct regex to match these file names?

1 Answer

You need to use re.search() instead; re.match() only matches at the start of the string, whereas re.search() scans the string for the first position where the pattern matches. Your pattern is otherwise fine:

>>> re.search(r'_(\d+)\.txt$', '000014_L_20111026T194928_12.txt')
<_sre.SRE_Match object at 0x10e8b40a8>
>>> re.search(r'_(\d+)\.txt$', '000014_L_20111026T194928_12.txt').group(1)
'12'
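
To apply this to a set of file names, here is a minimal sketch. The helper name numbered_txt_files and the extra name 'notes.txt' are made up for illustration; the pattern is the one from the question, written as a raw string so that \d is passed through to the regex engine unchanged.

```python
import re

# Pattern from the question, compiled once and written as a raw string.
SUFFIX_RE = re.compile(r'_(\d+)\.txt$')


def numbered_txt_files(names):
    """Return (name, number) pairs for names ending in _<digits>.txt.

    Hypothetical helper, not part of the original answer.
    """
    results = []
    for name in names:
        m = SUFFIX_RE.search(name)  # search() scans the whole name
        if m:
            results.append((name, int(m.group(1))))
    return results


names = [
    '000014_L_20111026T194932_1.txt',
    '000014_L_20111026T194937_2.txt',
    '000014_L_20111026T194928_12.txt',
    'notes.txt',  # illustrative non-matching name
]
matches = numbered_txt_files(names)
print(matches)
```

In a real script the names list would probably come from something like os.listdir() on the directory of interest.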

2 Comments

Thanks Martijin. This works. So even if I use '$' to anchor the pattern to the end of the string, .match() doesn't search from the end?
@tonga: No; $ is an anchor, it'll only match at the end of the string; it won't dictate where the search begins. You should see .match() as adding an implicit ^ to your pattern.
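
A small sketch of the difference, using one of the file names from the question. The variable names are illustrative only, and the '.*' prefix is just one possible way to keep using .match(), not something the answer prescribes.

```python
import re

name = '000014_L_20111026T194928_12.txt'
pattern = r'_(\d+)\.txt$'

# match() only tries the pattern at position 0; the name starts with '0',
# not '_', so this is None even though the pattern ends with '$'.
at_start = re.match(pattern, name)

# search() tries every position, so it finds the '_12.txt' suffix.
anywhere = re.search(pattern, name)

# match() can still be used if the pattern itself accounts for the prefix.
prefixed = re.match(r'.*' + pattern, name)

print(at_start, anywhere.group(1), prefixed.group(1))
```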