I'm trying to write a regex pattern that matches either a number or a number followed by a trailing string. So the match should output:
Matching "string100": [('100', '')]
Matching "string900_TYPE": [('900', 'TYPE')]
But instead, I get:
Matching "string100": [('100', '')]
Matching "string900_TYPE": [('900', ''), ('', 'TYPE')]
The idea is to have the number as the first item in the tuple and the "TYPE" as the second, so I can easily determine whether "TYPE" exists in a tuple or not (second tuple item is empty --> '').
Code:
import re

stringList = ["string100", "string900_TYPE"]
pattern = r"(\d{3})|\w(TYPE)"

for string in stringList:
    match = re.findall(pattern, string)
    print('Matching "' + string + '":\t', match)
Thanks in advance.
Comments:

Use (\d{3})(?:_(\w+))? instead. Don't use an alternation, describe the full string.

+(\d{3})_?(\w+)? which requires fewer steps for the regex engine. Is there any reason for using a non-capturing group around the subgroup (\w+)?

If you use (\d{3})_?(\w+)? (note there must be no + at the start) then you may also match 123_. (\d{3})(?:_(\w+))? is best here since (?:_(\w+))? makes the whole sequence of patterns optional.

The + must have slipped in somehow... 123_ does not seem to be captured, though.
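To make the comments concrete, here is a minimal sketch comparing the two patterns they discuss. The original alternation returns two tuples because each alternative matches on its own, and re.findall reports one tuple per match. The names GROUPED, LOOSE and groups_and_match, and the extra sample strings "string123_" and "string900TYPE", are made up for illustration and are not from the question.

```python
import re

# Pattern from the comments: three digits, optionally followed by "_" plus a word.
GROUPED = re.compile(r"(\d{3})(?:_(\w+))?")
# Variant from the comments: "_" and the word are each optional on their own.
LOOSE = re.compile(r"(\d{3})_?(\w+)?")


def groups_and_match(pattern, text):
    """Return (captured groups, full matched text) for the first match, or None."""
    m = pattern.search(text)
    if m is None:
        return None
    # Groups that took no part in the match are None; report '' like re.findall does.
    return tuple(g or "" for g in m.groups()), m.group(0)


# One tuple per input string: number first, trailing type (or '') second.
for s in ["string100", "string900_TYPE"]:
    print('Matching "' + s + '":\t', GROUPED.findall(s))

# Where the two patterns differ: a dangling "_" and a word with no "_" before it.
for s in ["string123_", "string900TYPE"]:
    print(s, groups_and_match(GROUPED, s), groups_and_match(LOOSE, s))
```

With the looser pattern the captured groups for "string123_" are the same, but the overall match swallows the trailing underscore, and "string900TYPE" yields 'TYPE' even though there is no underscore. Whether that matters depends on what the real input looks like.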