I have a regex in Python that contains several named groups. However, text that would match one group is missed when an earlier group has already matched the same characters, because overlapping matches don't seem to be allowed. As an example:
import re
myText = 'sgasgAAAaoasgosaegnsBBBausgisego'
myRegex = re.compile('(?P<short>(?:AAA))|(?P<long>(?:AAA.*BBB))')
x = myRegex.findall(myText)
print(x)
Produces the output:
[('AAA', '')]
The 'long' group does not find a match because 'AAA' was already used up by the match for the preceding 'short' group.
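To illustrate, here is a minimal sketch using finditer on the same pattern and text. The snake_case variable names and the `matches` list are my own additions for this check. It records which named alternative matched, the span it consumed, and the contents of every named group:

```python
import re

my_text = 'sgasgAAAaoasgosaegnsBBBausgisego'
my_regex = re.compile(r'(?P<short>(?:AAA))|(?P<long>(?:AAA.*BBB))')

# For each match, record the name of the group that matched (lastgroup),
# the consumed span, and the value of every named group.
matches = [(m.lastgroup, m.span(), m.groupdict())
           for m in my_regex.finditer(my_text)]

print(matches)
# [('short', (5, 8), {'short': 'AAA', 'long': None})]
```

There is only one match, it belongs to 'short', and the scan resumes after position 8, so the 'long' alternative never gets a chance to start at the same 'AAA'.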
I've tried to find a method to allow overlapping but failed. As an alternative, I've been looking for a way to run each named group separately. Something like the following:
for g in myRegex.groupindex.keys():
    match = re.findall(***regex_for_named_group_g***, myText)
Is it possible to extract the regex for each named group?
Ultimately, I'd like to produce a dictionary output (or similar) like:
{'short':'AAA',
'long':'AAAaoasgosaegnsBBB'}
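For reference, a minimal sketch of the per-group idea that gives this output. It assumes the sub-patterns are kept in a separate dict (the hypothetical `patterns` below) rather than extracted from the compiled regex, and it uses re.search, so only the first match of each pattern is kept:

```python
import re

my_text = 'sgasgAAAaoasgosaegnsBBBausgisego'

# Hypothetical dict: one sub-pattern per name, kept apart
# instead of being joined into a single regex with '|'.
patterns = {
    'short': r'AAA',
    'long': r'AAA.*BBB',
}

result = {}
for name, pattern in patterns.items():
    # Each pattern scans the whole text on its own, so a match for one
    # name cannot consume characters needed by another.
    m = re.search(pattern, my_text)
    if m:
        result[name] = m.group()

print(result)
# {'short': 'AAA', 'long': 'AAAaoasgosaegnsBBB'}
```

This sidesteps the overlap problem, but it does not answer whether the sub-pattern for each named group can be recovered from an already compiled regex.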
Any and all suggestions would be gratefully received.