1

What is a Pythonic way to test a string against several regex patterns and extract the matched groups from the first pattern that succeeds?

That is to say, what is the Pythonic equivalent of the following code snippet?

if re.match(pattern1, string):
    m = re.match(pattern1, string)
    grps = m.groups()
    ...[process matched groups for pattern1]...
elif re.match(pattern2, string):
    m = re.match(pattern2, string)
    grps = m.groups()
    ...[process matched groups for pattern2]...
elif re.match(pattern3, string):
    m = re.match(pattern3, string)
    grps = m.groups()
    ...[process matched groups for pattern3]...
1
  • How complicated are these regex patterns? You could combine them into a single regex, or it may be easier to put them into a list (or tuple) and loop over it. Commented Oct 7, 2016 at 12:55
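
To illustrate the "single regex" idea from the comment above, here is a minimal sketch. The three alternatives and the names `COMBINED` and `classify` are hypothetical placeholders, not the asker's actual patterns. Each alternative is wrapped in a named group so that `lastgroup` on the match object reports which one matched.

```python
import re

# Hypothetical patterns, each wrapped in a named group so the
# alternative that matched can be identified afterwards.
COMBINED = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2})"
    r"|(?P<number>\d+)"
    r"|(?P<word>[A-Za-z]+)"
)

def classify(string):
    """Return (group name, matched text) for the first alternative
    that matches at the start of the string, or None."""
    m = COMBINED.match(string)
    if m is None:
        return None
    return m.lastgroup, m.group(m.lastgroup)
```

Alternatives are tried left to right, so their order plays the role of the if/elif order. One caveat: any capturing groups inside the individual patterns get renumbered once they are merged, so this tends to suit simple patterns best.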

3 Answers

3
import re

patterns = [pattern1, pattern2, pattern3]
for pattern in patterns:
    m = re.match(pattern, string)
    if m:
        grps = m.groups()
        ...
        break  # stop at the first pattern that matches

1 Comment

Different patterns need to be processed differently, so there would still be an additional if...else nested inside the loop. I'm not sure that is Pythonic.
0

Make a function for each pattern that handles the resulting groups. Then you can use a list comprehension:

lst = [(pattern1, func1), (p2, f2)]  # ... one (pattern, handler) pair per pattern
matches = ((re.match(patt, theStr), func) for patt, func in lst)
results = [func(match.groups()) for match, func in matches if match]
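
Note that the comprehension calls the handler of every pattern that matches, whereas the if/elif chain in the question stops at the first match. If first-match behaviour is what is wanted, a loop with an early return is one option. The sketch below is only an illustration: the two patterns, the handler functions, and the names `HANDLERS` and `dispatch` are all hypothetical.

```python
import re

# Hypothetical handlers: each receives the tuple from match.groups().
def handle_assignment(groups):
    key, value = groups
    return {"key": key, "value": value}

def handle_comment(groups):
    return {"comment": groups[0]}

# Order matters: earlier pairs win, like the branches of an if/elif chain.
HANDLERS = [
    (re.compile(r"(\w+)\s*=\s*(.*)"), handle_assignment),
    (re.compile(r"#\s*(.*)"), handle_comment),
]

def dispatch(string):
    """Call the handler of the first matching pattern.

    Returns None if no pattern matches.
    """
    for pattern, handler in HANDLERS:
        m = pattern.match(string)
        if m:
            return handler(m.groups())
    return None
```

Keeping the per-pattern logic in separate functions avoids the nested if...else raised in the comment on the first answer.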


0

Basically what I want is:

m = re.match(pattern1, string)
if m:
    grps = m.groups()
    print(pattern1)
else:
    m = re.match(pattern2, string)
    if m:
        grps = m.groups()
        print(pattern2)

And this is the best style I have found, which mimics the case statement found in many other languages:

while True:
    m = re.match(pattern1, string)
    if m:
        grps = m.groups()
        print(pattern1)
        break
    m = re.match(pattern2, string)
    if m:
        grps = m.groups()
        print(pattern2)
        break
    break  # no pattern matched: leave the loop instead of spinning forever

I'm not sure whether I invented case in Python, but looking at the two snippets, the second keeps a constant indentation depth, while the depth of the first grows linearly with the number of patterns. As my algorithms professor taught us, O(1) is better than O(n).
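
For what it's worth, on Python 3.8 or later an assignment expression (the `:=` operator from PEP 572) seems to give the same constant indentation without the loop. A minimal sketch, assuming Python 3.8+; the function name `describe` and the three patterns are hypothetical stand-ins for pattern1, pattern2, and pattern3:

```python
import re

def describe(string):
    """Flat if/elif chain using assignment expressions (Python 3.8+)."""
    # Hypothetical patterns; each branch binds m and tests it in one step.
    if (m := re.match(r"(\d+)-(\d+)", string)):
        low, high = m.groups()
        return f"range from {low} to {high}"
    elif (m := re.match(r"(\d+)", string)):
        return f"single number {m.group(1)}"
    elif (m := re.match(r"([A-Za-z]+)", string)):
        return f"word {m.group(1)}"
    return "no match"
```

Each pattern is matched only once, and each branch can process its groups differently.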
