I'm using Python and the re module to parse some strings and extract a 4-digit code associated with a prefix. Here are 2 examples of strings I would have to parse:

str1 = "random stuff tokenA1234 more stuff"
str2 = "whatever here tokenB5678 tokenA0123 and more there"

tokenA and tokenB are the prefixes, and 1234, 5678, 0123 are the digits I need to grab. tokenA and tokenB are just examples here: the prefix can be something like an address, http://domain.com/ (tokenA), or a string like Id: ('[Ii]d:?\s?') (tokenB).

My regex looks like:

re.findall('.*?(?:tokenA([0-9]{4})|tokenB([0-9]{4})).*?', str1)

When parsing the 2 strings above, I get:

[('1234','')]
[('','5678'),('0123','')]

I'd like to simply get ['1234'] or ['5678','0123'] instead of a list of tuples. How can I modify the regex to achieve that? Thanks in advance.

2 Answers

1

You get tuples as a result since you have more than 1 capturing group in your regex. See re.findall reference:

If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.

So, the solution is to use only one capturing group.
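To make the quoted rule concrete, here is a small sketch comparing the shape of the re.findall result for zero, two, and one capturing groups. The input is str2 from the question; the variable names are only for illustration.

```python
import re

s = "whatever here tokenB5678 tokenA0123 and more there"

# No capturing group: each result is the whole match, as a string.
no_group = re.findall(r'token[AB][0-9]{4}', s)

# Two capturing groups: each result is a tuple, with '' for the
# group that did not take part in that particular match.
two_groups = re.findall(r'tokenA([0-9]{4})|tokenB([0-9]{4})', s)

# One capturing group: each result is that group's text, as a string.
one_group = re.findall(r'(?:tokenA|tokenB)([0-9]{4})', s)

print(no_group)    # ['tokenB5678', 'tokenA0123']
print(two_groups)  # [('', '5678'), ('0123', '')]
print(one_group)   # ['5678', '0123']
```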

Only the tokens differ between your two alternatives; the ([0-9]{4}) part is common to both. So move the alternation into a non-capturing group that contains just the tokens, and follow it with a single capturing group for the digits:

(?:tokenA|tokenB)([0-9]{4})
^^^^^^^^^^^^^^^^^

The regex means:

  • (?:tokenA|tokenB) - match, but do not capture, tokenA or tokenB
  • ([0-9]{4}) - match four digits and capture them into Group 1

Demo:

import re
s = "tokenA1234tokenB34567"
print(re.findall(r'(?:tokenA|tokenB)([0-9]{4})', s)) 

Result: ['1234', '3456']
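The same idea should carry over to the more realistic prefixes mentioned in the question. The following is a sketch only: it assumes the URL prefix is literal text (hence re.escape) and that the Id prefix is already a regex fragment; the sample text and variable names are made up for illustration.

```python
import re

# Prefixes taken from the question's description (assumed, not verified
# against real data).
url_prefix = re.escape("http://domain.com/")  # literal text, so escape it
id_prefix = r"[Ii]d:?\s?"                      # already a regex fragment

# All prefixes go into one non-capturing group; the digits are the
# only capturing group, so findall returns plain strings.
pattern = re.compile("(?:" + "|".join([url_prefix, id_prefix]) + ")([0-9]{4})")

# Hypothetical sample input.
text = "see http://domain.com/1234 and Id: 5678, also id0123"
codes = pattern.findall(text)
print(codes)  # ['1234', '5678', '0123']
```

Building the alternation with "|".join keeps the pattern readable if more prefixes are added later.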

1

Simply do this:

re.findall(r"token[AB](\d{4})", s)

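[AB] is a character class, so it matches either A or B. Note that this only works when the prefixes differ by a single character, as tokenA and tokenB do; for unrelated prefixes you need an alternation instead.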
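As a quick check, here is the character-class pattern applied to the two strings from the question (a minimal sketch; the result variable names are only for illustration):

```python
import re

str1 = "random stuff tokenA1234 more stuff"
str2 = "whatever here tokenB5678 tokenA0123 and more there"

# A single capturing group, so findall returns a flat list of strings.
result1 = re.findall(r"token[AB](\d{4})", str1)
result2 = re.findall(r"token[AB](\d{4})", str2)

print(result1)  # ['1234']
print(result2)  # ['5678', '0123']
```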
