1

I have the following string

my_string = "this data is F56 F23 and G87"

And I would like to use regex to return the following output

['F56 F23', 'G87']

So basically, I'm interested in returning all the parts of the string that start with either F or G and are followed by two numbers. In addition, if there are multiple consecutive occurrences I would like regex to group them together.

I approached the problem in Python with this code:

import re
re.findall(r'\b(F\d{2}|G\d{2})\b', my_string)

I was able to get all the occurrences

['F56', 'F23', 'G87']

But I would like to have the first two groups together since they are consecutive occurrences. Any ideas of how I can achieve that?

1
  • yep, I mean that there's just one white space between them Commented May 15, 2017 at 15:44

3 Answers

4

You can use this regex:

\b[FG]\d{2}(?:\s+[FG]\d{2})*\b

The non-capturing group (?:\s+[FG]\d{2})* matches zero or more additional F/G codes, each preceded by at least one whitespace character, so consecutive codes end up in a single match.

Code:

>>> import re
>>> my_string = "this data is F56 F23 and G87"
>>> re.findall(r'\b[FG]\d{2}(?:\s+[FG]\d{2})*\b', my_string)
['F56 F23', 'G87']
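
To make the pieces of that pattern easier to see, here is a minimal sketch that spells out the same regex in verbose mode (`re.VERBOSE`), where whitespace and `#` comments inside the pattern are ignored. The names `CODE_RUN` and `runs` are made up for this illustration and are not part of the original answer.

```python
import re

# Same pattern as above, written in verbose mode so each part can be annotated.
CODE_RUN = re.compile(
    r"""
    \b [FG] \d{2}            # first code: F or G followed by two digits
    (?: \s+ [FG] \d{2} )*    # zero or more further codes, each after whitespace
    \b                       # the run must end on a word boundary
    """,
    re.VERBOSE,
)

my_string = "this data is F56 F23 and G87"
runs = CODE_RUN.findall(my_string)
print(runs)  # ['F56 F23', 'G87']
```

Because the pattern contains no capturing groups, `findall` returns the full text of each match rather than tuples of sub-groups.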

3

So basically, I'm interested in returning all the parts of the string that start with either F or G and are followed by two numbers. In addition, if there are multiple consecutive occurrences I would like regex to group them together.

You can do this with:

\b(?:[FG]\d{2})(?:\s+[FG]\d{2})*\b

if the codes must be separated by at least one whitespace character. If that is not a requirement, you can use:

\b(?:[FG]\d{2})(?:\s*[FG]\d{2})*\b

For the string in the question, both regexes produce the same result:

>>> import re
>>> my_string = "this data is F56 F23 and G87"
>>> re.findall(r'\b(?:[FG]\d{2})(?:\s+[FG]\d{2})*\b', my_string)
['F56 F23', 'G87']
>>> re.findall(r'\b(?:[FG]\d{2})(?:\s*[FG]\d{2})*\b', my_string)
['F56 F23', 'G87']
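
The two patterns only differ when codes appear with no whitespace between them. As a small sketch of that difference, the snippet below runs both on a hypothetical input, `"F56F23 G87"`, which is not from the question; the variable names are also just for illustration.

```python
import re

WITH_SPACE = r'\b(?:[FG]\d{2})(?:\s+[FG]\d{2})*\b'      # requires whitespace between codes
OPTIONAL_SPACE = r'\b(?:[FG]\d{2})(?:\s*[FG]\d{2})*\b'  # whitespace between codes is optional

sample = "F56F23 G87"  # hypothetical input: first two codes are glued together

strict = re.findall(WITH_SPACE, sample)
loose = re.findall(OPTIONAL_SPACE, sample)

print(strict)  # ['G87']
print(loose)   # ['F56F23 G87']
```

With `\s+`, the glued `F56F23` is rejected entirely because there is no word boundary after `F56`; with `\s*`, the whole run is returned as one match.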


0
import re
print(list(map(lambda x: x[0].strip(), re.findall(r'((\b(F\d{2}|G\d{2})\b\s*)+)', my_string))))

Change your regex to r'((\b(F\d{2}|G\d{2})\b\s*)+)': wrap the whole pattern in an extra pair of brackets, add \s* so that codes connected by whitespace are matched together, and put a + after the inner bracket to match one or more consecutive occurrences (greedy).

re.findall now returns a list of tuples, one tuple per match with one element per capturing group, and you need the first element of each. You can use map and lambda for this. To remove the trailing whitespace I used strip().
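
As a rough illustration of why the first element is the one to keep, the sketch below shows what `re.findall` returns for this pattern and one possible alternative that reads the whole match through `re.finditer` instead of indexing tuples. The names `pattern`, `raw`, and `groups` are chosen for this example only.

```python
import re

my_string = "this data is F56 F23 and G87"
pattern = r'((\b(F\d{2}|G\d{2})\b\s*)+)'

# With three capturing groups, findall yields one 3-tuple per match.
raw = re.findall(pattern, my_string)
print(raw)  # [('F56 F23 ', 'F23 ', 'F23'), ('G87', 'G87', 'G87')]

# group(0) is the whole match, so no tuple indexing is needed.
groups = [m.group(0).strip() for m in re.finditer(pattern, my_string)]
print(groups)  # ['F56 F23', 'G87']
```

The outer group captures the full run including the trailing space consumed by `\s*`, which is why `strip()` is still needed.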
