1

I have a string in the form of:

"[NUM : NAME : NUM]: [NUM : NAME : NUM]:..."

I want to extract all the NAMEs from this string. A NAME can contain any character: letters, digits, and punctuation. NUM always has the form [0-9]+

I tried this:

re.findall(r"\[[0-9]+\:([.]+)\:[0-9]+\]", string)

But instead of returning each NAME separately, it bunches several [NUM : NAME : NUM] items into the [.]+ group. That is a valid match for this regex, but not what I need.

Any help would be much appreciated.

2 Answers

2

Try this:

re.findall(r"\[[0-9]+\:(.+?)\:[0-9]+\]", string)

Adding ? after the + makes the quantifier non-greedy. By default + is greedy: it takes as many characters as possible while still allowing the overall pattern to match. With the ?, it takes as few characters as possible while still allowing a match, so each NAME is captured separately.

The above works if there are no spaces between NUM, :, and NAME.

If there are spaces, use:

re.findall(r"\[[0-9]+ \: (.+?) \: [0-9]+\]", string)
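To make the greedy vs. non-greedy difference concrete, here is a small sketch. The sample strings are made up for illustration (they are not from the question), and the colons are left unescaped, which is equivalent to the escaped form.

```python
import re

# Hypothetical sample input without spaces around the colons
compact = "[1:foo:2]:[3:bar, baz!:4]"

# Greedy: .+ runs to the last ":NUM]" it can find, swallowing both items
greedy = re.findall(r"\[[0-9]+:(.+):[0-9]+\]", compact)
print(greedy)  # ['foo:2]:[3:bar, baz!']

# Non-greedy: .+? stops at the first ":NUM]" that completes a match
lazy = re.findall(r"\[[0-9]+:(.+?):[0-9]+\]", compact)
print(lazy)  # ['foo', 'bar, baz!']

# Hypothetical sample input with exactly one space around each colon
spaced = "[1 : foo : 2]: [3 : bar : 4]"
spaced_names = re.findall(r"\[[0-9]+ : (.+?) : [0-9]+\]", spaced)
print(spaced_names)  # ['foo', 'bar']
```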

1
  • First, you have enclosed . inside a character class. There it loses its special meaning and matches only a literal dot (.).

  • Second, you are not accounting for the spaces after the numbers in your string.

  • Third, you need a reluctant (non-greedy) quantifier on the .+ in the center. So replace ([.]+) with (.+?).

  • Fourth, you don't need to escape the colons (:).

You can try this:

>>> re.findall(r'\[[0-9]+[ ]*:(.+?):[ ]*[0-9]+\]', string)
6: [' NAME ', ' NAME ']
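Note that the captured NAMEs above keep their surrounding spaces. If trimmed names are wanted, one possible variation (an assumption on my part, not something the question asked for) is to let \s* absorb the optional whitespace on both sides of each colon, outside the group. The helper name extract_names and the sample inputs are hypothetical.

```python
import re

def extract_names(text):
    """Return every NAME from '[NUM : NAME : NUM]' items, without padding.

    \\s* outside the group consumes optional whitespace around the colons,
    and the lazy (.+?) stops as soon as the rest of the pattern can match.
    """
    pattern = r"\[[0-9]+\s*:\s*(.+?)\s*:\s*[0-9]+\]"
    return re.findall(pattern, text)

# Hypothetical inputs: mixed spacing, and a NAME that itself contains a colon
print(extract_names("[1 : foo : 2]: [3:bar baz:4]"))  # ['foo', 'bar baz']
print(extract_names("[5 : a:b : 6]"))                 # ['a:b']
```

A colon inside a NAME still works here because the group only ends where a colon is followed by digits and a closing bracket.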

1 Comment

Thank you Rohit, your version works well, and your explanation helped me understand regex better.
