A program outputs a file with lines of the following format:

{Foo} Bar Bacon {Egg}

where Foo and Egg could, but do not have to, be made up of several words. Bar and Bacon are always single words.

I need to get Bar into a variable for my further code. I imagine this would work if I split the string on a matching regular expression. That would return a list of the four elements, so I could easily get the second element with list[1].

How would I write such a regular expression?

I need to split the string on single spaces ' ', but only if that space is not inside curly braces.

\s(?=[a-zA-Z{}]) gives me all the spaces and thus behaves exactly like ' '. How can I exclude the spaces in the curly braces?

  • Try re.search(r'(?<=} )\S+', str).group() Commented Mar 6, 2018 at 21:21

2 Answers

You can try {[^}]*}\s(\w+)

>>> import re
>>> print(re.search(r'{[^}]*}\s(\w+)', '{Foo} Bar Bacon {Egg}').group(1))
Bar

Explanation:

  • {[^}]*} first you match the first section inside curly braces
  • \s then a whitespace
  • (\w+) then the second section; you put it in a capturing group, so it's available in search results as group(1)

re.search(pattern, string, flags=0)

Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding match object. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

https://docs.python.org/3/library/re.html#re.search
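Since re.search returns None when nothing matches, calling .group(1) directly on the result raises AttributeError for a line without a braced section. A minimal sketch of a guarded version follows; the helper name first_word_after_braces is made up for illustration, and the pattern is the one from this answer.

```python
import re

# Same pattern as above: a braced section, one whitespace character, then a word.
PATTERN = re.compile(r'{[^}]*}\s(\w+)')

def first_word_after_braces(line):
    """Return the word following the first {...} section, or None if there is none.

    Hypothetical helper, not part of the re module.
    """
    match = PATTERN.search(line)
    return match.group(1) if match else None

print(first_word_after_braces('{Foo} Bar Bacon {Egg}'))      # Bar
print(first_word_after_braces('{Foo Goo} Bar Bacon {Egg}'))  # Bar
print(first_word_after_braces('no braces here'))             # None
```

Compiling the pattern once is optional; it is only a convenience if the function is called for every line of the file.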

This might help.

>>> import re
>>> line = '{Foo} Bar Bacon {Egg}'
>>> m = re.search(r'}\s+(\S+)\s+', line)
>>> m.group(1)
'Bar'

I just searched for the first word that follows a closing brace. I used () to capture that word so that I could access it later with m.group(1).

If you really want all four elements, try re.findall():

>>> line = '{Foo Goo} Bar Bacon {Egg Foo}'
>>> re.findall(r'{.*?}|\S+', line)
['{Foo Goo}', 'Bar', 'Bacon', '{Egg Foo}']
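To get back to the list[1] idea from the question, the findall() call can be wrapped in a small function. This is only a sketch: the name split_fields is an assumption for illustration, and it presumes braces are never nested.

```python
import re

def split_fields(line):
    """Split a line into tokens, keeping each {...} group as one token.

    Hypothetical helper; assumes curly braces are balanced and not nested.
    """
    # A token is either a whole {...} group (non-greedy) or a run of non-space characters.
    return re.findall(r'{.*?}|\S+', line)

fields = split_fields('{Foo Goo} Bar Bacon {Egg Foo}')
print(fields[1])  # Bar
```

The order of the alternatives matters: the braced alternative is tried first, so a space inside braces never ends a token.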
