
I have some text like the following:

between [... between (...)] and (...) as ...

with the regex

between\s+(.+?)\s+and\s+(.+?)\s+as

I am trying to capture what is inside the two pairs of parentheses, i.e., the contents bounded by "between" and "and", and by "and" and "as". However, the regex always returns the content in the square brackets together with the second pair of parentheses.

EDIT:

For example, if the text is:

between foo whatever between bar and dummy as

I want the regex to extract 'bar' and 'dummy', not 'foo whatever between bar' and 'dummy'.
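The behaviour described above can be reproduced with a short script. This is a minimal sketch that assumes Python's `re` module, since the question does not name a regex flavor; the variable names are illustrative only.

```python
import re

# Sample text from the question.
text = "between foo whatever between bar and dummy as"

# The regex from the question, unchanged.
pattern = re.compile(r"between\s+(.+?)\s+and\s+(.+?)\s+as")

# search() starts at the FIRST "between"; the lazy group then grows
# until "and" can match, so it swallows the second "between" as well.
groups = pattern.search(text).groups()
print(groups)  # ('foo whatever between bar', 'dummy')
```

The lazy quantifier only makes the group as short as possible from a fixed starting point. It does not move the starting point to the later "between".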

  • Regular expressions can't easily deal with nested structures; you need a recursive descent parser. Commented Apr 10, 2018 at 21:13
  • It's not clear what the output should be. Please clarify. Commented Apr 10, 2018 at 21:19
  • Prepend your regex with .* Commented Apr 10, 2018 at 21:27
  • @Barmar is it possible to use a negative lookahead? I cannot make it work, though. Commented Apr 10, 2018 at 21:27
  • No, negative lookahead won't help. I assume you allow arbitrary levels of nesting. Commented Apr 10, 2018 at 21:28

3 Answers


Prepend your regex with .* or use a negative lookahead:

between(?!.*between\b)\s+(.+?)\s+and\s+(.+?)\s+as
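As a quick check of both suggestions, here is a small sketch that assumes Python's `re` module (the flavor is not stated in the thread). The names `greedy_prefix` and `lookahead` are made up for this example.

```python
import re

text = "between foo whatever between bar and dummy as"

# Option 1: a greedy .* prefix consumes as much as possible, so the
# literal "between" ends up matching the LAST occurrence in the text.
greedy_prefix = re.compile(r".*between\s+(.+?)\s+and\s+(.+?)\s+as")

# Option 2: the negative lookahead rejects any "between" that is
# followed, anywhere later in the text, by another "between".
lookahead = re.compile(r"between(?!.*between\b)\s+(.+?)\s+and\s+(.+?)\s+as")

prefix_groups = greedy_prefix.search(text).groups()
lookahead_groups = lookahead.search(text).groups()

print(prefix_groups)     # ('bar', 'dummy')
print(lookahead_groups)  # ('bar', 'dummy')
```

Both variants anchor on the last "between" in the input, which is what the sample text needs. They are not equivalent in general, because the `.*` version also includes the leading text in the overall match.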



8 Comments

This doesn't match the word dummy
Please check live demo. It does. @asdf
Eh, I'm not a fan of that lookahead. If the word "between" appears anywhere later in the string, your regex won't match anything. For example, "between x and y as between" gives no match. You should rewrite that to (?:(?!between\b).)*.
@revo, yes, it does if the word "as" is prepended, which it was not in the post, but it appears that is the context.
Is it? I think it's about making the text matched by the first group as short as possible. If that's what the question is about, why would you suggest prepending .*? That does something completely different than your lookahead.
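The objection in these comments can be checked with a small sketch. It assumes Python's `re` module, and it assumes one reading of the suggested rewrite, namely that the (?:(?!between\b).)* construct replaces the first capture group's `.+?`. The name `tempered` is illustrative only.

```python
import re

# Lookahead version from the answer.
lookahead = re.compile(r"between(?!.*between\b)\s+(.+?)\s+and\s+(.+?)\s+as")

# Assumed reading of the comment: the first group may consume any
# character as long as the word "between" does not start there.
tempered = re.compile(
    r"between\s+((?:(?!between\b).)+?)\s+and\s+(.+?)\s+as"
)

nested = "between foo whatever between bar and dummy as"
trailing = "between x and y as between"

nested_groups = tempered.search(nested).groups()
trailing_groups = tempered.search(trailing).groups()
lookahead_trailing = lookahead.search(trailing)

print(nested_groups)       # ('bar', 'dummy')
print(trailing_groups)     # ('x', 'y')
print(lookahead_trailing)  # None
```

With this reading, a stray "between" after the match no longer blocks it, because the restriction applies only inside the first group.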

For your specific problem, you need to match the whitespace leading up to the "between" in question (which seems to be the limiting factor). I was able to get the desired result with the following:

^.+(\s)between\s(.+?)\sand\s(.+?)$
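To see what this pattern captures on the sample text, here is a hedged sketch assuming Python's `re` module. The `variant` pattern is not from the answer. It is a hypothetical adjustment that drops the whitespace group and stops the last group at "as".

```python
import re

text = "between foo whatever between bar and dummy as"

# Pattern exactly as posted: group 1 is the whitespace before the
# last "between", and the final lazy group must reach "$".
as_posted = re.compile(r"^.+(\s)between\s(.+?)\sand\s(.+?)$")
posted_groups = as_posted.search(text).groups()
print(posted_groups)  # (' ', 'bar', 'dummy as')

# Hypothetical variant: no whitespace capture, and a trailing "as"
# outside the group so the second operand stops before it.
variant = re.compile(r"^.+\sbetween\s(.+?)\sand\s(.+?)\sas\b")
variant_groups = variant.search(text).groups()
print(variant_groups)  # ('bar', 'dummy')
```

Note that both forms require at least one character plus whitespace before the "between" they match, so neither matches a text that contains only a single "between" at the very start.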


Try this:

between\s([^\s]+)\sand\s([^\s]+)

https://regex101.com/r/Vjkr0h/5
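A brief sketch of how this pattern behaves, assuming Python's `re` module. The second input string is invented for illustration and does not come from the question.

```python
import re

# Pattern from this answer: each operand is one run of non-whitespace.
single_token = re.compile(r"between\s([^\s]+)\sand\s([^\s]+)")

# On the question's sample, the first "between" cannot be followed by
# "<token> and", so the search moves on to the second one.
sample = "between foo whatever between bar and dummy as"
sample_groups = single_token.search(sample).groups()
print(sample_groups)  # ('bar', 'dummy')

# Invented input with spaces inside the operands: no match at all.
spaced = "between (a b) and (c d) as"
spaced_match = single_token.search(spaced)
print(spaced_match)  # None
```

So this approach appears to fit only when each operand is a single whitespace-free token. Operands that contain spaces need one of the other patterns in this thread.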
