Using the following regex

<w:p.*?\$\{test\}.*?\/w:p>

I'm trying to match the first

<w:p>

before the "${test}" and the first

</w:p>

after. The closing side works fine with the lazy ? quantifier, but the opening side refuses to stop at the first <w:p> before ${test}. Given this input:

<w:body><w:p w:rsidRDefault="00271ADB"/><w:p w:rsidR="00C15291"><w:pPr><w:p w:rsidR="0093632F" w:rsidRDefault="0093632F"><w:pPr><w:rPr></w:rPr></w:pPr><w:r><w:rPr></w:rPr><w:br/><w:t>${test}</w:t></w:r></w:p></w:body>

This is what I expected the result to be:

<w:p w:rsidR="0093632F" w:rsidRDefault="0093632F"><w:pPr><w:rPr></w:rPr></w:pPr><w:r><w:rPr></w:rPr><w:br/><w:t>${test}</w:t></w:r></w:p>

but instead this is what is being returned:

<w:p w:rsidRDefault="00271ADB"/><w:p w:rsidR="00C15291"><w:pPr><w:p w:rsidR="0093632F" w:rsidRDefault="0093632F"><w:pPr><w:rPr></w:rPr></w:pPr><w:r><w:rPr></w:rPr><w:br/><w:t>${test}</w:t></w:r></w:p>

This is the result in the editor: https://i.sstatic.net/4ri4C.png

And this is the result I'm expecting: https://i.sstatic.net/W87K9.png

1 Answer

You'll have to change the first .*? into a repeated group with a negative lookahead. Note that I also added a \s after <w:p so that <w:pPr doesn't get matched. If you have bare <w:p> tags (with no attributes), you may need to change this to <w:p(?:\s|>).

<w:p\s(?:(?!<w:p\s).)*?\$\{test\}.*?\/w:p>
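To see the difference between the two expressions, here is a minimal sketch in Python (an assumption; the question does not say which regex engine is in use). The input is the sample from the question, and the variable names are only illustrative.

```python
import re

# Sample input copied from the question.
doc = (
    '<w:body><w:p w:rsidRDefault="00271ADB"/><w:p w:rsidR="00C15291"><w:pPr>'
    '<w:p w:rsidR="0093632F" w:rsidRDefault="0093632F"><w:pPr><w:rPr></w:rPr></w:pPr>'
    '<w:r><w:rPr></w:rPr><w:br/><w:t>${test}</w:t></w:r></w:p></w:body>'
)

# The pattern from the question: the first .*? cannot skip earlier <w:p tags.
original = re.compile(r'<w:p.*?\$\{test\}.*?\/w:p>')

# The pattern from this answer: the tempered dot refuses to cross a new "<w:p ".
fixed = re.compile(r'<w:p\s(?:(?!<w:p\s).)*?\$\{test\}.*?\/w:p>')

original_match = original.search(doc).group(0)
fixed_match = fixed.search(doc).group(0)

print(original_match)  # starts at the very first <w:p in the input
print(fixed_match)     # starts at the <w:p closest to ${test}
```

The second print should show the paragraph the question expects, beginning with <w:p w:rsidR="0093632F".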

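A regex engine scans from left to right and reports the leftmost match, so laziness only affects where a match ends, never where it starts. There is no real way to say "lazy before". Instead of .*? I used (?:(?!<w:p\s).)*?. Let's break that down: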

(?:         (?# begin non-capturing group for grouping/repetition)
  (?!       (?# begin negative lookahead)
    <w:p\s  (?# no <w:p ahead)
  )         (?# end negative lookahead)
  .         (?# match any character)
)*?         (?# lazy repetition)
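The same breakdown can be kept next to the pattern itself. This is a sketch using Python's re.VERBOSE flag (Python is an assumption here), and the short sample string is made up for illustration rather than taken from the question.

```python
import re

# Full pattern from this answer, spread out with comments via re.VERBOSE.
tempered = re.compile(r'''
    <w:p\s            # opening paragraph tag (the \s keeps <w:pPr out)
    (?:               # begin non-capturing group for grouping/repetition
      (?!<w:p\s)      # negative lookahead: no new "<w:p " starts here
      .               # match any character
    )*?               # lazy repetition
    \$\{test\}        # the literal placeholder
    .*?\/w:p>         # lazily up to the first closing tag
''', re.VERBOSE)

# Hypothetical, shortened input with two opening <w:p tags.
sample = '<w:p a="1"><w:pPr/><w:p b="2"><w:t>${test}</w:t></w:p>'

result = tempered.search(sample).group(0)
print(result)
```

With this sample, the match should begin at the second opening tag, because the first attempt runs into <w:p b="2" before it reaches ${test}.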

Here is how it works: as soon as we match <w:p\s, we enter the repeated non-capturing group. At each position it makes a zero-length assertion that <w:p\s does not start there, and then matches one character. This repeats lazily until we hit ${test}. If the lookahead sees a <w:p\s first, the attempt fails, and a new attempt begins at that later <w:p\s (which starts its own series of lookaheads).
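The fail-and-restart behaviour can be observed by anchoring the pattern at each opening tag in turn. This is a rough sketch in Python (an assumption about the engine), using the sample input from the question. It imitates the engine's restarts by hand rather than exposing its real internals.

```python
import re

# Sample input copied from the question.
doc = (
    '<w:body><w:p w:rsidRDefault="00271ADB"/><w:p w:rsidR="00C15291"><w:pPr>'
    '<w:p w:rsidR="0093632F" w:rsidRDefault="0093632F"><w:pPr><w:rPr></w:rPr></w:pPr>'
    '<w:r><w:rPr></w:rPr><w:br/><w:t>${test}</w:t></w:r></w:p></w:body>'
)

fixed = re.compile(r'<w:p\s(?:(?!<w:p\s).)*?\$\{test\}.*?\/w:p>')

# Try the pattern anchored at every "<w:p " opener, left to right.
attempts = []
for opener in re.finditer(r'<w:p\s', doc):
    # Pattern.match(string, pos) only succeeds if a match starts exactly at pos.
    anchored = fixed.match(doc, opener.start())
    attempts.append((opener.start(), anchored is not None))
    print(opener.start(), 'matched' if anchored else 'failed')
```

The first two openers should fail, because another <w:p\s appears before ${test}, and only the third should match.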

2 Comments

Very clever! Thank you Sam for the explanation as well :)
Nice detailed answer delving into laziness. :) +1
