I'd like a regex that matches "thank-you" but not when the string also contains the word "removals". So contact-thank-you should return a positive, but removals/contact-thank-you should return a negative.

I don't know much about regex and found a couple of posts referring to negative lookaheads. The best I could come up with was

(?!(?:removals)).*thank-you

which is clearly rubbish. Could anyone help?

Thanks

  • Try ^(?!.*removals).*thank-you.*. Whether you need the whole string to match depends on your environment. Please show your code so we can help you better and quicker. Are the strings multiline, BTW? In most regex flavors, . does not match line break characters. Commented Jun 29, 2017 at 12:30
  • Note that you may simply use string.contains-like methods. Commented Jun 29, 2017 at 12:35
  • What language/tools are you using? For example, if this is in grep, then you can do: grep 'thank-you' <file> | grep -v removals Commented Jun 29, 2017 at 13:49
  • Thanks Wiktor and Tom Commented Jul 11, 2017 at 1:11

1 Answer

What you ideally want in this case is a negative lookbehind, since you're looking "behind" (to the left of) the word you're matching to make sure something isn't there.

A complication here is that many regex engines don't permit variable-width negative-lookbehinds.
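
As one concrete illustration of that limitation, here is a minimal sketch using Python's built-in re module (Python is my choice for the examples here; the question does not say which engine is in use). The helper name compiles is hypothetical. It only checks whether the engine accepts a pattern at all.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Fixed-width lookbehind ("removals/" is always 9 characters): accepted.
fixed_ok = compiles(r"(?<!removals/)thank-you")

# Variable-width lookbehind (".*" can be any length): rejected by Python's re.
variable_ok = compiles(r"(?<!removals.*)thank-you")

print(fixed_ok, variable_ok)
```

Other engines draw the line differently, so treat this as a demonstration of one engine rather than a general rule.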

But if you can anchor to the start of the string you want to match somehow, then you can use lookahead from that anchor, instead.

(?:\s|^)((?!removals)\S)+thank-you(?:\s|$)

bananas/fred-thank-you - MATCH.
bananas/fred-no-thank-you - MATCH.
bananas/thank-you-with-words-after - no match.
removals/fred-thank-you - no match.
non-removals/fred-thank-you - no match.
bananas/removals-thank-you - no match.
bananas/thank-you-supremovalsale - no match.
bananas/fred-sorry - no match.

I am presuming that the characters permitted in the string are "anything but whitespace".

So it starts by looking for either the beginning of the string or some whitespace; then one or more non-whitespace (\S) characters, none of which is the start of the string "removals"; then the string "thank-you", which must be followed by whitespace or the end of the string.
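
To check the pattern against the sample strings, here is a small Python sketch. Python's re module and the names PATTERN, SAMPLES, and is_match are my assumptions for illustration; the pattern itself is the one above.

```python
import re

# Anchor (whitespace or start of string), then one or more non-whitespace
# characters none of which begins "removals", then "thank-you" followed by
# whitespace or end of string.
PATTERN = re.compile(r"(?:\s|^)((?!removals)\S)+thank-you(?:\s|$)")

SAMPLES = [
    "bananas/fred-thank-you",
    "bananas/fred-no-thank-you",
    "bananas/thank-you-with-words-after",
    "removals/fred-thank-you",
    "non-removals/fred-thank-you",
    "bananas/removals-thank-you",
    "bananas/thank-you-supremovalsale",
    "bananas/fred-sorry",
]


def is_match(text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return PATTERN.search(text) is not None


results = {sample: is_match(sample) for sample in SAMPLES}

for sample, matched in results.items():
    print(f"{sample} - {'MATCH' if matched else 'no match'}")
```

Because the pattern uses + rather than *, at least one character must precede "thank-you", so a bare "thank-you" on its own would not match.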


But I suspect what you're actually looking for is something a little different, maybe something like:

^(?!removals\/)[-\w]+\/[-\w]*thank-you$

bananas/fred-thank-you - MATCH.
bananas/fred-no-thank-you - MATCH.
bananas/thank-you-with-words-after - no match.
removals/fred-thank-you - no match.
non-removals/fred-thank-you - MATCH.
bananas/removals-thank-you - MATCH.
bananas/thank-you-supremovalsale - no match.
bananas/fred-sorry - no match.

This assumes that the structure is very fixed: to include anything that ends "/blah-blah-thank-you", unless the first word is exactly "removals/". Without knowing the exact specification, though, the first seems the most likely to be helpful.
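
A minimal Python sketch of this fixed-structure idea follows. It assumes Python's re module, and the names LOOSE, STRICT, and SAMPLES are hypothetical. The first path segment is written as [-\w]+ so that a hyphenated first segment such as non-removals can match; with a plain \w+ there, that string would not match, because \w does not include the hyphen. The sketch compares both variants.

```python
import re

# First segment may contain hyphens, so "non-removals/" is accepted.
LOOSE = re.compile(r"^(?!removals/)[-\w]+/[-\w]*thank-you$")

# First segment limited to \w (letters, digits, underscore): no hyphens.
STRICT = re.compile(r"^(?!removals/)\w+/[-\w]*thank-you$")

SAMPLES = [
    "bananas/fred-thank-you",
    "bananas/fred-no-thank-you",
    "bananas/thank-you-with-words-after",
    "removals/fred-thank-you",
    "non-removals/fred-thank-you",
    "bananas/removals-thank-you",
    "bananas/thank-you-supremovalsale",
    "bananas/fred-sorry",
]

loose = {s: LOOSE.search(s) is not None for s in SAMPLES}
strict = {s: STRICT.search(s) is not None for s in SAMPLES}

for s in SAMPLES:
    print(f"{s} - loose: {loose[s]}, strict: {strict[s]}")
```

The two variants differ only on non-removals/fred-thank-you in this sample set. The / needs no backslash in Python; the escaped \/ form matters mainly where / delimits the regex, as in JavaScript literals.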


If you're not trying to extract this string from many others, but are just checking a URL to see if it matches this pattern, then you can simplify it a lot:

^(?!.*removals).*thank-you

bananas/fred-thank-you - MATCH.
bananas/fred-no-thank-you - MATCH.
bananas/thank-you-with-words-after - MATCH.
removals/fred-thank-you - no match.
non-removals/fred-thank-you - no match.
bananas/removals-thank-you - no match.
bananas/thank-you-supremovalsale - no match.
bananas/fred-sorry - no match.

This just matches any string that has "thank-you", and not "removals".
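
For a simple yes/no check like this, here is a short Python sketch that runs the pattern and compares it with the plain substring test suggested in the comments. Python and the names regex_check, plain_check, and SAMPLES are assumptions for illustration.

```python
import re

PATTERN = re.compile(r"^(?!.*removals).*thank-you")

SAMPLES = [
    "bananas/fred-thank-you",
    "bananas/fred-no-thank-you",
    "bananas/thank-you-with-words-after",
    "removals/fred-thank-you",
    "non-removals/fred-thank-you",
    "bananas/removals-thank-you",
    "bananas/thank-you-supremovalsale",
    "bananas/fred-sorry",
]


def regex_check(text: str) -> bool:
    """Lookahead version: 'thank-you' present, 'removals' absent."""
    return PATTERN.search(text) is not None


def plain_check(text: str) -> bool:
    """Same test without a regex, using substring checks."""
    return "thank-you" in text and "removals" not in text


regex_results = {s: regex_check(s) for s in SAMPLES}
plain_results = {s: plain_check(s) for s in SAMPLES}

for s in SAMPLES:
    print(f"{s} - {'MATCH' if regex_results[s] else 'no match'}")
```

The two checks agree on these single-line samples. They can disagree on strings containing line breaks, since . does not match a newline by default.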

3 Comments

Thanks Dewi, that worked perfectly. The last case was the one I was looking for, but it's great to have had the full explanation above.
...did I really answer this question twice, without noticing? On the same day? Within 30 minutes of each other? I'll just quietly facepalm.
Ha! I thought you were just particularly helpful. Thanks again
