I need a JS regex to match a string based only on a known first and last sub-string and number of spaces - and I don't care about the length or the nature of what is between the first and last sub-strings (other than the exact number of spaces).

The following is a possible start string (from which I get the first and last sub-strings and the number of spaces):

cat apple dog mouse

From this, I now know the string starts with cat, ends with mouse and contains exactly 3 spaces (they could be anywhere between the ends, but they will not be consecutive).

The string I need to match against might be:

catfish mouse mouse dormouse mouse mouse

or cat mouse mouse mouse mouse mouse

So, what I need to match would be, in the first case catfish mouse mouse dormouse, and in the second case cat mouse mouse mouse - in both cases a string starting with cat, ending with mouse and containing exactly 3 spaces. At the moment, all my attempts match the entire sample string above, not just from cat to the third mouse. Here is my latest failure:

cat(?:(?![\s]{4,}).*)mouse

I have a strong suspicion I'm overthinking this - but thanks for any suggestions.

2 Answers

You can write a regex without lookaheads to do this.

Example

\bcat(?:[^\s]*\s){3}[^\s]*mouse\b


What does it do?

  • \b Matches a word boundary. The leading \b keeps cat from matching inside a longer word such as bobcat, and the trailing \b keeps the match from ending inside a longer word such as mousexyz.
  • cat Matches the literal cat that begins the match.
  • (?:[^\s]*\s){3}
    • [^\s]* Matches zero or more non-whitespace characters (the same as \S*). So this matches a single word (possibly empty), and the following \s matches the whitespace character after it.
    • {3} Makes sure that the word-plus-whitespace group is repeated exactly 3 times.
  • [^\s]* Matches any non-whitespace characters after the third space.
  • mouse Matches the literal mouse that ends the match.
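
As a quick check, the pattern can be run against the two sample strings from the question. This is only a minimal sketch; the variable names (pattern, samples, results) are made up for illustration.

```javascript
// Pattern from the answer above, written as a JavaScript regex literal.
const pattern = /\bcat(?:[^\s]*\s){3}[^\s]*mouse\b/;

// The two sample strings from the question.
const samples = [
  "catfish mouse mouse dormouse mouse mouse",
  "cat mouse mouse mouse mouse mouse",
];

// String.prototype.match returns an array whose first element is the
// matched text, or null when nothing matches.
const results = samples.map((s) => {
  const m = s.match(pattern);
  return m ? m[0] : null;
});

console.log(results);
// [ 'catfish mouse mouse dormouse', 'cat mouse mouse mouse' ]
```

Because [^\s]* can never cross a whitespace character, the group can consume exactly three spaces and no more, which is what stops the match at the third mouse.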

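Why doesn't cat(?:(?![\s]{4,}).*)mouse work?

  • (?![\s]{4,}) This negative lookahead is evaluated only once, at the position right after cat. It checks that cat is not immediately followed by 4 or more consecutive whitespace characters, which is true for all the input strings. It does not count spaces in the rest of the string, so the greedy .* then runs to the last mouse and the whole string is matched.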
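
The behaviour described above can be reproduced with the second sample string from the question. A small sketch, with illustrative variable names that are not part of the original post:

```javascript
// The asker's original attempt.
const failingPattern = /cat(?:(?![\s]{4,}).*)mouse/;
const failingInput = "cat mouse mouse mouse mouse mouse";

// The lookahead passes once right after "cat" (only one space follows),
// then the greedy .* runs to the end and backtracks to the last "mouse".
const failingMatch = failingInput.match(failingPattern)[0];

console.log(failingMatch === failingInput); // true: the whole string is matched
```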
10 Comments

Thanks for the speedy response. The problem is that the first and last characters of the match might be spaces. So, it would need to match cat mouse mouse mouse as well.
@sideroxylon Why don't you trim the string?
Because the starting cat and mouse are part of the match, so the spaces are not at the ends.
Sorry, I misunderstood that part. Will update the answer. Do you want to include those spaces also in the match?
Yes, the entire string from cat -> mouse (with 3 spaces) needs to be matched and replaced. Thanks!

I'm not anywhere near as good at regex as nu11p01n73R is but I tried for fun:

/cat[^\s]*(\s[^\s]+){2}\s[^\s]*?mouse/

It is ugly, but it worked when I tested it.

looks for 'cat'

then runs through any non-whitespace

do twice {

then looks for a whitespace character

then looks for at least one non-whitespace character

}

then looks for one more whitespace character (the third)

then runs lazily through any non-whitespace

until it finds 'mouse'
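
Here is a rough sketch of that regex run against the two sample strings from the question, plus a replace, since the asker mentions wanting to replace the matched part. The variable names and the "X" replacement text are placeholders for illustration only.

```javascript
// Regex from this answer.
const altPattern = /cat[^\s]*(\s[^\s]+){2}\s[^\s]*?mouse/;

const altInputs = [
  "catfish mouse mouse dormouse mouse mouse",
  "cat mouse mouse mouse mouse mouse",
];

// Take the full match (index 0) for each input, or null if there is none.
const altMatches = altInputs.map((s) => {
  const m = s.match(altPattern);
  return m ? m[0] : null;
});

console.log(altMatches);
// [ 'catfish mouse mouse dormouse', 'cat mouse mouse mouse' ]

// Replacing the matched span; "X" is just a placeholder value.
const altReplaced = altInputs[1].replace(altPattern, "X");
console.log(altReplaced); // "X mouse mouse"
```

Note that [^\s]+ inside the group requires at least one character between the spaces, which fits the question's statement that the spaces are never consecutive.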
