I am working with Firefox under Debian, and I don't understand this JavaScript behaviour:

var testRegex = /yolo .+ .+/gu;
let test = `yolo 2 abc
yolo 2 abc`;

test = test.split('\n');

for (let t=0; t < test.length; t++)
{
    console.log(test[t], testRegex.exec(test[t]));
}

And it returns:

Console result

Something even stranger:

for (let t=0; t < test.length; t++)
{
    console.log(test[t], testRegex.exec(test[t]), test[t].match(testRegex));
}

returns:

Console result

I don't think it is an encoding problem or a bug in my code.

What can I do?

  • A "global" regex is stateful. It starts its search from the position after the end of the previous match. You're searching two different strings, but the regex doesn't know that, so it still starts its second search several indices into the string. Commented Jun 19, 2019 at 9:02
  • In fact, getting rid of the g flag makes it work, but I don't understand why it would be the source of the problem. Commented Jun 19, 2019 at 9:04
  • The key here is that JavaScript RegExp objects are stateful when they have the global or sticky flags set, see developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/…. Commented Jun 19, 2019 at 9:04
  • @TheDelta: Add the following to your logging: testRegex.lastIndex. That shows the index it's using as the starting point. On the second search, its starting point will be well past the beginning of your expected match. Commented Jun 19, 2019 at 9:07
  • @ziggywiggy Oh.. Ok then :) I thought the search was done from the beginning of each new string searched. Commented Jun 19, 2019 at 9:09

1 Answer

This is actually expected behaviour, believe it or not. When a JavaScript regex has the g (global) or y (sticky) flag set, its exec() method is stateful and is intended to be called within a loop. Each subsequent call returns the next match within the string until no further matches are found, at which point it returns null.
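As a minimal sketch of that intended usage, the loop below walks through every match in a single string. The pattern and input text are hypothetical and chosen only for illustration; they are not taken from the question.

```javascript
// Hypothetical pattern and input, used only to illustrate the loop idiom.
const wordRegex = /yolo \d+/g;
const text = "yolo 1, yolo 22, yolo 333";

const found = [];
let m;
// Each call to exec() resumes at wordRegex.lastIndex. Once nothing is left
// to match, exec() returns null and sets lastIndex back to 0.
while ((m = wordRegex.exec(text)) !== null) {
  found.push({ match: m[0], index: m.index, next: wordRegex.lastIndex });
}

console.log(found);
console.log(wordRegex.lastIndex); // 0 after the final, failed call
```

The loop only terminates cleanly because the regex carries the g flag; without it, exec() would return the first match on every call.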

To highlight this in your first example, let's quickly simplify the code a bit and show what values are in each variable.

let testRegex = /yolo .+ .+/gu;
let test = [
  "yolo 2 abc",
  "yolo 2 abc"
]

This results in your calls to testRegex.exec looking something like the following:

testRegex.exec("yolo 2 abc") // => Array ["yolo 2 abc"]
testRegex.exec("yolo 2 abc") // => null

The official documentation for exec() states:

If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property (test() will also advance the lastIndex property). Note that the lastIndex property will not be reset when searching a different string, it will start its search at its existing lastIndex.
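To make the quoted behaviour visible, here is a small sketch that records lastIndex before and after each exec() call, as suggested in the comments on the question. The names statefulRegex, lines, and trace are illustrative only.

```javascript
// Same pattern as in the question; the variable names are illustrative.
const statefulRegex = /yolo .+ .+/gu;
const lines = ["yolo 2 abc", "yolo 2 abc"];

const trace = [];
for (const line of lines) {
  const before = statefulRegex.lastIndex;   // where this search will start
  const result = statefulRegex.exec(line);
  trace.push({
    before,
    matched: result !== null,
    after: statefulRegex.lastIndex,         // where the next search will start
  });
}

console.log(trace);
// First call:  starts at 0, matches, leaves lastIndex at 10.
// Second call: starts at 10 (the end of the new string), fails,
//              and resets lastIndex to 0.
```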

The reason your second example does not run into this issue is that match(), when called with a global regex, resets the regex's lastIndex property to 0 internally. Because match() runs after exec() within the same console.log call, the next call to exec() starts searching from the beginning of the string.
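A short sketch of that interaction, using illustrative variable names, captures lastIndex after exec() and again after match() on the same regex:

```javascript
// Same pattern as in the question; the variable names are illustrative.
const sharedRegex = /yolo .+ .+/gu;
const sample = "yolo 2 abc";

sharedRegex.exec(sample);
const afterExec = sharedRegex.lastIndex;    // 10: exec() advanced it

const allMatches = sample.match(sharedRegex);
const afterMatch = sharedRegex.lastIndex;   // 0: match() left it reset

console.log(afterExec, allMatches, afterMatch);
```

Relying on this side effect is fragile, since it only works as long as match() keeps running between the exec() calls.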

Coming back to your original example, you could modify it as follows and you would see the behaviour you're expecting:

var testRegex = /yolo .+ .+/gu;
let test = `yolo 2 abc
yolo 2 abc`;

test = test.split('\n');

for (let t=0; t < test.length; t++)
{
    console.log(test[t], testRegex.exec(test[t]));
    testRegex.lastIndex = 0;
}
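An alternative, mentioned in the comments on the question, is to drop the g flag when only one match per string is needed. The sketch below assumes exactly that; plainRegex, rows, and results are illustrative names.

```javascript
// Without the g (or y) flag, exec() ignores lastIndex and always
// searches from the start of the string it is given.
const plainRegex = /yolo .+ .+/u;
const rows = "yolo 2 abc\nyolo 2 abc".split("\n");

const results = rows.map((row) => plainRegex.exec(row));

for (const [i, result] of results.entries()) {
  console.log(rows[i], result);
}
console.log(plainRegex.lastIndex); // stays 0
```

This avoids the manual reset, at the cost of not being able to iterate over several matches within one string.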