
I wrote a simple regex to capture a certain group in a string:

/[a-z]+([0-9]+)[a-z]+/gi    (n letters, m digits, k letters).

Code:

var myString = 'aaa111bbb222ccc333ddd';
var myRegexp = /[a-z]+([0-9]+)[a-z]+/gi;

var match = myRegexp.exec(myString);
console.log(match);

while (match != null) {
  match = myRegexp.exec(myString);
  console.log(match);
}

The results were:

["aaa111bbb", "111"]
["ccc333ddd", "333"]
null

But wait a minute, why didn't it try the bbb222ccc part?

I mean, it saw aaa111bbb, but then it should have tried bbb222ccc... (it is supposed to be greedy!)

What am I missing?

Also, looking at

   while (match != null)
    {
      match = myRegexp.exec(myString);
      console.log(match)
    }

how did it progress to the second result? At first there was:

var match=myRegexp.exec(myString);

and later (in the while loop):

match=myRegexp.exec(myString);
match=myRegexp.exec(myString);

It is the same line each time... so where does it remember that the first result was already returned?

  • Because the index after the first match is at the end of that match, the bbb has already been consumed, and the only thing left to match in the rest of the string is "ccc333ddd". Greedy means that + will attempt to match as much as possible without taking into account that the next part of the regex could match it. Commented Nov 30, 2012 at 14:44
  • Hi @Esailija, yes, I already understood that. But if it is as greedy as it is said to be, then it is not behaving that way. Commented Nov 30, 2012 at 14:45
  • @Esailija please paste your comment as an answer. Commented Nov 30, 2012 at 14:45
  • @Esailija can you have a look at my edit, please? Commented Nov 30, 2012 at 14:50
  • @RoyiNamir It is greedy, but it won't go back and re-check a part of the string that it has already matched. Commented Nov 30, 2012 at 14:52

2 Answers

.exec is stateful when you use the g flag. The state is kept in the regex object's .lastIndex property.

var myString = 'aaa111bbb222ccc333ddd';
var myRegexp = /[a-z]+([0-9]+)[a-z]+/gi;
var match = myRegexp.exec(myString);
console.log(myRegexp.lastIndex); // 9, so the next `.exec` starts searching at index 9
while (match != null) {
    match = myRegexp.exec(myString);
    console.log(myRegexp.lastIndex);
}

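The state can be reset by setting .lastIndex to 0, or by an .exec call that fails to match, because a failed match sets .lastIndex back to 0. re.exec("") for instance will reset the state kept from 'aaa111bbb222ccc333ddd', since nothing can match at that position in an empty string. Note that the state is not tied to a particular string: calling .exec on a different string simply continues from the stored .lastIndex.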
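
As a rough illustration of how .lastIndex behaves, here is a minimal sketch. The variable names (re, text, other and so on) are made up for this example; only .exec and .lastIndex are the actual RegExp API.

```javascript
// Hypothetical names for illustration; exec() and lastIndex are the real API.
const re = /[a-z]+([0-9]+)[a-z]+/gi;
const text = 'aaa111bbb222ccc333ddd';

re.exec(text);
const afterFirstMatch = re.lastIndex;      // end of "aaa111bbb"

// A different string does NOT reset the state: the search starts at index 9.
const other = 'xxx1yyy zzz2www';
const matchInOther = re.exec(other)[0];    // skips "xxx1yyy" entirely

// A failed match resets lastIndex to 0.
re.exec('');
const afterFailedMatch = re.lastIndex;

// Manual reset: the next exec starts from the beginning again.
re.exec(text);
re.lastIndex = 0;
const firstAgain = re.exec(text)[0];

console.log(afterFirstMatch, matchInOther, afterFailedMatch, firstAgain);
```

Resetting .lastIndex by hand is usually the least surprising option when the same regex object is reused on several strings.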

The same applies to the .test method, so never use the g flag with a regex that is used for .test if you prefer no surprises. See https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/RegExp/exec
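
To make the .test surprise concrete, here is a small sketch comparing a regex with the g flag against one without it. The names globalRe, plainRe and input are invented for the example.

```javascript
// Hypothetical names; the point is the alternating result with the g flag.
const globalRe = /[0-9]+/g;
const plainRe = /[0-9]+/;
const input = 'abc123';

// With g, each call resumes from lastIndex, so the second call fails
// (and resets lastIndex), and the third call succeeds again.
const globalResults = [
  globalRe.test(input),
  globalRe.test(input),
  globalRe.test(input),
];

// Without g, lastIndex is never advanced, so every call starts at index 0.
const plainResults = [
  plainRe.test(input),
  plainRe.test(input),
  plainRe.test(input),
];
const plainLastIndex = plainRe.lastIndex;

console.log(globalResults, plainResults, plainLastIndex);
```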


4 Comments

Does it mean that if I weren't keeping the regex in a variable, and had always used a raw /[a-z]+([0-9]+)[a-z]+/, it wouldn't have remembered the index?
@RoyiNamir Yes, when you create a new regexp object, it doesn't have any state yet. In other words, /[a-z]+([0-9]+)[a-z]+/gi.lastIndex === 0 always.
I didn't understand the test part. Why shouldn't I use test with [g]? Because it yields more than one result?
@RoyiNamir Because it's the g flag that makes .exec and .test work statefully. If you remove the g flag, you will notice that lastIndex is still 0 after a match with .exec.

You can also update the lastIndex property manually:

var myString = 'aaa111bbb222ccc333ddd';
var myRegexp = /[a-z]+([0-9]+)[a-z]+/gi;

var match = myRegexp.exec(myString);
console.log(match);

while (match != null) {
  // Move the cursor to the position just after the beginning of the previous match
  // (equivalent to match.index + 1), so that overlapping matches are found too.
  myRegexp.lastIndex -= match[0].length - 1;
  match = myRegexp.exec(myString);
  console.log(match);
}

See the MDN documentation for exec.


EDIT:

By the way, your regex should be: /[a-z]{3}([0-9]{3})[a-z]{3}/gi
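
As a sketch of how the two suggestions in this answer might be combined, the fixed-width regex can be stepped forward by one character after each match, which lets overlapping groups such as bbb222ccc show up. The names fixedWidth, source and overlapping are made up for this example, and it assumes the letter and digit runs are always exactly three characters long.

```javascript
// Hypothetical names; assumes every run is exactly 3 characters long.
const fixedWidth = /[a-z]{3}([0-9]{3})[a-z]{3}/gi;
const source = 'aaa111bbb222ccc333ddd';

const overlapping = [];
let m;
while ((m = fixedWidth.exec(source)) !== null) {
  overlapping.push([m[0], m[1]]);
  // Restart one character after the START of this match, not after its end,
  // so the trailing letters can serve as the leading letters of the next match.
  fixedWidth.lastIndex = m.index + 1;
}

console.log(overlapping);
// [["aaa111bbb", "111"], ["bbb222ccc", "222"], ["ccc333ddd", "333"]]
```

The loop terminates because .exec returns null once no further match exists, which also resets lastIndex to 0.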
