I am seeing weird behaviour when running the same regexp match several times:

var r = /(.*)/g
var d = "a"

console.log(r.exec(d))
console.log(r.exec(d))

This produces:

["a", "a"]
["", ""]

Why is it not matching anything the second time around?

  • What do you get if you try it a 3rd time? Commented Dec 13, 2012 at 21:40
  • @Beetroot-Beetroot it will keep matching the empty string at the end, never reaching the end of the cycle. Normally there aren't infinitely many matches: once the regex no longer matches anything, the return value is null, after which the regex starts matching at the beginning again. Commented Dec 13, 2012 at 21:43
  • @Esailija, You clearly have an enviable insight into regexps and I know you are right because I just ran a test. However, I can't quite see why there's no automatic reset of lastIndex in this case. Commented Dec 13, 2012 at 22:18
  • @Beetroot-Beetroot see my answer. Because .* can always match. It will match an empty string, hence lastIndex will not advance and the next search will start from that same position again. But .* can always match an empty string again. You will only ever get null (after which the index is reset) if the pattern fails to match. But .* can by definition never fail (on anything). Commented Dec 13, 2012 at 22:26
  • Thanks, @m.buettner I understand better now. Your insight is also enviable. Commented Dec 13, 2012 at 22:33

2 Answers

That is what the g flag does. When you use it, exec continues the next search from the end of the previous match. But after your first match (a) there is nothing left in the string, so you get an empty match. An exec loop normally terminates when exec finds no further match and returns null. If you know that there is only one match, remove the g (it means "global" search).

Note that you can (and should) get rid of those parentheses. They just cost you performance. Without them you will only get one a in the resulting array.

If you do want to consider multiple matches, but disregard that last empty match, use the loop technique:

var match;
while ((match = r.exec(d)) !== null) {
    // process match[0] here
}

Note that you only need this loop if you actually have (meaningful) capturing groups. If not (if you only want to get full matches), you can use match instead as elclanrs points out:

var matches = d.match(r);
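To illustrate the difference between the two approaches, here is a minimal sketch. The pattern and input string (pairRe, text) are hypothetical sample data, not taken from the question; they are chosen so that the capturing groups carry something useful.

```javascript
// Hypothetical sample data, not from the question.
const pairRe = /(\w)=(\d)/g;
const text = "a=1 b=2";

// With the g flag, match returns only the full matches, no groups.
const fullMatches = text.match(pairRe);

// An exec loop also exposes the capturing groups of every match.
const groups = [];
let m;
while ((m = pairRe.exec(text)) !== null) {
    groups.push([m[1], m[2]]);
}

console.log(fullMatches); // ["a=1", "b=2"]
console.log(groups);      // [["a", "1"], ["b", "2"]]
```

This pattern cannot match the empty string, so exec eventually returns null and the loop ends.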

EDIT:

I just realised that most of what I said is only partially true and is not the actual cause of ["", ""]. What really happens is this: the first time, .* matches a. The second time, the engine continues the search after the previous match (after a). Since your pattern is .* (which means 0 or more characters), it now matches the empty string at that position, and it will keep doing so, because an empty match does not advance the position for the next search. With .match you get ["a", ""] instead: match steps past an empty match by itself, so it terminates.
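The sequence can be traced through the regex's lastIndex property. This is a small sketch using the question's own r and d; the other variable names are only for illustration.

```javascript
const r = /(.*)/g;
const d = "a";

const first = r.exec(d);         // matches "a" at index 0
const afterFirst = r.lastIndex;  // 1, the end of the string

const second = r.exec(d);        // empty match at index 1
const afterSecond = r.lastIndex; // still 1: exec does not step past an empty match

const third = r.exec(d);         // the same empty match again, never null

const viaMatch = d.match(r);     // match advances past empty matches on its own

console.log(first[0], afterFirst);   // "a" 1
console.log(second[0], afterSecond); // "" 1
console.log(viaMatch);               // ["a", ""]
```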

In fact, if you use that regex with the loop technique you will get an endless loop (because match = ["", ""] is a non-null array, so the loop never terminates). In any case, you should realise that your pattern is nonsensical due to the *: it can match anything (including nothing). At least use .+, whatever the purpose.
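If a pattern that can match the empty string has to be used in an exec loop anyway, one common workaround is to bump lastIndex by hand whenever a zero-length match comes back. The sketch below shows that guard next to the simpler .+ fix; the names found, plus and foundPlus are hypothetical.

```javascript
const r = /(.*)/g;
const d = "a";

const found = [];
let match;
while ((match = r.exec(d)) !== null) {
    if (match[0] === "") {
        // Zero-length match: step forward manually, otherwise exec
        // would return the same empty match forever.
        r.lastIndex++;
        continue; // skip the empty match
    }
    found.push(match[0]);
}

// With .+ the pattern cannot match the empty string, so no guard is needed.
const plus = /(.+)/g;
const foundPlus = [];
while ((match = plus.exec(d)) !== null) {
    foundPlus.push(match[0]);
}

console.log(found);     // ["a"]
console.log(foundPlus); // ["a"]
```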

9 Comments

If you have a global flag, it is usually more convenient to use match than to write a loop that collects all the exec matches, or you can use replace to avoid the ugly while loop.
@elclanrs global flag used with .match doesn't capture groups, but full matches only. Loop and exec is the only way to have it all.
Yeah, that's true about match but replace will work fine, and it's shorter and more convenient than the while loop which is a bit confusing. jsbin.com/owubaf/1/edit
@elclanrs do you really find that less confusing? I actually think it's quite hacky.
@elclanrs yes, but that could be considered overly clever by some
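The replace approach discussed in these comments presumably looks something like the sketch below. The contents of the linked jsbin are not reproduced here, so this is an assumption about the technique, with hypothetical sample data (pairRe, text) rather than the question's pattern.

```javascript
// Hypothetical sample data, not from the question.
const pairRe = /(\w)=(\d)/g;
const text = "a=1 b=2";

const collected = [];
// replace calls the function once per match and passes the full match
// followed by each capturing group. The string replace returns is ignored.
text.replace(pairRe, function (whole, key, digit) {
    collected.push({ whole: whole, key: key, digit: digit });
    return whole; // leave the text unchanged
});

console.log(collected.length); // 2
console.log(collected[1].key); // "b"
```

Using replace purely for its side effect is what the later comments call hacky: it works, but the intent is less obvious than with an explicit exec loop.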

exec returns the first match and moves the internal pointer forward. In this way, a second call will return the second match.

Since there is only one match, the second call returns an empty result.

1 Comment

the important thing here is g though. without it, exec would not advance that internal pointer
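A quick way to check this is to run the question's pattern without the g flag; the variable names below are only for illustration.

```javascript
const once = /(.*)/; // same pattern as in the question, but without g
const d = "a";

const a = once.exec(d);
const b = once.exec(d);

console.log(a[0], b[0]);     // "a" "a": every call starts from the beginning
console.log(once.lastIndex); // 0: a non-global regex does not move lastIndex
```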
