Using JavaScript's RegExp exec method

The exec method on the RegExp object in JavaScript can be confusing if you're not used to it. Let's say we have the following simple regex:

```javascript
var regex = /the/i;
```

So we're looking for the text "the" with the case-insensitive flag enabled, so it will match "the", "THE", "The", etc. Here's some sample subject text:

```javascript
var subject = 'The quick brown fox jumps over the lazy dog';
```

If we call exec once with the regex "as is", the code looks like the following:

```javascript
var match = regex.exec(subject);
if (match) {
    alert(match[0]);
}
```

This will alert "The", but what about the other instance of "the" in the sentence? How come that wasn't matched? Oh! We forgot to set the global flag on the regex object. Now it should look like this:

```javascript
var regex = /the/ig;
```
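To see what the missing flag was doing, here's a small sketch that calls exec twice on each version of the regex. It logs with console.log instead of alert so it can run outside a browser, and the variable names (withoutGlobal, withGlobal and friends) are made up for the comparison.

```javascript
var subject = 'The quick brown fox jumps over the lazy dog';

// Without the g flag, every exec() call searches from the start of the string.
var withoutGlobal = /the/i;
var first = withoutGlobal.exec(subject);
var again = withoutGlobal.exec(subject);
console.log(first[0], first.index); // The 0
console.log(again[0], again.index); // The 0 (the same match again)

// With the g flag, each call picks up after the previous match.
var withGlobal = /the/ig;
var one = withGlobal.exec(subject);
var two = withGlobal.exec(subject);
console.log(one[0], one.index); // The 0
console.log(two[0], two.index); // the 31
```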

And the matching code now looks like:

```javascript
var match = null;
while ((match = regex.exec(subject))) {
    alert(match[0]);
}
```

Now we get both instances of "the" in the sentence. So what exactly is going on here? Let's slightly change that last block of code to this:

```javascript
var match = null;
while ((match = regex.exec(subject))) {
    alert(match[0] + ' ' + regex.lastIndex);
}
```

So here, we see the regex object has a property called lastIndex that is updated after every call to "exec()". If you run the code, you'll get "The 3" and "the 34". Once "exec()" finds a match, it sets the lastIndex property of the regex to the index of the character right after the matched text. The next time through the loop, "exec()" reads lastIndex and starts searching from that position. When there is nothing left to find, "exec()" returns null and resets lastIndex to zero, which is what ends the loop.
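The same bookkeeping can be watched one call at a time, without the loop. This is only a minimal sketch: the step helper is a made-up name for illustration, and it logs with console.log rather than alert so it can run outside a browser.

```javascript
var regex = /the/ig;
var subject = 'The quick brown fox jumps over the lazy dog';

// Hypothetical helper: run exec() once and report what it found,
// together with the regex's lastIndex after the call.
function step() {
    var match = regex.exec(subject);
    return {
        text: match ? match[0] : null,
        index: match ? match.index : -1,
        lastIndex: regex.lastIndex
    };
}

var first = step();  // { text: 'The', index: 0,  lastIndex: 3 }
var second = step(); // { text: 'the', index: 31, lastIndex: 34 }
var third = step();  // { text: null,  index: -1, lastIndex: 0 }
console.log(first, second, third);

// lastIndex is writable, so the starting point can also be moved by hand.
regex.lastIndex = 4;
var skipped = step(); // skips "The": { text: 'the', index: 31, lastIndex: 34 }
console.log(skipped);
```

Setting lastIndex by hand like this is rarely needed, but it shows that the property is the only state the loop depends on.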

I'll end this with a couple of warnings:

  • Don't create the regex inside the loop, for example by writing the literal directly in the while condition. Each pass would get a fresh regex whose lastIndex is zero, so any text containing a match causes an infinite loop.
  • Modifying the subject string during the exec loop is dangerous and can also lead to an infinite loop, since lastIndex still refers to a position in the text as it was before the change.

Hope this helps some people!
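P.S. For the curious, here's a rough sketch of the first warning in action. The countMatches helper and its limit safety cap are invented for this example (the cap is there only so the broken version stops instead of hanging), and it counts matches rather than alerting them.

```javascript
var subject = 'The quick brown fox jumps over the lazy dog';

// Hypothetical helper: calls makeRegex() before every exec() and counts
// the matches, giving up after `limit` iterations so a runaway loop ends.
function countMatches(makeRegex, limit) {
    var count = 0;
    while (count < limit && makeRegex().exec(subject)) {
        count++;
    }
    return count;
}

// Broken: a fresh regex on every pass, so lastIndex is 0 each time
// and exec() keeps finding the first "The". Only the cap stops it.
var broken = countMatches(function () { return /the/ig; }, 1000);

// Fine: one regex created outside the loop and reused on every pass.
var shared = /the/ig;
var fine = countMatches(function () { return shared; }, 1000);

console.log(broken); // 1000
console.log(fine);   // 2
```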