Summary: in this tutorial, you will learn about the sets and ranges in regular expressions.
Sets
The square brackets search for any character in a set. For example, [aeiou] matches any of the five characters: 'a', 'e', 'i', 'o' and 'u'. The [...] is called a set.
For example, the regular expression /[cbr]ats/g matches cats, bats, and rats:
let str = 'How cats, rats, and bats became Halloween animals';
let re = /[cbr]ats/g;
let results = str.match(re);
console.log(results);Code language: JavaScript (javascript)Output:
["cats", "rats", "bats"]Code language: JavaScript (javascript)Ranges
The square brackets can contain character ranges. For example, [a-z] is a character range from a to z. And [0-9] is a digit from 0 to 9.
The [a-zA-Z0-9_] is the same as \w. The [0-9] is the same as \d.
Excluding ranges
To negate a range, you use the excluding range like: [^...].
For example, [^0-9] matches any character except a digit. It is the same as \D.
Or, the [^aeiou] matches any character except 'a', 'e', 'i', 'o' and 'u'.
Escape special characters
Typically, you use a backslash to escape a special character e.g., \.. However, in square brackets, you donβt need to escape most of the special characters except they have a meaning for the square brackets.
For example, if the caret (^) is at the beginning of a string, you need to escape it:
[\^#$]Code language: JavaScript (javascript)If the caret is not at the beginning of a string (^), you do not need to escape:
[#^$]Code language: JavaScript (javascript)The flag u
If a set has surrogate pair, you need to add the flag u to the regular expression to make it work correctly:
let result = 'It is π'.match(/[ππ
π]/);
console.log(result);Code language: JavaScript (javascript)Output:
["οΏ½"]Code language: JavaScript (javascript)In this example, the [ππ π] has six characters, not three:
let str = 'ππ
π';
for(let i=0; i<str.length; i++) {
console.log(str.charCodeAt(i));
}Code language: JavaScript (javascript)Output:
55356
57166
55356
57157
55356
57171Code language: JavaScript (javascript)If you add the flag u, then the behavior will be correct:
let result = 'It is π'.match(/[ππ
π]/u);
console.log(result);Code language: JavaScript (javascript)Output:
["π"]Code language: JavaScript (javascript)Summary
- Use
[...]to construct a set to match any character in it. - Use the
-inside a set to construct a range to match any character in the range. - Use the
^to negate a range.