Summary: in this tutorial, you will learn how to use lazy quantifiers to find the smallest match in an input string.
Introduction to lazy quantifiers
Lazy quantifiers, also known as non-greedy quantifiers, are a feature in regular expressions that modifies the behavior of quantifiers so that they match as little as possible.
By default, quantifiers in regular expressions are greedy, meaning they match as much as possible. Lazy quantifiers work in the opposite way: they consume as few characters as they can while still allowing the overall pattern to match.
Lazy quantifiers are denoted by appending a question mark (?) to the standard quantifiers.
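As a minimal sketch of the difference, the snippet below runs a greedy and a lazy version of the same pattern on one input. The sample string and the variable names are assumptions made for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample input, chosen only for illustration.
var text = "<b>bold</b>";

// Greedy: .+ consumes as much as possible, then backtracks to the last '>'.
var greedy = Regex.Match(text, "<.+>").Value;

// Lazy: .+? expands one character at a time and stops at the first '>'.
var lazy = Regex.Match(text, "<.+?>").Value;

Console.WriteLine(greedy); // <b>bold</b>
Console.WriteLine(lazy);   // <b>
```

Both patterns start matching at the same position. Only the point where the match ends differs.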
The following table shows the greedy quantifiers, lazy quantifiers, and the meanings of the lazy quantifiers:
| Greedy Quantifiers | Lazy Quantifiers | Lazy Quantifier Meaning |
|---|---|---|
| * | *? | Match zero or more occurrences (as few as possible) |
| + | +? | Match one or more occurrences (as few as possible) |
| ? | ?? | Match zero or one occurrence (preferably zero) |
| {n} | {n}? | Match exactly n occurrences (same as {n}) |
| {n,} | {n,}? | Match n or more occurrences (as few as possible) |
| {n,m} | {n,m}? | Match between n and m occurrences (as few as possible) |
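The rows of the table can be checked with a short sketch like the one below. The input string and the `first` helper are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: a run of five 'a' characters.
var input = "aaaaa";

// Illustrative helper: return the text of the first match of a pattern.
Func<string, string> first = p => Regex.Match(input, p).Value;

var greedyRange = first("a{2,4}");  // takes as many as allowed: four
var lazyRange = first("a{2,4}?");   // stops at the minimum: two
var greedyOpen = first("a{2,}");    // takes all five
var lazyOpen = first("a{2,}?");     // stops at the minimum: two
var exact = first("a{3}?");         // exactly three; laziness changes nothing
var lazyStar = first("a*?");        // zero occurrences: an empty match

Console.WriteLine($"{greedyRange} {lazyRange} {greedyOpen} {lazyOpen} {exact} [{lazyStar}]");
// aaaa aa aaaaa aa aaa []
```

Used on its own with nothing after it in the pattern, a lazy quantifier always settles for its minimum count. It takes more only when the rest of the pattern requires it.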
Lazy quantifiers example
The following example illustrates how to use a lazy quantifier to extract attribute values of an input tag:
using System.Text.RegularExpressions;
using static System.Console;
var html = """<input type="submit" values="Send">""";
var pattern = """
".+?"
""";
var matches = Regex.Matches(html, pattern);
foreach (var match in matches)
{
    WriteLine(match);
}
Output:
"submit"
"Send"
How it works.
- The program begins with the necessary `using` directive to include the `Regex` class from the `System.Text.RegularExpressions` namespace.
- The program then includes `using static System.Console;` to allow the usage of the `WriteLine` method without explicitly specifying the `Console` class.
- The HTML string is defined as `html` using a raw string literal (`"""`) that contains an HTML input element with the attributes `type="submit"` and `values="Send"`.
- The regular expression pattern is defined as `pattern` using a raw string literal with triple quotes (`"""`) to avoid escaping the `"` characters inside the regular expression. The pattern `".+?"` matches each attribute value of the input HTML tag, including the surrounding quotes (`"`). The `?` after `+` makes the quantifier lazy, so each match is as small as possible and ends at the nearest closing quote.
- The program uses the `Regex.Matches()` method to find all matches of the pattern in the HTML string `html`. It takes the `html` string and the `pattern` as arguments and returns a collection of `Match` objects representing the matches found.
- The program then iterates over each `Match` object in the `matches` collection using a `foreach` loop and uses the `WriteLine` method to print each match to the console.
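To see why the lazy quantifier matters in this example, the following sketch runs the same HTML string through both the lazy pattern and its greedy counterpart. It is an illustration, not part of the original program. The variable names are assumptions, and ordinary escaped strings replace the raw string literals so that the sketch also compiles with C# versions before 11.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Same HTML as in the example above, written with escaped quotes.
var html = "<input type=\"submit\" values=\"Send\">";

// Lazy, as in the tutorial: each match ends at the nearest closing quote.
var lazy = Regex.Matches(html, "\".+?\"")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

// Greedy, for contrast: one match runs from the first quote to the last.
var greedy = Regex.Matches(html, "\".+\"")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(" | ", lazy));   // "submit" | "Send"
Console.WriteLine(string.Join(" | ", greedy)); // "submit" values="Send"
```

The greedy pattern returns a single match that swallows everything between the first and the last quote. The lazy pattern returns the two attribute values separately.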
Summary
- A lazy quantifier in regular expressions matches as little as possible while still satisfying the pattern.