@@ -5203,10 +5203,37 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
  The quantifiers <literal>{1,1}</> and <literal>{1,1}?</>
  can be used to force greediness or non-greediness, respectively,
  on a subexpression or a whole RE.
+ This is useful when you need the whole RE to have a greediness attribute
+ different from what's deduced from its elements. As an example,
+ suppose that we are trying to separate a string containing some digits
+ into the digits and the parts before and after them. We might try to
+ do that like this:
+ <screen>
+ SELECT regexp_matches('abc01234xyz', '(.*)(\d+)(.*)');
+ <lineannotation>Result: </lineannotation><computeroutput>{abc0123,4,xyz}</computeroutput>
+ </screen>
+ That didn't work: the first <literal>.*</> is greedy so
+ it <quote>eats</> as much as it can, leaving the <literal>\d+</> to
+ match at the last possible place, the last digit. We might try to fix
+ that by making it non-greedy:
+ <screen>
+ SELECT regexp_matches('abc01234xyz', '(.*?)(\d+)(.*)');
+ <lineannotation>Result: </lineannotation><computeroutput>{abc,0,""}</computeroutput>
+ </screen>
+ That didn't work either, because now the RE as a whole is non-greedy
+ and so it ends the overall match as soon as possible. We can get what
+ we want by forcing the RE as a whole to be greedy:
+ <screen>
+ SELECT regexp_matches('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
+ <lineannotation>Result: </lineannotation><computeroutput>{abc,01234,xyz}</computeroutput>
+ </screen>
+ Controlling the RE's overall greediness separately from its components'
+ greediness allows great flexibility in handling variable-length patterns.
  </para>
 
  <para>
- Match lengths are measured in characters, not collating elements.
+ When deciding what is a longer or shorter match,
+ match lengths are measured in characters, not collating elements.
  An empty string is considered longer than no match at all.
  For example:
  <literal>bb*</>