Diffstat (limited to 'man7/regex.7')
-rw-r--r-- man7/regex.7 70
1 files changed, 35 insertions, 35 deletions
diff --git a/man7/regex.7 b/man7/regex.7
index 4c130954e1..f313f7e024 100644
--- a/man7/regex.7
+++ b/man7/regex.7
@@ -54,7 +54,7 @@ POSIX.2 leaves some aspects of RE syntax and semantics open;
may not be fully portable to other POSIX.2 implementations.
.PP
A (modern) RE is one\*(dg or more nonempty\*(dg \fIbranches\fR,
-separated by \(aq|\(aq.
+separated by \[aq]|\[aq].
It matches anything that matches one of the branches.
.PP
A branch is one\*(dg or more \fIpieces\fR, concatenated.
@@ -62,18 +62,18 @@ It matches a match for the first, followed by a match for the second,
and so on.
.PP
A piece is an \fIatom\fR possibly followed
-by a single\*(dg \(aq*\(aq, \(aq+\(aq, \(aq?\(aq, or \fIbound\fR.
-An atom followed by \(aq*\(aq
+by a single\*(dg \[aq]*\[aq], \[aq]+\[aq], \[aq]?\[aq], or \fIbound\fR.
+An atom followed by \[aq]*\[aq]
matches a sequence of 0 or more matches of the atom.
-An atom followed by \(aq+\(aq
+An atom followed by \[aq]+\[aq]
matches a sequence of 1 or more matches of the atom.
-An atom followed by \(aq?\(aq
+An atom followed by \[aq]?\[aq]
matches a sequence of 0 or 1 matches of the atom.
.PP
-A \fIbound\fR is \(aq{\(aq followed by an unsigned decimal integer,
-possibly followed by \(aq,\(aq
+A \fIbound\fR is \[aq]{\[aq] followed by an unsigned decimal integer,
+possibly followed by \[aq],\[aq]
possibly followed by another unsigned decimal integer,
-always followed by \(aq}\(aq.
+always followed by \[aq]}\[aq].
The integers must lie between 0 and
.B RE_DUP_MAX
(255\*(dg) inclusive,
@@ -91,26 +91,26 @@ a sequence of \fIi\fR through \fIj\fR (inclusive) matches of the atom.
An atom is a regular expression enclosed in "\fI()\fP"
(matching a match for the regular expression),
an empty set of "\fI()\fP" (matching the null string)\*(dg,
-a \fIbracket expression\fR (see below), \(aq.\(aq
-(matching any single character), \(aq\(ha\(aq (matching the null string at the
-beginning of a line), \(aq$\(aq (matching the null string at the
-end of a line), a \(aq\e\(aq followed by one of the characters
+a \fIbracket expression\fR (see below), \[aq].\[aq]
+(matching any single character), \[aq]\(ha\[aq] (matching the null string at the
+beginning of a line), \[aq]$\[aq] (matching the null string at the
+end of a line), a \[aq]\e\[aq] followed by one of the characters
"\fI\(ha.[$()|*+?{\e\fP"
(matching that character taken as an ordinary character),
-a \(aq\e\(aq followed by any other character\*(dg
+a \[aq]\e\[aq] followed by any other character\*(dg
(matching that character taken as an ordinary character,
-as if the \(aq\e\(aq had not been present\*(dg),
+as if the \[aq]\e\[aq] had not been present\*(dg),
or a single character with no other significance (matching that character).
-A \(aq{\(aq followed by a character other than a digit is an ordinary
+A \[aq]{\[aq] followed by a character other than a digit is an ordinary
character, not the beginning of a bound\*(dg.
-It is illegal to end an RE with \(aq\e\(aq.
+It is illegal to end an RE with \[aq]\e\[aq].
.PP
A \fIbracket expression\fR is a list of characters enclosed in "\fI[]\fP".
It normally matches any single character from the list (but see below).
-If the list begins with \(aq\(ha\(aq,
+If the list begins with \[aq]\(ha\[aq],
it matches any single character
(but see below) \fInot\fR from the rest of the list.
-If two characters in the list are separated by \(aq\-\(aq, this is shorthand
+If two characters in the list are separated by \[aq]\-\[aq], this is shorthand
for the full \fIrange\fR of characters between those two (inclusive) in the
collating sequence,
for example, "\fI[0\-9]\fP" in ASCII matches any decimal digit.
@@ -119,15 +119,15 @@ endpoint, for example, "\fIa\-c\-e\fP".
Ranges are very collating-sequence-dependent,
and portable programs should avoid relying on them.
.PP
-To include a literal \(aq]\(aq in the list, make it the first character
-(following a possible \(aq\(ha\(aq).
-To include a literal \(aq\-\(aq, make it the first or last character,
+To include a literal \[aq]]\[aq] in the list, make it the first character
+(following a possible \[aq]\(ha\[aq]).
+To include a literal \[aq]\-\[aq], make it the first or last character,
or the second endpoint of a range.
-To use a literal \(aq\-\(aq as the first endpoint of a range,
+To use a literal \[aq]\-\[aq] as the first endpoint of a range,
enclose it in "\fI[.\fP" and "\fI.]\fP"
to make it a collating element (see below).
-With the exception of these and some combinations using \(aq[\(aq (see next
-paragraphs), all other special characters, including \(aq\e\(aq, lose their
+With the exception of these and some combinations using \[aq][\[aq] (see next
+paragraphs), all other special characters, including \[aq]\e\[aq], lose their
special significance within a bracket expression.
.PP
Within a bracket expression, a collating element (a character,
@@ -224,7 +224,7 @@ alphabet.
When an alphabetic that exists in multiple cases appears as an
ordinary character outside a bracket expression, it is effectively
transformed into a bracket expression containing both cases,
-for example, \(aqx\(aq becomes "\fI[xX]\fP".
+for example, \[aq]x\[aq] becomes "\fI[xX]\fP".
When it appears inside a bracket expression, all case counterparts
of it are added to the bracket expression, so that, for example, "\fI[x]\fP"
becomes "\fI[xX]\fP" and "\fI[\(hax]\fP" becomes "\fI[\(haxX]\fP".
@@ -236,23 +236,23 @@ as an implementation can refuse to accept such REs and remain
POSIX-compliant.
.PP
Obsolete ("basic") regular expressions differ in several respects.
-\(aq|\(aq, \(aq+\(aq, and \(aq?\(aq are
+\[aq]|\[aq], \[aq]+\[aq], and \[aq]?\[aq] are
ordinary characters and there is no equivalent
for their functionality.
The delimiters for bounds are "\fI\e{\fP" and "\fI\e}\fP",
-with \(aq{\(aq and \(aq}\(aq by themselves ordinary characters.
+with \[aq]{\[aq] and \[aq]}\[aq] by themselves ordinary characters.
The parentheses for nested subexpressions are "\fI\e(\fP" and "\fI\e)\fP",
-with \(aq(\(aq and \(aq)\(aq by themselves ordinary characters.
-\(aq\(ha\(aq is an ordinary character except at the beginning of the
+with \[aq](\[aq] and \[aq])\[aq] by themselves ordinary characters.
+\[aq]\(ha\[aq] is an ordinary character except at the beginning of the
RE or\*(dg the beginning of a parenthesized subexpression,
-\(aq$\(aq is an ordinary character except at the end of the
+\[aq]$\[aq] is an ordinary character except at the end of the
RE or\*(dg the end of a parenthesized subexpression,
-and \(aq*\(aq is an ordinary character if it appears at the beginning of the
+and \[aq]*\[aq] is an ordinary character if it appears at the beginning of the
RE or the beginning of a parenthesized subexpression
-(after a possible leading \(aq\(ha\(aq).
+(after a possible leading \[aq]\(ha\[aq]).
.PP
Finally, there is one new type of atom, a \fIback reference\fR:
-\(aq\e\(aq followed by a nonzero decimal digit \fId\fR
+\[aq]\e\[aq] followed by a nonzero decimal digit \fId\fR
matches the same sequence of characters
matched by the \fId\fRth parenthesized subexpression
(numbering subexpressions by the positions of their opening parentheses,
@@ -261,8 +261,8 @@ so that, for example, "\fI\e([bc]\e)\e1\fP" matches "bb" or "cc" but not "bc".
.SH BUGS
Having two kinds of REs is a botch.
.PP
-The current POSIX.2 spec says that \(aq)\(aq is an ordinary character in
-the absence of an unmatched \(aq(\(aq;
+The current POSIX.2 spec says that \[aq])\[aq] is an ordinary character in
+the absence of an unmatched \[aq](\[aq];
this was an unintentional result of a wording error,
and change is likely.
Avoid relying on it.
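
Every hunk above makes the same mechanical change: the roff apostrophe escape is respelled from \(aq to \[aq], two groff spellings of the same special character. Other escapes such as \(ha and \e are left alone. How the commit was actually produced is not shown here. The following is only a minimal sketch of how such a rewrite could be scripted; the function name modernize_aq is hypothetical.

```python
def modernize_aq(line: str) -> str:
    """Rewrite each legacy \\(aq escape in one roff source line to \\[aq].

    Illustrative helper only (not taken from the commit): a plain
    substring replacement, so other escapes such as \\(ha or \\e
    are left untouched.
    """
    return line.replace(r"\(aq", r"\[aq]")


if __name__ == "__main__":
    # A removed line from the first hunk, before and after the rewrite.
    before = r"separated by \(aq|\(aq."
    print(before)
    print(modernize_aq(before))
```

Applied to each removed ("-") line of the diff, this sketch should yield the corresponding added ("+") line, since no other text changes in any hunk.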