I'm studying bash programming, in particular regular expressions, and I found this code:

numpat='^[+-]([0-9]+)$'
strpat='^([a-z]*)\1$'

read stringa

if [[ $stringa =~ $numpat ]]
then
    echo "numero"
    echo numero > output
    exit ${BASH_REMATCH[1]}
elif [[ $stringa =~ $strpat ]]
then
    echo "echo"
    echo echo > output
    exit 11
fi

and I don't understand what \1 means in this line:

strpat='^([a-z]*)\1$'
    Have you read any regular expression tutorials? They should explain what each of the elements of the expression means. Commented Dec 30, 2015 at 9:53

2 Answers

\1 is a backreference. It matches whatever was matched by the first capture group ([a-z]*).

So the pattern ^([a-z]*)\1$ matches a string that is built from a substring repeated twice, such as foofoo. The capture group matches the first foo, and the backreference matches the second foo. But if the string is foobar, the match fails: no initial substring is followed by an exact copy of itself that reaches the end of the string.

You can allow any number of repetitions by using the + quantifier after \1. This matches it one or more times.
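As a rough sketch of the above: the snippet below assumes the system's regex library accepts backreferences in extended regular expressions (glibc does, but it is an extension and not guaranteed everywhere). The helper names is_doubled and is_repeated are made up for illustration. The patterns are kept in variables so that bash hands the backslash to the regex engine unchanged, and the second pattern uses [a-z]+ inside the group (a small deviation from the original) so that the captured text is never empty.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when $1 is a lowercase string repeated exactly twice.
# Assumes the regex library supports backreferences in ERE (a non-POSIX extension).
is_doubled() {
    local strpat='^([a-z]*)\1$'    # \1 must repeat whatever group 1 captured
    [[ $1 =~ $strpat ]]
}

# Variant with the + quantifier after \1: the captured text, then one or more copies.
is_repeated() {
    local reppat='^([a-z]+)\1+$'
    [[ $1 =~ $reppat ]]
}

for s in foofoo foobar foofoofoo; do
    if is_doubled "$s"; then
        echo "$s: doubled"
    elif is_repeated "$s"; then
        echo "$s: repeated"
    else
        echo "$s: no match"
    fi
done
```

If the regex library treats \1 differently, these helpers give different answers, so the results are only meaningful on a system where the assumption holds.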


18 Comments

I also think so. However, having * as a pattern makes it greedy so it will probably not match anything else. If I test it, it doesn't work: [[ "foofoo" =~ ^([a-z]*)\1$ ]] && echo "yes" || echo "no" returns no.
While I can reproduce this behaviour, by definition POSIX extended regexes do not support back references.
The difference seems to be whether the pattern is in a variable or a literal.
@ginogino You can also do ^([a-z]*)\1{4}$
@Barmar: "The difference seems to be whether the pattern is in a variable or a literal" - not on my machine. It just matches a literal 1.

On Cygwin, which uses newlib, \1 matches only a literal 1.

if [[ a1 =~ $strpat ]]; then echo YES; fi   # YES

10 Comments

Not in my bash. In my bash it works like a back reference, meaning it matches foofoo but not foobar. That's weird! I have GNU bash, version 4.3.42(1)-release.
seriously? bash 4.1.10(4)-release here on cygwin.
@hek2mgl Bash uses ERE as in man 3 regex, and back reference is not in the ERE standard, although it may be supported in some implementations.
Unfortunately there are many regex variants... no surprise there. The grep back references work just fine on my machine.
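Since the behaviour of \1 in =~ depends on the underlying regex library, one way to get the same "string repeated twice" test without a backreference is plain parameter expansion. This is only a sketch: the function name is_doubled_portable is made up here, and it covers only the doubling check, not the numeric branch of the original script. The probe at the end reports how \1 is treated on the machine it runs on.

```bash
#!/usr/bin/env bash
# Hypothetical, library-independent stand-in for ^([a-z]*)\1$ :
# true when $1 contains only lowercase letters and its two halves are equal.
is_doubled_portable() {
    local s=$1
    local len=${#s}
    local lower='^[a-z]*$'            # plain ERE, no backreference needed
    [[ $s =~ $lower ]] || return 1    # reject anything but lowercase letters
    (( len % 2 == 0 )) || return 1    # odd length cannot be X followed by X
    local half=${s:0:len/2}
    [[ $s == "$half$half" ]]
}

# Probe: how does the local regex library treat \1 in an ERE?
probe='^([a-z]*)\1$'
if [[ foofoo =~ $probe ]]; then
    printf '%s\n' '\1 acts as a backreference here'
elif [[ a1 =~ $probe ]]; then
    printf '%s\n' '\1 matches a literal 1 here'
else
    printf '%s\n' '\1 behaves in some other way here'
fi
```

Like the original pattern, is_doubled_portable accepts the empty string, because an empty capture followed by an empty repetition also matches.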
