0

I want to match one or more occurrences of a string, for example blah, using expr.

This works.

$ expr "blahblahblah" : 'blahblah'
8

What is wrong with this regular expression?

$ expr "blahblahblah" : '\(blah\)\+'
blah

I want the number of characters matched.

  • Why is "using expr" a part of the question, rather than "regex-matching a string and knowing how many characters matched"? Commented Jul 23, 2012 at 16:42
  • expr is an external command, not a part of bash -- if you really want an expr command to do this, it isn't a bash question at all. Commented Jul 23, 2012 at 16:49

3 Answers

3

Since your question is tagged bash, there are better facilities than expr in modern versions of the shell, which do exactly what you want:

$ re='(blah)+'
$ [[ foo_blahblah_bar =~ $re ]] && echo "${#BASH_REMATCH[0]}"
8
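
A minimal sketch of the same idea wrapped in a function, assuming bash 3.0 or later for [[ =~ ]]. The name match_len is hypothetical, not a bash builtin. Note that [[ =~ ]] searches anywhere in the string, whereas expr only matches at the start, so a leading ^ is needed to mimic expr:

```bash
#!/usr/bin/env bash
# match_len STRING ERE: print the length of the first match of ERE in STRING,
# or 0 when there is no match. (Hypothetical helper, for illustration only.)
match_len() {
  local str=$1 re=$2
  if [[ $str =~ $re ]]; then
    # BASH_REMATCH[0] holds the text matched by the whole expression.
    printf '%s\n' "${#BASH_REMATCH[0]}"
  else
    printf '0\n'
  fi
}

match_len "foo_blahblah_bar" '(blah)+'   # unanchored, prints 8
match_len "blahblahblah" '^(blah)+'      # anchored at the start like expr, prints 12
match_len "xyz" '^(blah)+'               # no match, prints 0
```

Keeping the pattern in a variable (here $re) rather than quoting it inline avoids it being treated as a literal string.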

1 Comment

Yes, I am using it in a bash script, so bash-only syntax definitely makes sense.
1

First of all, you need \(\) instead of () and \+ instead of +. But that is not all.

You can't use groups \(\) and get the length of the matched string at the same time.

Pattern matches return the string matched between ( and ) or null; if ( and ) are not used, they return the number of characters matched or 0.

You can wrap the whole pattern in an outer group and use wc -c to get the length of the captured string:

$ expr "blahblahblahblah" : '\(\(blah\)\+\)' | wc -c
17

Or using parameter expansion:

$ m=$(expr "blahblahblahblah" : '\(\(blah\)\+\)')
$ echo "${#m}"
16

(wc -c also counts the trailing newline, hence the difference.)
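
As a rough sketch, the capture-then-measure approach can be wrapped in a small function that also handles the no-match case. The name expr_match_len is hypothetical. The example pattern uses \{1,\}, the POSIX BRE spelling of "one or more", because \+ is a GNU extension. This sketch does not guard against strings that expr could mistake for one of its own operators or keywords:

```bash
#!/usr/bin/env bash
# expr_match_len STRING BRE: print how many leading characters of STRING
# match BRE, or 0 if there is no match. (Hypothetical helper.)
expr_match_len() {
  local m
  # The outer \( \) makes expr print the whole matched prefix instead of a count.
  # "|| true" ignores expr's exit status 1 when nothing matches.
  m=$(expr "$1" : "\($2\)") || true
  printf '%s\n' "${#m}"
}

expr_match_len "blahblahblahblah" '\(blah\)\{1,\}'   # prints 16
expr_match_len "xyz" '\(blah\)\{1,\}'                # prints 0
```

Command substitution strips the trailing newline, so no correction is needed as with wc -c.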

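But if you can write the regular expression without groups, you get the length directly (note that here \+ applies only to the final h, so this matches bla followed by one or more h, not repeated blah):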

$ expr "blahhhhhbl" : "blah\+"
8

5 Comments

No reason to use wc -- you can get the length with parameter expansion, which avoids the overhead of forking off a subprocess, setting up a pipeline, capturing its output, etc.
...well, I said that before running type expr; turns out it's an external command, so there's fork+wait+read overhead in running it no matter what.
@CharlesDuffy: anyway ${#a} looks better than wc -c. And your answer ([[ =~ ]]) is even better
Thanks, guys, I am impressed! I rarely resort to bash, but I will be coming back here more often in the future.
bash is a really mighty thing. I think you will be excited.
0

Using sed:

echo "blahblahblah" | sed -n 's!\(\(blah\)*\).*!\1!p' | wc -c
