How to extract part of string in Bash using regex

Question

I have been trying to extract part of a string in Bash. I'm on a Mac.

Pattern of input string:

Some random word followed by a /. This part is optional.
A keyword (def, foo, or bar) followed by a hyphen (-) and a number of 2 to 6 digits.
The number is followed by another hyphen and a few hyphen-separated words.

Sample inputs and outputs:

abc/def-1234-random-words // def-1234
bla/foo-12-random-words // foo-12
bar-12345-random-words // bar-12345

So I tried the following commands to extract it, but for some reason they return the entire string.

extractedValue=`getInputString | sed -e 's/.*\(\(def\|bar\|foo\)-[^-]*\).*/\1/g'`
// and
extractedValue=`getInputString | sed -e 's/.*\(\(def\|bar\|foo\)-\d{2,6}\).*/\1/g'`

I also tried to make it case-insensitive using the I flag, but that threw an error:

: bad flag in substitute command: 'I'

Following are the references I tried:

@Barmar I noticed some weird behaviour around \d, so I moved to [^-]*. That matched, but it always returned the entire string. I'll read more about it.
– Rajesh Dixit, Commented Oct 6, 2021 at 15:22

Barmar · Accepted Answer · 2021-10-06 15:21:27Z

4

You can use the -E option to enable extended regular expressions; then you don't have to escape ( and |.

echo abc/def-1234-random-words  | sed -E -e 's/.*((def|bar|foo)-[^-]*).*/\1/g'
def-1234
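
To check the command against all three sample inputs, it can be wrapped in a small function and captured with $(...) instead of backticks. This is only a sketch: the function name extract_key is made up for illustration, and [^-]* is swapped for [0-9]{2,6} to enforce the digit count from the question. The -E option is accepted by both GNU sed and the BSD sed shipped with macOS.

```shell
#!/usr/bin/env bash
# extract_key is a hypothetical helper name, not part of the answer above.
extract_key() {
    # -E enables extended regex; \1 is the outer capture group (keyword-digits)
    printf '%s\n' "$1" | sed -E 's/.*((def|bar|foo)-[0-9]{2,6}).*/\1/'
}

extract_key 'abc/def-1234-random-words'   # def-1234
extract_key 'bla/foo-12-random-words'     # foo-12

# Capture the result in a variable, as in the question
extractedValue=$(extract_key 'bar-12345-random-words')
echo "$extractedValue"                    # bar-12345
```

If the input contains none of the keywords, the substitution does not apply and sed prints the line unchanged, so the caller may want to validate the result.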

answered Oct 6, 2021 at 15:21


1 Comment

Rajesh Dixit Over a year ago

This, along with gsed for the case-insensitivity flag /I, solved my issue. Thanks a TON!

anubhava · Accepted Answer · 2021-10-06 15:26:53Z

2

This GNU sed command should work with the ignore-case flag:

sed -E 's~^(.*/){0,1}((def|foo|bar)-[0-9]{2,6})-.*~\2~I' file

def-1234
foo-12
bar-12345

This sed matches:

^(.*/){0,1}: Optionally match a string up to a / at the start of the line
(: Start capture group #2
- (def|foo|bar): Match def, foo, or bar
- -: Match a hyphen
- [0-9]{2,6}: Match 2 to 6 digits
): End capture group #2
-.*: Match a hyphen followed by anything until the end of the line
I: Ignore case (GNU sed only)
The replacement is the value captured in group #2.
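
Where GNU sed is not installed, one possible workaround is to drop the I flag and spell out both cases of each letter with bracket expressions. The sketch below assumes that approach; extract_key_ci is a hypothetical name, and {0,1} is written as the equivalent ?.

```shell
#!/usr/bin/env bash
# Portable stand-in for the GNU-only I flag: list both cases of every letter.
# extract_key_ci is a hypothetical helper name.
extract_key_ci() {
    printf '%s\n' "$1" |
        sed -E 's~^(.*/)?(([Dd][Ee][Ff]|[Ff][Oo][Oo]|[Bb][Aa][Rr])-[0-9]{2,6})-.*~\2~'
}

extract_key_ci 'abc/DEF-1234-random-words'   # DEF-1234
extract_key_ci 'bla/Foo-12-random-words'     # Foo-12
extract_key_ci 'bar-12345-random-words'      # bar-12345
```

This is verbose for long keyword lists, but it relies only on extended-regex features that both GNU and BSD sed accept with -E.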

Or you may use this awk:

awk -v IGNORECASE=1 -F / 'match($NF, /^(def|foo|bar)-[0-9]{2,6}-/) {print substr($NF, 1, RLENGTH-1)}' file

def-1234
foo-12
bar-12345

Awk explanation:

-v IGNORECASE=1: Enable case-insensitive matching (IGNORECASE is a gawk extension)
-F /: Use / as the field separator
match($NF, /^(def|foo|bar)-[0-9]{2,6}-/): Match the regex ^(def|foo|bar)-[0-9]{2,6}- against $NF, the last field when splitting on / (this ignores any text before the /)
If the match succeeds, print the text from position 1 through RLENGTH-1 using substr (the regex also matches the - after the digits, so that last character is dropped)
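
Since IGNORECASE only has an effect in gawk, a variant that should also run on other awk implementations can lowercase a copy of the field with tolower() and print from the original text. This is a sketch under that assumption. The function name extract_key_awk is hypothetical, and the digit count is checked arithmetically instead of with {2,6} because some older awk builds lack interval expressions.

```shell
#!/usr/bin/env bash
# extract_key_awk is a hypothetical helper name.
extract_key_awk() {
    printf '%s\n' "$1" | awk -F/ '
    {
        s = $NF                                   # text after the last slash
        # Lowercase only for matching; print from the original text.
        if (match(tolower(s), /^(def|foo|bar)-[0-9]+-/)) {
            digits = RLENGTH - 5                  # minus 3 keyword letters and 2 hyphens
            if (digits >= 2 && digits <= 6)
                print substr(s, 1, RLENGTH - 1)   # drop the trailing hyphen
        }
    }'
}

extract_key_awk 'abc/DEF-1234-random-words'   # DEF-1234
extract_key_awk 'bla/foo-12-random-words'     # foo-12
extract_key_awk 'bar-1-random-words'          # prints nothing: only 1 digit
```

The RLENGTH - 5 arithmetic works only because every keyword here is three letters long; keywords of differing lengths would need a different check.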

edited Oct 6, 2021 at 15:26

answered Oct 6, 2021 at 15:18


4 Comments

Rajesh Dixit Over a year ago

Could you please also add an explanation? What does $NF mean, and is this case-sensitive?

anubhava Over a year ago

I am going to add one. Meanwhile, check the sed command, which does case-insensitive matching.

Rajesh Dixit Over a year ago

The weird thing is, the sed approach still throws this error: "bad flag in substitute command: 'I'". Is it environment-specific? I'm using zsh in the Mac terminal.

anubhava Over a year ago

Yes, as I mentioned, that requires GNU sed. The sed on Mac is BSD sed, which doesn't support /I. I am also on a Mac but have GNU sed installed using Homebrew.
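
If installing GNU sed is not an option, Bash itself can do the case-insensitive match with [[ =~ ]] and the nocasematch shell option, with no external tools. The following is a sketch, not something proposed in the thread: extract_key_bash is a hypothetical name, and the function body runs in a subshell so the shopt change does not leak into the caller.

```shell
#!/usr/bin/env bash
# extract_key_bash is a hypothetical helper name.
# The ( ... ) body runs in a subshell, so nocasematch stays local to the call.
extract_key_bash() (
    shopt -s nocasematch                       # make [[ =~ ]] ignore case
    # Keep the pattern in a variable so Bash treats it as a regex.
    re='(^|/)((def|foo|bar)-[0-9]{2,6})-'
    [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[2]}"
)

extract_key_bash 'abc/DEF-1234-random-words'   # DEF-1234
extract_key_bash 'bla/Foo-12-random-words'     # Foo-12
extract_key_bash 'bar-12345-random-words'      # bar-12345
```

The function returns a non-zero status when nothing matches, so it can be used directly in an if condition.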

N1ngu · Accepted Answer · 2022-10-24 16:00:49Z

0

Use grep with the --only-matching option (shorthand -o).

grep --only-matching --extended-regexp '(foo|bar|def)-[0-9]{2,6}' <<EOF
abc/def-1234-random-words
bla/foo-12-random-words
bar-12345-random-words
EOF
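
To cover the case-insensitivity requirement from the question, -i can be added alongside -o and -E; all three are short options accepted by both GNU and BSD grep. The sketch below captures the result in a variable. The name extract_key_grep is hypothetical, and head -n 1 is an added assumption that only the first match per input is wanted.

```shell
#!/usr/bin/env bash
# extract_key_grep is a hypothetical helper name.
extract_key_grep() {
    # -o prints only the matched text, -i ignores case, -E enables extended regex
    printf '%s\n' "$1" | grep -oiE '(foo|bar|def)-[0-9]{2,6}' | head -n 1
}

extractedValue=$(extract_key_grep 'abc/DEF-1234-random-words')
echo "$extractedValue"   # DEF-1234
```

Because the pattern is not anchored, it also matches a keyword that appears later in the string, which may or may not be desirable for other inputs.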

answered Oct 24, 2022 at 16:00
