Get a string after multiple spaces in bash

Question

I need to extract some information from a header file, and I need to get a site name from a string like this:

0008 0080 LO Institution Name                 Site Name Here

The problem is that the site name contains spaces too. The only thing I came up with that works is saving the line as a string and then getting the site name as the substring after a certain number of characters, like this:

echo ${line:50}

but I'd like something more elegant.

I just noticed that it also removed multiple spaces between Institution Name and Site Name.

The spaces are lost because you forgot to quote the value. You want echo "${string:50}". See stackoverflow.com/questions/10067266/…
– tripleee, Commented Dec 12, 2017 at 11:20
With just a single example and no explanation of which part of the string you want, this is unclear. Can you specify which part of the string you want and in what circumstances this is failing? Also, "elegant" isn't really well-defined -- I find it hard to imagine that you would find anything simpler than what you already have.
– tripleee, Commented Dec 12, 2017 at 11:21
@tripleee: Thanks for the edit. My first time here, not familiar yet with formatting, etc. I guess by elegant I meant doing it in one line without saving it into a variable first, e.g. pipe it to sed.
– Renat, Commented Dec 12, 2017 at 11:40
And you are looking for extracting the part after the long run of spaces? Can you verify that breaking on any occurrence of two spaces is what you really want?
– tripleee, Commented Dec 12, 2017 at 11:47
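
To illustrate the quoting point from the first comment, here is a minimal sketch. The variable names are made up for the demonstration, and the sample line is retyped here, so its exact number of spaces is an assumption. The same rule applies to a substring expansion such as "${line:50}".

```bash
# Sample line with a long run of spaces before the site name.
line='0008 0080 LO Institution Name                 Site Name Here'

# Unquoted: the shell splits the value on whitespace and echo joins the
# resulting words with single spaces, so the run of spaces collapses.
unquoted=$(echo $line)

# Quoted: the value is passed to echo as one word, spaces intact.
quoted=$(echo "$line")

printf '[%s]\n' "$unquoted" "$quoted"
```

The first bracketed line printed has single spaces throughout; the second matches the original value.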

tripleee · Accepted Answer · 2017-12-12 11:49:23Z

5

If the question title is representative of your actual problem, and you want to extract the text after multiple adjacent spaces,

echo "${string##*  }"

with two spaces after the asterisk, removes the longest prefix that ends in two spaces from the variable's value and prints what remains.
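
As a small sketch of how this expansion behaves on the line from the question (the variable names site and rest are only for illustration, and the exact number of spaces in the sample is assumed):

```bash
# Sample line; the run of spaces before the site name is the separator.
string='0008 0080 LO Institution Name                 Site Name Here'

# '##' removes the LONGEST prefix matching the glob '*  '
# (anything, followed by two spaces), leaving only the site name.
site=${string##*  }
printf '[%s]\n' "$site"

# '#' removes the SHORTEST such prefix, so the remainder of the
# space run is still attached to the front of the result.
rest=${string#*  }
printf '[%s]\n' "$rest"
```

One caveat under these assumptions: if the value ends in two or more spaces, or the site name itself contains a double space, the ## form will cut at that later position instead.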

If you need to do this in a pipe, it's easy with sed:

something which produces the output string |
sed 's/.*  //'
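
A runnable sketch of the pipe version follows, with printf standing in for whatever command produces the line. The second command is an assumption on my part, not something from the question: it first strips trailing spaces in case the header lines are padded at the end.

```bash
# printf stands in for the command that produces the header line.
site=$(printf '%s\n' '0008 0080 LO Institution Name                 Site Name Here' |
  sed 's/.*  //')
printf '[%s]\n' "$site"

# Hypothetical padded input: remove trailing spaces first, then
# delete everything up to the last pair of adjacent spaces.
padded=$(printf '%s\n' '0008 0080 LO Institution Name       Site Name Here   ' |
  sed 's/ *$//; s/.*  //')
printf '[%s]\n' "$padded"
```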

edited Dec 12, 2017 at 11:49

answered Dec 12, 2017 at 11:25

tripleee


6 Comments

Renat Over a year ago

That will work. The only thing that stays the same between different inputs is the multiple spaces. The rest, like the length, the start of the site name, and the number of words before it, can change. I didn't know I could use multiple characters in this construction. Thanks!

Renat Over a year ago

Hm, I tried sed almost like that, only used ^ for start of the string. Didn't work obviously...

tripleee Over a year ago

^ matches the beginning of a line, but the regex .* matches everything from the start of the line anyway. sed 's/^.*  //' should work just as well.

Renat Over a year ago

I forgot the dot. I thought: from the start ^, everything * up to the double spaces '  ' will be replaced with nothing //.

tripleee Over a year ago

No, in regex * means "the previous expression zero or more times"; and . means "any single character (except newline)".
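
To make the difference concrete, here is a sketch comparing the two patterns. Renat's exact command is not shown in the thread, so the first sed call is a guess at it. It relies on the POSIX rule that, in a basic regular expression, * directly after a leading ^ is an ordinary asterisk.

```bash
line='0008 0080 LO Institution Name                 Site Name Here'

# '^*  ': right after '^' the '*' is a literal asterisk, so this looks
# for "*  " at the start of the line and finds nothing to replace.
without_dot=$(printf '%s\n' "$line" | sed 's/^*  //')

# '^.*  ': '.' is any single character and '*' repeats it, so everything
# up to the last pair of adjacent spaces is removed.
with_dot=$(printf '%s\n' "$line" | sed 's/^.*  //')

printf '[%s]\n' "$without_dot" "$with_dot"
```

The first result is the unchanged line; the second is just the site name.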


ntj · Answer · 2017-12-12 11:38:44Z

0

I think awk would be an optimal choice. It can extract columns easily.

echo '0008 0080 LO Institution Name                 Site Name Here' | awk '{ print $6" "$7" "$8 }'

You are able to print whatever columns you want. (And do many other things.)

answered Dec 12, 2017 at 11:38

ntj


3 Comments

Renat Over a year ago

Unfortunately, the column numbers may change.

Renat Over a year ago

I need the last words, but it can be one, two or more. They are separated by multiple spaces from the rest of the string.

ntj Over a year ago

Ohh, so the sed solution seems to be the best.
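
For completeness, awk can also handle a varying number of words if the field separator is set to a run of two or more spaces. This is a sketch that is not part of the original answer; it assumes the site name never contains adjacent spaces and that the line has no trailing padding.

```bash
line='0008 0080 LO Institution Name                 Site Name Here'

# -F '  +' is a regular expression: two or more consecutive spaces.
# Single spaces inside a name no longer split fields, and $NF is the
# last field, however many words it holds.
site=$(printf '%s\n' "$line" | awk -F '  +' '{ print $NF }')
printf '[%s]\n' "$site"
```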

Kalpa Gunarathna · Answer · 2017-12-12 11:13:09Z

-1

If the given string format does not change, the following will do the trick.

A="0008 0080 LO Institution Name Site Name Here"
echo "$A" | cut -d " " -f 6

answered Dec 12, 2017 at 11:13

Kalpa Gunarathna


1 Comment

Renat Over a year ago

Please see the added comment. There are multiple spaces between "Institution Name" and "Site Name". Even if I use the option to treat multiple delimiters as one, this would still only get me the first word of the site name. It may have two or more words separated by spaces.
