5

I have a string that looks like this:

"abcderwer 123123 10,200 asdfasdf iopjjop"

Now I want to extract numbers that follow the scheme xx,xxx, where each x is a digit from 0 to 9, e.g. 10,200. A match has to have exactly five digits and has to contain the ",".

How can I do that?

Thank you

6 Answers

11

You can use grep:

$ echo "abcderwer 123123 10,200 asdfasdf iopjjop" | grep -oE '[0-9]{2},[0-9]{3}'
10,200
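
One caveat with the pattern above: it also matches inside longer numbers, so 1234,5678 would yield 34,567. A minimal sketch of a stricter variant follows, assuming a grep that supports -w (available in GNU and BSD grep, but not required by POSIX). The function name extract_numbers is made up for illustration.

```shell
# Hypothetical helper: print every standalone xx,xxx token found in "$1".
# -o prints only the matched text, -E enables extended regex syntax.
# -w (GNU/BSD extension) rejects matches that touch a letter, digit,
# or underscore on either side.
extract_numbers() {
  printf '%s\n' "$1" | grep -owE '[0-9]{2},[0-9]{3}'
}

extract_numbers "abcderwer 123123 10,200 asdfasdf iopjjop"
extract_numbers "x 1234,5678 42,000 y"
```

Note that -w treats a comma as a boundary, so 10,200,300 would still produce 10,200. If that matters, the surrounding characters have to be checked explicitly.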

5

In pure Bash:

pattern='([[:digit:]]{2},[[:digit:]]{3})'
[[ $string =~ $pattern ]]
echo "${BASH_REMATCH[1]}"
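
The snippet above assumes $string is already set and reports only the first match. As a rough sketch, one way to collect every match is to capture the remainder of the string in a second group and loop over it. The variable names rest and matches, and the extra 42,000 in the sample string, are illustrative assumptions, and the code requires bash.

```shell
#!/usr/bin/env bash
# Sketch: collect every xx,xxx match in a string, not only the first one.
string='abcderwer 123123 10,200 asdfasdf 42,000 iopjjop'

# Group 1 is the number; group 2 captures everything after it.
pattern='([[:digit:]]{2},[[:digit:]]{3})(.*)'

matches=()
rest=$string
while [[ $rest =~ $pattern ]]; do
  matches+=("${BASH_REMATCH[1]}")   # remember this match
  rest=${BASH_REMATCH[2]}           # keep searching in the remainder
done

printf '%s\n' "${matches[@]}"
```

Like the original pattern, this does not check what surrounds the match, so it would also pick up five digits from the middle of a longer number.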

1

Simple pattern matching (glob patterns) is built into the shell. Assuming you have the strings in $* (that is, they are command-line arguments to your script, or you have used set on a string you have obtained otherwise), try this:

for token; do
  case $token in
    [0-9][0-9],[0-9][0-9][0-9] ) echo "$token" ;;
  esac
done
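
For the case where the text is in a variable rather than in the script's arguments, the "use set on a string" step might look like the sketch below. The function name filter_tokens and the variable string are assumptions for illustration. Because a case pattern must match the whole token, longer numbers such as 123,456 are rejected.

```shell
# Hypothetical helper: print each argument that is exactly xx,xxx.
filter_tokens() {
  for token; do
    case $token in
      [0-9][0-9],[0-9][0-9][0-9]) echo "$token" ;;
    esac
  done
}

string='abcderwer 123123 10,200 asdfasdf iopjjop'

set -f           # turn off globbing so a token like * is not expanded
set -- $string   # deliberately unquoted: split on whitespace
set +f

filter_tokens "$@"
```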

4 Comments

That would be $@. There's a difference.
When you access the command line, there certainly is a difference between $* and $@; but I specifically chose to call the variable which contains the argv array by its original Bourne name because that's the name many beginner-level expositions use; and in this context, when you want the values split into whitespace-separated tokens, that's what a script would use. I agree that if you have to refer to the positional parameters directly in a script, "$@" is almost always what you want.
With, for example, set -- 'abc def' ghi, for arg in "$@"; do echo "$arg"; done does the right thing. These rarely do the right thing: $* or "$*" (rarely = only when you want to flatten the arguments). There's no point in continuing to repeat beginner-level expositions which are wrong.
for token; do does the right thing here, regardless of how you got the things in there; what's to argue about?
1

Check out pattern matching and regular expressions.

Links:

Bash regular expressions

Patterns and pattern matching

SO question

and as mentioned above, one way to utilize pattern matching is with grep. Other uses: echo supports patterns (globbing) and find supports regular expressions.

4 Comments

echo does not and find is not applicable to the question.
@DennisWilliamson I never said they were. And you can use patterns with echo. Example: echo *ocu?ent? would return "Documents" if you're in a regular home folder. I realize you meant regex though, I'll edit that :)
That's not a regex, that's globbing.
@DennisWilliamson which is pattern matching.
0

A slightly non-typical solution:

< input tr -cd '0-9, ' | tr ' ' '\012' | grep '^..,...$'

(The first tr removes everything except commas, spaces, and digits. The second tr replaces spaces with newlines, putting each "number" on a separate line, and the grep discards everything except those that match your criterion.)
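
Because the pipeline above deletes letters before splitting, a token such as a1b2,c345 collapses to 12,345 and is reported as a match. A possible variant, sketched here under the assumption that tokens are separated by spaces, splits first and then keeps only lines that match in full. The function name split_and_filter is made up, and it reads from standard input instead of a file named input.

```shell
# Hypothetical helper: read text on stdin, print tokens that are exactly xx,xxx.
split_and_filter() {
  # tr -s turns each run of spaces into a single newline;
  # grep -x keeps only lines matched in their entirety.
  tr -s ' ' '\n' | grep -x '[0-9][0-9],[0-9][0-9][0-9]'
}

printf '%s\n' 'abcderwer 123123 10,200 asdfasdf iopjjop' | split_and_filter
```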

0

The following example uses sed on your input string. Because the leading .* is greedy, it prints only the last match on each line.

$ echo "abcderwer 123123 10,200 asdfasdf iopjjop" | sed -ne 's/^.*\([0-9]\{2\},[0-9]\{3\}\).*$/\1/p'
10,200
