I have a long string like the following:

string='<span id="/yourid/12345" class="noname">lala1</span><span id="/yourid/34567" class="noname">lala2</span><span id="/yourid/39201" class="noname">lala3</span>'

The objective is to loop through each 'yourid' and echo the IDs 12345, 34567, and 39201 for further processing. How can this be achieved in the bash shell?

bash might be a bad choice. If you can, go with a language that has XML support, such as Perl, Python, or Tcl. Commented Jul 2, 2013 at 2:39

3 Answers

GNU grep:

grep -oP '(?<=/yourid/)\d+' <<< "$string"
12345
34567
39201
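
Note that `grep -P` needs a GNU grep built with PCRE support. Where that is not available, the same extraction can be sketched in bash alone with `[[ =~ ]]` and `BASH_REMATCH`. This is a minimal sketch that assumes the IDs are purely numeric; the variable names `rest`, `re`, and `ids` are illustrative and not part of the original answer.

```shell
#!/usr/bin/env bash
# Sample input copied from the question.
string='<span id="/yourid/12345" class="noname">lala1</span><span id="/yourid/34567" class="noname">lala2</span><span id="/yourid/39201" class="noname">lala3</span>'

ids=()
rest=$string
re='/yourid/([0-9]+)'   # assumption: IDs consist of digits only

# Match the first "/yourid/<digits>", record the digits, then strip
# everything up to and including that match and try again.
while [[ $rest =~ $re ]]; do
  ids+=("${BASH_REMATCH[1]}")
  rest=${rest#*"${BASH_REMATCH[0]}"}
done

# Loop over the collected IDs for further processing.
for id in "${ids[@]}"; do
  printf 'processing id: %s\n' "$id"
done
```

Like the grep approach, this treats the markup as plain text, so it only suits input whose shape is known in advance.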

Use a real XML parser. For instance, if you have XMLStarlet installed...

while read -r id; do
  [[ $id ]] || continue
  printf '%s\n' "${id#/yourid/}"
done < <(xmlstarlet sel -t -m '//span[@id]' -v ./@id -n <<<"<root>${string}</root>")

1 Comment

+1 xmlstarlet may not be as likely to be installed, but it's convenient that you can use ad hoc xpath expressions on the command-line. AFAIK, xsltproc requires you to use a stylesheet file.

With Perl:

declare -a ids
ids=( $(perl -lne 'while(m!yourid/(\w+)!g){print $1}' <<< "$string") )
echo "${ids[@]}"
