0

I have to convert XML to CSV using xmllint --shell or xmllint --xpath, because I'm not allowed to install additional packages.

I need only the first name and phone in the CSV. I tried the loops below to walk the XML and write a CSV file, but they break when a first name contains a space (for example Mary Jane) or when the phone is missing.

for f in $(echo 'cat //FIRSTNAME/text()' | xmllint --shell TEST.xml | sed '1d;$d' | sed 's/-------//') 
do
   echo $f  >> $CSV_FILE_NAMES
done

for i in $(echo 'cat //HOMEPHONE/text()' | xmllint --shell TEST.xml | sed '1d;$d' | sed 's/-------//') 
do
   echo $i  >> $CSV_FILE_PHONES
done


paste -d "," $CSV_FILE_NAMES $CSV_FILE_PHONES >> $CSV

Or this combined solution, which puts every value on a new line:

for f in $(echo 'cat //FIRSTNAME/text()|//HOMEPHONE/text()' | xmllint --shell TEST.xml | sed '1d;$d' | sed 's/-------//')
do
   echo $f  >> $CSV_FILE 
done

The resulting file looks like this:

Mark

9999999999

Jack

8888888888

Is there a different way to loop through an XML file?

XML example

3
  • See mywiki.wooledge.org/DontReadLinesWithFor and shellcheck.net for validating your script. Commented Mar 3, 2021 at 13:25
  • Also check whether the XML file has CRLF line endings. Commented Mar 3, 2021 at 13:30
  • 'because I'm not allowed to install additional packages.' I wonder what you do have available, e.g. you may have a Perl installation with XML libs, or a Python installation with XML libs, or ...? Commented Mar 3, 2021 at 14:55
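
The point of the first two comments can be sketched as follows. A `for` loop over `$(...)` splits on every space, so "Mary Jane" becomes two items, while `while IFS= read -r` keeps each line whole. The function name `values_only` is made up for illustration, and the canned text stands in for real `xmllint --shell` output; this is a sketch under those assumptions, not a complete fix (it does not help when a phone is missing).

```shell
# Keep only the value lines from `cat //FIRSTNAME/text()` shell output.
# values_only is a hypothetical helper name, not part of xmllint.
values_only() {
  # Drop carriage returns first in case the XML file uses CRLF endings.
  tr -d '\r' | while IFS= read -r line; do
    case $line in
      '/ >'* | ' -------') continue ;;  # prompt lines and node separators
    esac
    printf '%s\n' "$line"               # the whole line, spaces included
  done
}

# Demo with canned text standing in for real xmllint output.
printf '%s\n' '/ >  -------' 'Mary Jane' ' -------' 'Mark' '/ > ' | values_only
```

With a real file this would be used as `echo 'cat //FIRSTNAME/text()' | xmllint --shell TEST.xml | values_only`.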

3 Answers

4

For the XML sample you provided, I think it would be simpler to loop over all the ZVM_DATA elements and use the XPath concat function to join FIRSTNAME, HOMEPHONE, or any other fields you'd like to include:

for index in $(seq $(xmllint --xpath "count(//ZVM_DATA)" test.xml))
do  
    xmllint --xpath "concat(//ZVM_DATA[$index]/FIRSTNAME/text(),',',//ZVM_DATA[$index]/HOMEPHONE/text())" --format test.xml
done

It is not the cleanest solution, but unfortunately xmllint supports only XPath 1.0; otherwise it could be done in one command.

Edit: the result should look like this:

Michael ,7800002814
E,7800907671
Ryan,7909355223
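
The trailing space in "Michael " above suggests the values can carry stray whitespace. A possible variation, as a rough sketch: wrap each field in normalize-space(), which is part of XPath 1.0, and index with (//ZVM_DATA)[n] so the position counts across the whole document rather than per parent. The names `row_expr` and `xml_to_csv` are made up for illustration; the element names come from the loop above.

```shell
# Build the XPath 1.0 expression for record number $1.
# row_expr and xml_to_csv are hypothetical helper names.
row_expr() {
  printf "concat(normalize-space((//ZVM_DATA)[%d]/FIRSTNAME),',',normalize-space((//ZVM_DATA)[%d]/HOMEPHONE))" "$1" "$1"
}

# Print one CSV line per ZVM_DATA element of the file given as $1.
# A missing FIRSTNAME or HOMEPHONE yields an empty field.
xml_to_csv() {
  count=$(xmllint --xpath 'count(//ZVM_DATA)' "$1") || return
  i=1
  while [ "$i" -le "$count" ]; do
    xmllint --xpath "$(row_expr "$i")" "$1"
    i=$((i + 1))
  done
}

# Usage (needs xmllint and an input file):
#   xml_to_csv test.xml > out.csv
```

Building the expression in a separate function keeps the quoting in one place, which makes it easier to add more fields later.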

4

Given an XML file file.xml

<PEOPLE>
    <PERSON>
        <FIRSTNAME>Alice</FIRSTNAME>
        <HOMEPHONE>555-1212</HOMEPHONE>
    </PERSON>
    <PERSON>
        <FIRSTNAME>Bob</FIRSTNAME>
        <HOMEPHONE>123-4567</HOMEPHONE>
    </PERSON>
</PEOPLE>

Then

echo 'cat (//FIRSTNAME | //HOMEPHONE)/text()' | xmllint --shell file.xml

outputs

/ >  -------
Alice
 -------
555-1212
 -------
Bob
 -------
123-4567
/ >

which is readily parsable with awk, among other tools:

echo 'cat (//FIRSTNAME | //HOMEPHONE)/text()' | xmllint --shell file.xml | awk '
  NR % 4 == 2 {printf "%s,", $0}
  NR % 4 == 0 {print $0}
'

Output:

Alice,555-1212
Bob,123-4567
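
The NR % 4 arithmetic relies on every PERSON having both elements, so a missing HOMEPHONE shifts every later row. One way around that, sketched here under assumptions: dump whole PERSON elements with `cat //PERSON` and let awk collect the fields per record. This assumes the shell prints a ` -------` line before each node (as in the output above) and that each child element sits on its own line. The name `persons_to_csv` is made up, and the here-document stands in for real xmllint output.

```shell
# Convert the output of: echo 'cat //PERSON' | xmllint --shell file.xml
# into CSV, one line per PERSON. persons_to_csv is a hypothetical name.
persons_to_csv() {
  awk -F'[<>]' '
    # Print the collected record (if any) and reset for the next one.
    function emit() {
      if (seen) print name "," phone
      name = phone = ""; seen = 0
    }
    / -------$/       { emit(); next }   # separator before each node
    $2 == "PERSON"    { seen = 1 }       # a new record starts
    $2 == "FIRSTNAME" { name = $3 }
    $2 == "HOMEPHONE" { phone = $3 }
    END               { emit() }
  '
}

# Demo: the first person has a space in the name and no phone.
persons_to_csv <<'EOF'
/ >  -------
<PERSON>
        <FIRSTNAME>Mary Jane</FIRSTNAME>
    </PERSON>
 -------
<PERSON>
        <FIRSTNAME>Bob</FIRSTNAME>
        <HOMEPHONE>123-4567</HOMEPHONE>
    </PERSON>
/ > 
EOF
```

The demo prints `Mary Jane,` followed by `Bob,123-4567`, so the missing phone becomes an empty field instead of shifting the rows.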

Too bad you can't install other tools: xmlstarlet makes it pretty easy to format the output the way you like:

xmlstarlet sel -t -m //PERSON -v ./FIRSTNAME -o , -v ./HOMEPHONE -n file.xml

Output:

Alice,555-1212
Bob,123-4567

0

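You can use awk in the following way. This assumes each element sits on its own line, as in a pretty-printed file; the values are reset after each record so a missing phone does not inherit the previous one:

awk 'BEGIN{FS="[<>]"} /FIRSTNAME/ { v1=$3 } /HOMEPHONE/ { v2=$3 } /\/ZVM_DATA/ {printf "%s, %s\n", v1, v2; v1=v2=""}' TEST.xml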
