1

I am new to python or coding , so please be patient with my question,

So here's my busy XML

    <?xml version="1.0" encoding="utf-8"?>
<Total>
    <ID>999</ID>
    <Response>
        <Detail>
        <Nix>
            <Check>pass</Check>
        </Nix>  
        <MaxSegment>
            <Status>V</Status>
            <Input>
                <Name>
                    <First>jack</First>
                    <Last>smiths</Last>
                </Name>
                <Address>
                <StreetAddress1>100 rodeo dr</StreetAddress1>
                <City>long beach</City>
                <State>ca</State>
                <ZipCode>90802</ZipCode>
                </Address>
                <DriverLicense>
                    <Number>123456789</Number>
                    <State>ca</State>
                </DriverLicense>
                <Contact>
                <Email>[email protected]</Email>
                <Phones>
                    <Home>0000000000</Home>
                    <Work>1111111111</Work>
                </Phones>
                </Contact>
            </Input>
            <Type>Regular</Type>
        </MaxSegment>
        </Detail>
    </Response>
</Total>

what I am trying to do is extract these value into nice and clean table below :
enter image description here

Here's my code so far.. but I couldn't figure it out how to get the subchild :

   import os
os.chdir('d:/py/xml/')

import xml.etree.ElementTree as ET
tree = ET.parse('xxml.xml')
root=tree.getroot()
x = root.tag
y = root.attrib
print(x,y)

#---PRINT ALL NODES---
for child in root:
    print(child.tag, child.attrib)

Thank you in advance !

2 Answers 2

2

You could create a dictionary that maps the column names to xpath expressions that extract corresponding values e.g.:

xpath = {
  "ID": "/Total/ID/text()",
  "Check": "/Total/Response/Detail/Nix/Check/text()", # or "//Check/text()"
}

To populate the table row:

row = {name: tree.xpath(path) for name, path in xpath.items()}

The above assumes that you use lxml that support the full xpath syntax. ElementTree supports only a subset of XPath expressions but it might be enough in your case (you could remove "text()" expression and use el.text in this case) e.g.:

xpath = {
  "ID": ".//ID",
  "Check": ".//Check",
}
row = {name: tree.findtext(path) for name, path in xpath.items()}

To print all text with corresponding tag names:

import xml.etree.cElementTree as etree

for _, el in etree.iterparse("xxm.xml"):
    if el.text and not el: # leaf element with text
       print el.tag, el.text

If column names differ from tag names (as in your case) then the last example is not enough to build the table.

Sign up to request clarification or add additional context in comments.

Comments

2

This is how you could traverse the tree and print only the text nodes:

def traverse(node):
    show = True
    for c in node.getchildren():
        show = False
        traverse(c)
    if show:
        print node.tag, node.text

for you example I get the following:

traverse(root)

ID 999
Check pass
Status V
First jack
Last smiths
StreetAddress1 100 rodeo dr
City long beach
State ca
ZipCode 90802
Number 123456789
State ca
Email [email protected]
Home 0000000000
Work 1111111111
Type Regular

Instead of printing out you could store (node.tag, node.text) tuples or store {node.tag: node.text} in a dict.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.