Python: How to Loop thoough every XML node and print values based on list

Question

For Validation Purposes: How can I search thorough a whole XML node by node(even children) as below:

XML File:

<Summary>
<Hardware_Info>
    <HardwareType>FlashDrive</HardwareType>
    <ManufacturerDetail>
            <ManufacturerCompany>Company1</ManufacturerCompany>
            <ManufacturerDate>2017-07-20T12:26:04-04:00</ManufacturerDate>
            <ModelCode>4BR6282</ModelCode>
    </ManufacturerDetail>
    <ActivationDate>2017-07-20T12:26:04-04:00</ActivationDate>
</Hardware_Info>
<DeviceConnectionInfo>
    <Device>
        <Index>0</Index>
        <Name>Laptop1</Name>
        <Status>Installed</Status>
    </Device>
    <Device>
        <Index>1</Index>
        <Name>Laptop2</Name>
        <Status>Installed</Status>
    </Device>
</DeviceConnectionInfo>
</Summary>

and search for the values based on matching columns of a particular Table. For the sake of an example the table is as is:

TABLE:

HardwareType    ManufacturerCompany    ManufacturerDate             ActivationDate              Device.Index        Name
FlashDrive      Company1               2017-07-20T12:26:04-04:00    2017-07-20T12:26:04-04:00   0                   Laptop1
FlashDrive      Company2               2017-07-20T12:26:04-04:00    2017-07-20T12:26:04-04:00   1                   Laptop2

In this case I would have a column list of:

HardwareType, ManufacturerCompany, ManufacturerDate, ActivationDate, Device.Index, Name

For my final result I would like to print the values of the table column names along with the values of the table names found on the xml. For example similar like the original table (assuming validation is fine):

OUTPUT RESULT :

 HardwareType   ManufacturerCompany    ManufacturerDate             ActivationDate              Device.Index        Name
    FlashDrive      Company1               2017-07-20T12:26:04-04:00    2017-07-20T12:26:04-04:00   0                   Laptop1
    FlashDrive      Company2               2017-07-20T12:26:04-04:00    2017-07-20T12:26:04-04:00   1                   Laptop2

CURRENT IMPLEMENTATION:

For example I am able to get the list of the column names of the table, however but so far the best to my nkowledge to implement this is:

import xml.etree.ElementTree as ET
import csv

tree = ET.parse("/test.xml")
root = tree.getroot()

f = open('/test.csv', 'w')

csvwriter = csv.writer(f)

count = 0

head = ['ManufacturerCompany','ManufacturerDate',...]

csvwriter.writerow(head)

for time in root.findall('Summary'):
     row = []
     job_name = time.find('ManufacturerDetail').find('ManufacturerCompany').text
     row.append(job_name)
     job_name = time.find('ManufacturerDetail').find('ManufacturerDate').text
     row.append(job_name)
     csvwriter.writerow(row)
f.close()

However, this implementation does not have the loop each functionality I want for the output. Any guidance or suggestions for implementation would be great.

Thanks

Parfait · Accepted Answer · 2018-02-13 19:22:07Z

1

Consider XSLT, the special purpose language designed to transform XML files into other XML, HTML (which mostly known for), but also text files (TXT/CSV) with its method="text". Specifically, walk down to the Device node level and bring over the ancestor items.

Python's third-party lxml module can run XSLT 1.0 scripts. However, XSLT is portable and any XSLT processor can run such code including Unix's (Linux/Mac) available xsltproc.

XSLT (save as .xsl file, a special .xml file; 
 is the line break entity)

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:param name="delimiter">,</xsl:param>

  <xsl:template match="/Summary">
    <xsl:text>HardwareType,ManufacturerCompany,ManufacturerDate,ActivationDate,Device.Index,Name&#xa;</xsl:text>    
    <xsl:apply-templates select="DeviceConnectionInfo"/>    
  </xsl:template>

  <xsl:template match="DeviceConnectionInfo">
    <xsl:apply-templates select="Device"/>    
  </xsl:template>

  <xsl:template match="Device">
    <xsl:value-of select="concat(ancestor::Summary/Hardware_Info/HardwareType, $delimiter,
                                 ancestor::Summary/Hardware_Info/ManufacturerDetail/ManufacturerCompany, $delimiter,
                                 ancestor::Summary/Hardware_Info/ManufacturerDetail/ManufacturerDate, $delimiter,
                                 ancestor::Summary/Hardware_Info/ActivationDate, $delimiter,
                                 Index, $delimiter,
                                 Name)"/><xsl:text>&#xa;</xsl:text>
  </xsl:template>

</xsl:stylesheet>

Python (with lxml)

import lxml.etree as et

# LOAD XML AND XSL
doc = et.parse('input.xml')
xsl = et.parse('xslt_script.xsl')

# TRANSFORM INPUT TO STRING
transform = et.XSLT(xsl)    
result = str(transform(doc))

# SAVE TO FILE
with open('output.csv', 'w') as f:
    f.write(result)

Python (one-line command call to xsltproc)

from subprocess import Popen

proc = Popen(['xsltproc -o output.csv xslt_script.xsl input.xml'], 
             shell=True, cwd='/path/to/working/directory')

Output

# HardwareType  ManufacturerCompany ManufacturerDate    ActivationDate  Device.Index    Name
# FlashDrive    Company1    2017-07-20T12:26:04-04:00   2017-07-20T12:26:04-04:00   0   Laptop1
# FlashDrive    Company1    2017-07-20T12:26:04-04:00   2017-07-20T12:26:04-04:00   1   Laptop2

answered Feb 13, 2018 at 19:22

Parfait

108k19 gold badges103 silver badges138 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

Techno04335 Over a year ago

@Pairfait, Thanks for your input, I will give it a try shortly!

Parfait Over a year ago

@Techno04335 ... how did the trial go?

Techno04335 Over a year ago

@Pairfait , I would like to clarify my understanding, based on the Column list I have, am I inputting that within the XLST file? In which this process prints/outputs the element values based on column names?

Parfait Over a year ago

No column list is needed. You list headers in XSLT script, see <xsl:text>...</xsl:text> and then the other template runs XPath searches to find those needed values.

Techno04335 Over a year ago

@Pairfait, I got the suggestion to work! Thanks for your assistance! I would like like your input on automating the XSLT File based on what the values I am looking for. Thanks.

|

Collectives™ on Stack Overflow

Python: How to Loop thoough every XML node and print values based on list

1 Answer 1

6 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

6 Comments

Your Answer

Sign up or log in

Post as a guest

Related