I have this XML file:

<?xml version="1.0" ?><XMLSchemaPalletLoadTechData xmlns="http://tempuri.org/XMLSchemaPalletLoadTechData.xsd">
  <TechDataParams>
    <RunNumber>sample</RunNumber>
    <Holder>sample</Holder>
    <ProcessToolName>sample</ProcessToolName>
    <RecipeName>sample</RecipeName>
    <PalletName>sample</PalletName>
    <PalletPosition>sample</PalletPosition>
    <IsControl>sample</IsControl>
    <LoadPosition>sample</LoadPosition>
    <HolderJob>sample</HolderJob>
    <IsSPC>sample</IsSPC>
    <MeasurementType>sample</MeasurementType>
  </TechDataParams>
  <TechDataParams>
    <RunNumber>sample</RunNumber>
    <Holder>sample</Holder>
    <ProcessToolName>sample</ProcessToolName>
    <RecipeName>sample</RecipeName>
    <PalletName>sample</PalletName>
    <PalletPosition>sample</PalletPosition>
    <IsControl>sample</IsControl>
    <LoadPosition>sample</LoadPosition>
    <HolderJob>sample</HolderJob>
    <IsSPC>sample</IsSPC>
    <MeasurementType>XRF</MeasurementType>
  </TechDataParams>
</XMLSchemaPalletLoadTechData>

And this is my code for parsing the XML:

for data in xml.getElementsByTagName('TechDataParams'):
    #parse xml
    runnum=data.getElementsByTagName('RunNumber')[0].firstChild.nodeValue
    hold=data.getElementsByTagName('Holder')[0].firstChild.nodeValue
    processtn=data.getElementsByTagName('ProcessToolName')[0].firstChild.nodeValue
    recipedata=data.getElementsByTagName('RecipeName')[0].firstChild.nodeValue
    palletna=data.getElementsByTagName('PalletName')[0].firstChild.nodeValue
    palletposi=data.getElementsByTagName('PalletPosition')[0].firstChild.nodeValue
    control = data.getElementsByTagName('IsControl')[0].firstChild.nodeValue
    loadpos=data.getElementsByTagName('LoadPosition')[0].firstChild.nodeValue
    holderjob=data.getElementsByTagName('HolderJob')[0].firstChild.nodeValue
    spc = data.getElementsByTagName('IsSPC')[0].firstChild.nodeValue
    mestype = data.getElementsByTagName('MeasurementType')[0].firstChild.nodeValue

But when I print each node, I only get one set of 'TechDataParams' values. I want to get every 'TechDataParams' element from the XML.

Let me know if my question is a bit unclear.

3 Answers

Please don't dive into parsing XML with minidom unless you want to pull your hair out.

I would use the xmltodict module here. One line gives you a list of dicts with all the data you need:

import xmltodict

data = """your xml here"""

data = xmltodict.parse(data)['XMLSchemaPalletLoadTechData']['TechDataParams']
for params in data:
    print dict(params)

Prints:

{u'PalletPosition': u'sample', u'HolderJob': u'sample', u'RunNumber': u'sample', u'ProcessToolName': u'sample', u'RecipeName': u'sample', u'IsControl': u'sample', u'PalletName': u'sample', u'LoadPosition': u'sample', u'MeasurementType': u'sample', u'Holder': u'sample', u'IsSPC': u'sample'}
{u'PalletPosition': u'sample', u'HolderJob': u'sample', u'RunNumber': u'sample', u'ProcessToolName': u'sample', u'RecipeName': u'sample', u'IsControl': u'sample', u'PalletName': u'sample', u'LoadPosition': u'sample', u'MeasurementType': u'XRF', u'Holder': u'sample', u'IsSPC': u'sample'}

Here is an example. Replace file_path with the path to your own file.

I replaced the RunNumber values with 001 and 002 so the two elements can be told apart.

#!/usr/bin/python
# -*- coding: utf-8 -*-

from xml.dom import minidom

file_path = 'C:\\temp\\test.xml'

doc = minidom.parse(file_path)
TechDataParams = doc.getElementsByTagName('TechDataParams')
for t in TechDataParams:
    num = t.getElementsByTagName('RunNumber')[0]
    print 'num is ', num.firstChild.data

OUTPUT:

num is  001
num is  002
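
The same loop can collect every child value instead of only RunNumber. Below is a minimal sketch, assuming Python 3 and a shortened stand-in for the question's XML; the helper name read_tech_data_params is hypothetical. The point it illustrates is that each record has to be stored (or printed) inside the loop, because plain variables assigned in the loop body are overwritten on every pass and only the last TechDataParams remains afterwards.

```python
from xml.dom import minidom

# Shortened stand-in for the question's XML (only two child tags kept).
XML_TEXT = """<?xml version="1.0" ?>
<XMLSchemaPalletLoadTechData xmlns="http://tempuri.org/XMLSchemaPalletLoadTechData.xsd">
  <TechDataParams>
    <RunNumber>001</RunNumber>
    <MeasurementType>sample</MeasurementType>
  </TechDataParams>
  <TechDataParams>
    <RunNumber>002</RunNumber>
    <MeasurementType>XRF</MeasurementType>
  </TechDataParams>
</XMLSchemaPalletLoadTechData>"""


def read_tech_data_params(xml_text):
    """Return one dict per TechDataParams element (hypothetical helper)."""
    doc = minidom.parseString(xml_text)
    records = []
    for params in doc.getElementsByTagName('TechDataParams'):
        record = {}
        for child in params.childNodes:
            # Skip the whitespace text nodes between the child elements.
            if child.nodeType != child.ELEMENT_NODE:
                continue
            # An empty element has no firstChild, so fall back to ''.
            record[child.tagName] = child.firstChild.data if child.firstChild else ''
        # Append inside the loop so every element is kept, not just the last.
        records.append(record)
    return records


records = read_tech_data_params(XML_TEXT)
for record in records:
    print(record)
```

Building a list of dicts also avoids writing one getElementsByTagName call per field, at the cost of losing the explicit variable names.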

1 Comment

thank you! i will also try this method and see what works best!

You can also do this with the lxml.etree module.

  1. The input uses a default namespace, i.e. http://tempuri.org/XMLSchemaPalletLoadTechData.xsd, so the XPath has to map a prefix to it.
  2. Use the xpath method to find the target TechDataParams tags.
  3. Iterate over the children of each TechDataParams tag and build a dictionary whose keys are the tag names and whose values are the tag text.
  4. Append each dictionary to a list variable, TechDataParams_info.

code:

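from lxml import etree

# content holds the XML document from the question
root = etree.fromstring(content)
ns = {"a": "http://tempuri.org/XMLSchemaPalletLoadTechData.xsd"}
TechDataParams_info = []
for i in root.xpath("//a:XMLSchemaPalletLoadTechData/a:TechDataParams", namespaces=ns):
    temp = dict()
    # iterate over the children directly; getchildren() is deprecated in lxml
    for j in i:
        # strip the "{namespace}" prefix from the tag name
        temp[j.tag.split("}", 1)[-1]] = j.text
    TechDataParams_info.append(temp)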

print TechDataParams_info

output:

[{'PalletPosition': 'sample', 'HolderJob': 'sample', 'RunNumber': 'sample', 'ProcessToolName': 'sample', 'RecipeName': 'sample', 'IsControl': 'sample', 'PalletName': 'sample', 'LoadPosition': 'sample', 'MeasurementType': 'sample', 'Holder': 'sample', 'IsSPC': 'sample'}, {'PalletPosition': 'sample', 'HolderJob': 'sample', 'RunNumber': 'sample', 'ProcessToolName': 'sample', 'RecipeName': 'sample', 'IsControl': 'sample', 'PalletName': 'sample', 'LoadPosition': 'sample', 'MeasurementType': 'XRF', 'Holder': 'sample', 'IsSPC': 'sample'}]
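
If installing lxml is not an option, the same four steps can be sketched with the standard library's xml.etree.ElementTree. This is a swapped-in module, not the lxml API used above, and the sketch assumes Python 3 and a shortened stand-in for the question's XML; the function name tech_data_params and the prefix "a" are arbitrary choices made for illustration.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map; the prefix "a" is arbitrary.
NS = {'a': 'http://tempuri.org/XMLSchemaPalletLoadTechData.xsd'}

# Shortened stand-in for the question's XML (only two child tags kept).
XML_TEXT = """<?xml version="1.0" ?>
<XMLSchemaPalletLoadTechData xmlns="http://tempuri.org/XMLSchemaPalletLoadTechData.xsd">
  <TechDataParams>
    <RunNumber>sample</RunNumber>
    <MeasurementType>sample</MeasurementType>
  </TechDataParams>
  <TechDataParams>
    <RunNumber>sample</RunNumber>
    <MeasurementType>XRF</MeasurementType>
  </TechDataParams>
</XMLSchemaPalletLoadTechData>"""


def tech_data_params(xml_text):
    """Return a list with one {tag: text} dict per TechDataParams element."""
    root = ET.fromstring(xml_text)
    result = []
    # root is already XMLSchemaPalletLoadTechData, so search its direct children.
    for params in root.findall('a:TechDataParams', NS):
        # Tags look like "{namespace}RunNumber"; keep only the local name.
        result.append({child.tag.split('}', 1)[-1]: child.text for child in params})
    return result


info = tech_data_params(XML_TEXT)
print(info)
```

ElementTree supports only a subset of XPath, which is enough here because the target elements are direct children of the root.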

1 Comment

thank you! i will also try this method and see what works best!
