I am stuck on a project that uses Python, which I am very new to. I have been told to use ElementTree to extract specific data from an incoming XML file. It sounds simple, but I am not an experienced programmer. Below is a (very!) tiny example of an incoming file, along with the code I am trying to use.

I would appreciate any tips or pointers on where to go next. I have searched and tried following what other people have done, but I can't seem to get the same results. My aim is to get the information contained in the "Active", "Room" and "Direction" elements, but later on I will need to get much more.

I have tried using XPath, but it does not work well, especially with the namespaces the XML uses, and an XPath expression for everything I need would become too large. I have simplified the example so that I can understand the principle; after this it must be extended to gather more information from an "AssetEquipment" element and to handle multiple instances of them. The end goal is for all the information from one piece of equipment to be saved to a dictionary so I can manipulate it later, with each new piece of equipment in its own separate dictionary.

Example XML:

<AssetData>
<Equipment>
    <AssetEquipment ID="3" name="PC960">
        <Active>Yes</Active>
        <Location>
            <RoomLocation>
                <Room>23</Room>
                <Area>
                    <X-Area>-1</X-Area>
                    <Y-Area>2.4</Y-Area>
                </Area>
            </RoomLocation>
        </Location>
        <Direction>Positive</Direction>
        <AssetSupport>12</AssetSupport>
    </AssetEquipment>
</Equipment>
</AssetData>

Example Code:

import re
import xml.etree.ElementTree as ET

tree = ET.parse(r'C:\Temp\Example.xml')
root = tree.getroot()

ns = "{http://namespace.co.uk}"

for equipment in root.findall(ns + "Equipment//"):
    tagname = re.sub(r'\{.*?\}','',equipment.tag)
    name = equipment.get('name')

    if tagname == 'AssetEquipment':
        print "\tName: " + repr(name)
        for attributes in root.findall(ns + "Equipment/" + ns + "AssetEquipment//"):
            attname = re.sub(r'\{.*?\}','',attributes.tag)
            if tagname == 'Room': #This does not work but I need it to be found while
                                  #in this instance of "AssetEquipment" so it does not
                                  #call information from another asset instead.
                room = equipment.text
                print "\t\tRoom:", repr(room)
  • How about xmltodict? Commented Nov 24, 2012 at 21:28

1 Answer

import xml.etree.ElementTree as ET

# Namespace in Clark notation. It must match the xmlns declared in the XML
# file exactly (note that the question's code uses http://namespace.co.uk).
NS = '{http://www.namespace.co.uk}'

tree = ET.parse('test.xml')
for elem in tree.iter(NS + 'AssetEquipment'):
    output = {}  # one dictionary per AssetEquipment
    for elem1 in elem:
        if elem1.tag == NS + 'Active':
            output['Active'] = elem1.text
        elif elem1.tag == NS + 'Direction':
            output['Direction'] = elem1.text
        elif elem1.tag == NS + 'Location':
            # Room is nested under Location/RoomLocation
            for elem2 in elem1:
                if elem2.tag == NS + 'RoomLocation':
                    for elem3 in elem2:
                        if elem3.tag == NS + 'Room':
                            output['Room'] = elem3.text
    print(output)
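
As the number of fields grows, the nested loops above get long. A shorter variant, offered only as a sketch, passes a prefix-to-URI mapping to ElementTree's find methods and searches relative to each AssetEquipment, so a value can never come from a different asset. It assumes the real file declares http://namespace.co.uk (the URI from the question's code) as its default namespace. The prefix "a", the SAMPLE string, the second asset and the read_assets helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the real file declares this namespace on <AssetData>.
# The URI is copied from the question's code; change it to match your file.
NS = {"a": "http://namespace.co.uk"}

# Hypothetical sample mirroring the question's XML, with the namespace added
# and a second asset so that the per-asset scoping is visible.
SAMPLE = """<AssetData xmlns="http://namespace.co.uk">
<Equipment>
    <AssetEquipment ID="3" name="PC960">
        <Active>Yes</Active>
        <Location><RoomLocation><Room>23</Room></RoomLocation></Location>
        <Direction>Positive</Direction>
    </AssetEquipment>
    <AssetEquipment ID="4" name="PC961">
        <Active>No</Active>
        <Location><RoomLocation><Room>7</Room></RoomLocation></Location>
        <Direction>Negative</Direction>
    </AssetEquipment>
</Equipment>
</AssetData>"""


def read_assets(root):
    """Return one dict per AssetEquipment element found under root."""
    assets = []
    for equipment in root.findall("a:Equipment/a:AssetEquipment", NS):
        assets.append({
            "name": equipment.get("name"),
            "Active": equipment.findtext("a:Active", namespaces=NS),
            # ".//" searches only the descendants of this equipment element.
            "Room": equipment.findtext(".//a:Room", namespaces=NS),
            "Direction": equipment.findtext("a:Direction", namespaces=NS),
        })
    return assets


assets = read_assets(ET.fromstring(SAMPLE))
for asset in assets:
    print(asset)
```

For a file on disk, `read_assets(ET.parse('test.xml').getroot())` should behave the same way. A field that is missing from an asset comes back as None rather than raising an error.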