XML Tree parsing with condition in Python

Question

Here is my XML structure:

<images>
  <image>
    <name>brain tumer</name>
    <location>images/brain_tumer1.jpg</location>
    <annotations>
      <comment>
        <name>Patient 0 Brain Tumer</name>
        <description>
          This is a tumer in the brain
        </description>
      </comment>
    </annotations>
  </image>
  <image>
    <name>brain tumer</name>
    <location>img/brain_tumer2.jpg</location>
    <annotations>
      <comment>
        <name>Patient 1 Brain Tumer</name>
        <description>
          This is a larger tumer in the brain
        </description>
      </comment>
    </annotations>
  </image>
</images>

I am new to Python and wanted to know whether it is possible to retrieve the location data based on the comment's name data. In other words, here is my code:

for itr1 in itemlist:
    commentItemList = itr1.getElementsByTagName('name')

    for itr2 in commentItemList:
        if(itr2.firstChild.nodeValue == "Patient 1 Liver Tumer"):
            commentName = itr2.firstChild.nodeValue
            Loacation = it1.secondChild.nodeValue

Any recommendations, or am I missing something here? Thank you in advance.

alecxe · Accepted Answer · 2014-03-28 18:39:37Z


Parsing XML with minidom isn't fun at all, but here's the idea:

- iterate over all image nodes
- for each node, check the comment/name text
- if the text matches, get the location node's text

This example finds the location for the comment named Patient 1 Brain Tumer:

import xml.dom.minidom

data = """
your xml goes here
"""

dom = xml.dom.minidom.parseString(data)
for image in dom.getElementsByTagName('image'):
    comment = image.getElementsByTagName('comment')[0]
    comment_name_text = comment.getElementsByTagName('name')[0].firstChild.nodeValue
    if comment_name_text == 'Patient 1 Brain Tumer':
        location = image.getElementsByTagName('location')[0]
        print(location.firstChild.nodeValue)

prints:

img/brain_tumer2.jpg
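
For reference, the same idea can be wrapped in a small function and run against the XML from the question. This is only a sketch: find_locations is a hypothetical helper name (not part of minidom), and it assumes every matching image has a non-empty location element.

```python
import xml.dom.minidom

# Sample data copied from the question (two <image> entries).
DATA = """
<images>
  <image>
    <name>brain tumer</name>
    <location>images/brain_tumer1.jpg</location>
    <annotations>
      <comment>
        <name>Patient 0 Brain Tumer</name>
        <description>This is a tumer in the brain</description>
      </comment>
    </annotations>
  </image>
  <image>
    <name>brain tumer</name>
    <location>img/brain_tumer2.jpg</location>
    <annotations>
      <comment>
        <name>Patient 1 Brain Tumer</name>
        <description>This is a larger tumer in the brain</description>
      </comment>
    </annotations>
  </image>
</images>
"""


def find_locations(xml_text, comment_name):
    """Return the <location> text of every <image> whose comment name matches.

    Hypothetical helper for illustration; assumes each matching <image>
    contains a non-empty <location> element.
    """
    dom = xml.dom.minidom.parseString(xml_text.strip())
    matches = []
    for image in dom.getElementsByTagName('image'):
        # An image may carry several comments, or none at all.
        for comment in image.getElementsByTagName('comment'):
            names = comment.getElementsByTagName('name')
            if not names or names[0].firstChild is None:
                continue  # skip comments without a usable <name>
            if names[0].firstChild.nodeValue.strip() == comment_name:
                location = image.getElementsByTagName('location')[0]
                matches.append(location.firstChild.nodeValue.strip())
                break  # one hit per image is enough
    return matches


print(find_locations(DATA, 'Patient 1 Brain Tumer'))
```

Note that the string in the question's code, "Patient 1 Liver Tumer", does not occur anywhere in the sample XML, so searching for it with this helper returns an empty list.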

edited Mar 28, 2014 at 18:39

answered Mar 28, 2014 at 18:30


1 Comment

Mr. Concolato Over a year ago

Impressive! Thank you.

alecxe · Accepted Answer · 2014-03-28 18:36:30Z


Just to compare how simple the two solutions are, here's how you can do the same with lxml:

from lxml import etree

data = """
your xml goes here
"""

root = etree.fromstring(data)
print(root.xpath('//image[.//comment/name = "Patient 1 Brain Tumer"]/location/text()')[0])

prints:

img/brain_tumer2.jpg

Basically, one line vs six.
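
If installing lxml is not an option, a middle ground might be the standard library's xml.etree.ElementTree, which supports only a limited XPath subset. The following is a sketch under that assumption; find_location_et is a hypothetical helper name, and the sample data is a trimmed copy of the question's XML (descriptions omitted).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML (descriptions omitted for brevity).
DATA = """
<images>
  <image>
    <name>brain tumer</name>
    <location>images/brain_tumer1.jpg</location>
    <annotations>
      <comment><name>Patient 0 Brain Tumer</name></comment>
    </annotations>
  </image>
  <image>
    <name>brain tumer</name>
    <location>img/brain_tumer2.jpg</location>
    <annotations>
      <comment><name>Patient 1 Brain Tumer</name></comment>
    </annotations>
  </image>
</images>
"""


def find_location_et(xml_text, comment_name):
    """Return the first matching image's location text, or None.

    Hypothetical helper for illustration, not part of ElementTree.
    """
    root = ET.fromstring(xml_text.strip())
    for image in root.iter('image'):
        # Collect the text of every comment/name under this image.
        names = [n.text.strip()
                 for n in image.findall('.//comment/name') if n.text]
        if comment_name in names:
            # findtext returns the text of the direct <location> child.
            return image.findtext('location')
    return None


print(find_location_et(DATA, 'Patient 1 Brain Tumer'))
```

This takes a few more lines than the lxml one-liner, but it needs no third-party package.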

answered Mar 28, 2014 at 18:36
