Having problems getting data out of an xml element in Python

Question

I am parsing XML output produced by another program.

Here's an example XML fragment:

<result test="Passed" stamp="2011-01-25T12:40:46.166-08:00">
        <assertion>MultipleTestTool1</assertion>
        <comment>MultipleTestTool1 Passed</comment>
      </result>

I want to get the data out of the <comment> element.

Here is my code snippet:

import xml.dom.minidom

mydata.cnodes = mydata.rnode.getElementsByTagName("comment")
value = self.getResultCommentText(mydata.cnodes)

def getResultCommentText(self, nodelist):
    rc = []
    for node in nodelist:
        if node.nodeName == "comment":
            if node.nodeType == node.TEXT_NODE:
                rc.append(node.data)
    return ''.join(rc)

value is always empty, and it appears that the nodeType is always ELEMENT_NODE, so .data doesn't exist. I am new to Python, and this has me scratching my head. Can anyone tell me what I'm doing wrong?

John Machin · Accepted Answer · 2011-01-25 21:46:56Z


Try ElementTree instead of minidom:

>>> import xml.etree.ElementTree as et  # cElementTree was removed in Python 3.9
>>> data = """
... <result test="Passed" stamp="2011-01-25T12:40:46.166-08:00">
...         <assertion>MultipleTestTool1</assertion>
...         <comment>MultipleTestTool1 Passed</comment>
...       </result>
... """
>>> root = et.fromstring(data)
>>> root.tag
'result'
>>> root[0].tag
'assertion'
>>> root[1].tag
'comment'
>>> root[1].text
'MultipleTestTool1 Passed'
>>> root.findtext('comment')
'MultipleTestTool1 Passed'
>>>
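
Outside an interactive session, the same lookup could be sketched as a small script. The <results> wrapper element, the second <result>, and the comment_texts helper are assumptions invented for illustration; only the first <result> comes from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the <results> wrapper and the second <result>
# are made up for this example.
DATA = """
<results>
  <result test="Passed" stamp="2011-01-25T12:40:46.166-08:00">
    <assertion>MultipleTestTool1</assertion>
    <comment>MultipleTestTool1 Passed</comment>
  </result>
  <result test="Failed" stamp="2011-01-25T12:41:02.000-08:00">
    <assertion>MultipleTestTool2</assertion>
  </result>
</results>
"""


def comment_texts(xml_text):
    """Return the <comment> text of each <result>, or '' when it has none."""
    root = ET.fromstring(xml_text)
    return [
        result.findtext("comment", default="")
        for result in root.iter("result")
    ]


print(comment_texts(DATA))
```

Passing default="" to findtext means a <result> without a <comment> yields an empty string instead of None, which keeps the returned list uniform.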

edited Jan 25, 2011 at 21:46

answered Jan 25, 2011 at 21:34


virhilo · Accepted Answer · 2011-01-25 21:29:32Z


Here you are, using lxml and an XPath query:

>>> from lxml import etree
>>> result = """
... <result test="Passed" stamp="2011-01-25T12:40:46.166-08:00">
...         <assertion>MultipleTestTool1</assertion>
...         <comment>MultipleTestTool1 Passed</comment>
...       </result>
... """
>>> xml = etree.fromstring(result)
>>> xml.xpath('//comment/text()')
['MultipleTestTool1 Passed']
>>>

answered Jan 25, 2011 at 21:29


cledoux · Accepted Answer · 2011-01-25 21:46:47Z

Continuing to use minidom, I've modified your code snippet to indicate the method required:

import xml.dom.minidom

mydata.cnodes = mydata.rnode.getElementsByTagName("comment")
value = self.getResultCommentText(mydata.cnodes)

def getResultCommentText(self, nodelist):
    rc = []
    for node in nodelist:
        # Since the node list was created by getElementsByTagName("comment"),
        # every node in this list is a <comment> element.
        #
        # The text data required is a child of the current node.
        for child in node.childNodes:
            # If the child is a text node, append its data
            if child.nodeType == child.TEXT_NODE:
                rc.append(child.data)
    return ''.join(rc)

Basically, the text you want is held in a text node that is a child of the <comment> element. getElementsByTagName returns the element itself, so you first have to step down to its child text node and only then read .data.
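
A self-contained version of the same idea can be run on its own. This is only a sketch: the get_comment_text name and the use of parseString on the fragment from the question are assumptions, since the original code reads its nodes from mydata.rnode.

```python
import xml.dom.minidom

FRAGMENT = """<result test="Passed" stamp="2011-01-25T12:40:46.166-08:00">
        <assertion>MultipleTestTool1</assertion>
        <comment>MultipleTestTool1 Passed</comment>
      </result>"""


def get_comment_text(result_node):
    """Join the text of every <comment> element found under result_node."""
    parts = []
    for comment in result_node.getElementsByTagName("comment"):
        # Each <comment> is an ELEMENT_NODE; its text lives in child TEXT_NODEs.
        for child in comment.childNodes:
            if child.nodeType == child.TEXT_NODE:
                parts.append(child.data)
    return "".join(parts)


doc = xml.dom.minidom.parseString(FRAGMENT)
comment = doc.getElementsByTagName("comment")[0]

# The element itself is not a text node, which is why the question's
# check against TEXT_NODE never matched.
print(comment.nodeType == comment.ELEMENT_NODE)
print(get_comment_text(doc.documentElement))
```

When the element is known to contain nothing but text, comment.firstChild.data is a shorter alternative, though the loop above also copes with empty elements and with text split across several nodes.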
