Reading XML file and fetching its attributes value in Python

Question

I have this XML file:

<domain type='kmc' id='007'>
  <name>virtual bug</name>
  <uuid>66523dfdf555dfd</uuid>
  <os>
    <type arch='xintel' machine='ubuntu'>hvm</type>
    <boot dev='hd'/>
    <boot dev='cdrom'/>
  </os>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>270336</currentMemory>
  <vcpu placement='static'>10</vcpu>

Now I want to parse this and fetch its attribute values. For instance, I want to fetch the uuid field. What is the proper way to fetch it in Python?

What have you tried? Googling "python xml" yields quite a few really useful results that should point you in the right direction.
– Blender, Commented Sep 5, 2012 at 21:34
There are a lot of examples, but none pointing in the direction I want to go. I want to fetch attribute values. The examples I am seeing convert to an XML file or convert from an XML file.
– S.Ali, Commented Sep 5, 2012 at 21:36

Alan W. Smith · Accepted Answer · 2017-06-29 15:56:31Z

30

Here's an lxml snippet that extracts an attribute as well as element text (your question was a little ambiguous about which one you needed, so I'm including both):

from lxml import etree
doc = etree.parse(filename)

memoryElem = doc.find('memory')
print(memoryElem.text)        # element text
print(memoryElem.get('unit')) # attribute

You asked (in a comment on Ali Afshar's answer) whether minidom (2.x, 3.x) is a good alternative. Here's the equivalent code using minidom; judge for yourself which is nicer:

import xml.dom.minidom as minidom
doc = minidom.parse(filename)

memoryElem = doc.getElementsByTagName('memory')[0]
print(''.join(node.data for node in memoryElem.childNodes))  # element text
print(memoryElem.getAttribute('unit'))                       # attribute

lxml seems like the winner to me.
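
For readers without lxml installed, the same two lookups can be sketched with the standard library's xml.etree.ElementTree. This is a minimal sketch, not part of the original answer: the DOMAIN_XML string and the get_text_and_attr helper are hypothetical names introduced for illustration, and the sample includes the closing </domain> tag that the question's snippet lacks. Because find() returns None when no direct child matches, the helper checks for that before touching .text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a shortened copy of the question's XML, with the closing tag added.
DOMAIN_XML = """<domain type='kmc' id='007'>
  <name>virtual bug</name>
  <uuid>66523dfdf555dfd</uuid>
  <memory unit='KiB'>524288</memory>
</domain>"""


def get_text_and_attr(root, tag, attr=None):
    """Return (text, attribute value) for the first direct child named `tag`."""
    elem = root.find(tag)  # None if there is no such direct child
    if elem is None:
        return None, None
    value = elem.get(attr) if attr is not None else None
    return elem.text, value


root = ET.fromstring(DOMAIN_XML)
print(get_text_and_attr(root, "uuid"))            # element text only
print(get_text_and_attr(root, "memory", "unit"))  # text plus the unit attribute
print(root.get("type"))                           # attribute on the root element
```

To read from a file instead of a string, ET.parse(filename).getroot() yields the same kind of root element.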

edited Jun 29, 2017 at 15:56

Alan W. Smith

answered Sep 6, 2012 at 4:55

ron rothman

2 Comments

Stevoisiak Over a year ago

This method is also compatible with xml.etree.ElementTree, which is included with Python 2 and 3.

yesIamFaded Over a year ago

The first try with lxml does not work for me ... when I want to access .text it says that 'NoneType' object has no attribute 'text'.

d.danailov · Accepted Answer · 2013-07-12 19:54:27Z

13

XML

<data>
    <items>
        <item name="item1">item1</item>
        <item name="item2">item2</item>
        <item name="item3">item3</item>
        <item name="item4">item4</item>
    </items>
</data>

Python :

from xml.dom import minidom
xmldoc = minidom.parse('items.xml')
itemlist = xmldoc.getElementsByTagName('item')
print("Len : ", len(itemlist))
print("Attribute Name : ", itemlist[0].attributes['name'].value)
print("Text : ", itemlist[0].firstChild.nodeValue)
for s in itemlist:
    print("Attribute Name : ", s.attributes['name'].value)
    print("Text : ", s.firstChild.nodeValue)

answered Jul 12, 2013 at 19:54

d.danailov

Ali Afshar · Accepted Answer · 2012-09-05 21:34:50Z

3

etree, probably with lxml:

from lxml import etree

root = etree.XML(MY_XML)  # MY_XML holds the document as a string
uuid = root.find('uuid')
print(uuid.text)

answered Sep 5, 2012 at 21:34

Ali Afshar

Wai Yip Tung · Accepted Answer · 2012-09-05 21:38:09Z

0

Other people can tell you how to do it with the Python standard library. I'd recommend my own mini-library, which makes this completely straightforward.

>>> obj = xml2obj.xml2obj("""<domain type='kmc' id='007'>
... <name>virtual bug</name>
... <uuid>66523dfdf555dfd</uuid>
... <os>
... <type arch='xintel' machine='ubuntu'>hvm</type>
... <boot dev='hd'/>
... <boot dev='cdrom'/>
... </os>
... <memory unit='KiB'>524288</memory>
... <currentMemory unit='KiB'>270336</currentMemory>
... <vcpu placement='static'>10</vcpu>
... </domain>""")
>>> obj.uuid
u'66523dfdf555dfd'

http://code.activestate.com/recipes/534109-xml-to-python-data-structure/

answered Sep 5, 2012 at 21:38

Wai Yip Tung

Mike Pennington · Accepted Answer · 2012-09-29 12:02:03Z

0

I would use lxml and select the element with the XPath expression //uuid
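
A sketch of the XPath approach, with two caveats. XPath is case-sensitive, so the expression has to match the lowercase uuid tag. Also, the code below uses the standard library's ElementTree, which supports only a subset of XPath (.//uuid rather than //uuid); with lxml, doc.xpath('//uuid') would be the direct equivalent. The DOMAIN_XML string is a shortened, hypothetical stand-in for the question's file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a shortened copy of the question's XML.
DOMAIN_XML = """<domain type='kmc' id='007'>
  <uuid>66523dfdf555dfd</uuid>
  <os>
    <boot dev='hd'/>
    <boot dev='cdrom'/>
  </os>
</domain>"""

root = ET.fromstring(DOMAIN_XML)

# './/uuid' matches uuid elements at any depth below the root.
uuids = [elem.text for elem in root.findall(".//uuid")]
print(uuids)

# Attribute values can be collected the same way.
boot_devices = [boot.get("dev") for boot in root.findall(".//boot")]
print(boot_devices)

# Tag names are case-sensitive: an uppercase query finds nothing.
print(root.findall(".//UUID"))
```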

edited Sep 29, 2012 at 12:02

Mike Pennington

answered Sep 5, 2012 at 21:35

Paul J. Warner

Stephen Rauch · Accepted Answer · 2017-12-21 07:00:21Z

0

The XML above does not have a closing tag, so parsing it will give:

etree parse error: Premature end of data in tag

Correct XML is:

<domain type='kmc' id='007'>
  <name>virtual bug</name>
  <uuid>66523dfdf555dfd</uuid>
  <os>
    <type arch='xintel' machine='ubuntu'>hvm</type>
    <boot dev='hd'/>
    <boot dev='cdrom'/>
  </os>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>270336</currentMemory>
  <vcpu placement='static'>10</vcpu>
</domain>
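
The failure can be reproduced with the standard library as well. The sketch below is an illustration rather than part of the original answer: it feeds a truncated and a complete document (both hypothetical, shortened strings) to xml.etree.ElementTree, which raises ParseError for the unclosed one. The exact error message differs from lxml's.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened inputs: one missing </domain>, one well-formed.
TRUNCATED = "<domain type='kmc' id='007'><uuid>66523dfdf555dfd</uuid>"
COMPLETE = TRUNCATED + "</domain>"


def try_parse(text):
    """Return the root element, or None if the XML is not well-formed."""
    try:
        return ET.fromstring(text)
    except ET.ParseError as exc:
        print("parse failed:", exc)
        return None


print(try_parse(TRUNCATED))                  # None: <domain> is never closed
print(try_parse(COMPLETE).findtext("uuid"))  # text of the uuid element
```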

edited Dec 21, 2017 at 7:00

Stephen Rauch♦

answered Dec 21, 2017 at 6:40

Vaibhav Awasthi

Edmond Sylar · Accepted Answer · 2019-01-17 01:15:14Z

0

You can try parsing it with recover=True. You can do something like this:

parser = etree.XMLParser(recover=True)
tree = etree.parse('your xml file', parser)

I used this recently and it worked for me. You can try it and see, but in case you need to do any more complicated XML data extraction, you can take a look at this code I wrote for a project handling complex XML data extractions.

answered Jan 17, 2019 at 1:15

Edmond Sylar
