1

Hopefully this is a quick answer for those experienced. I have a single XML file that contains a URL in it, I'd like to take the URL from the XML and then input it into a downloader script I've written. My only issue is I can't seem to properly parse JUST the url from the XML. This is what it looks like:

<program new-version="1.1.1.1" name="ProgramName">
<download-url value="http://website.com/file.exe"/>
</program>

Thanks in advance!

2 Answers 2

9
>>> code = '''<program new-version="1.1.1.1" name="ProgramName">
... <download-url value="http://website.com/file.exe"/>
... </program>'''

With lxml:

>>> import lxml.etree
>>> lxml.etree.fromstring(code).xpath('//download-url/@value')[0]
'http://website.com/file.exe'

With the built-in xml.etree.ElementTree:

>>> import xml.etree.ElementTree
>>> doc = xml.etree.ElementTree.fromstring(code)
>>> doc.find('.//download-url').attrib['value']
'http://website.com/file.exe'

With the built-in xml.dom.minidom:

>>> import xml.dom.minidom
>>> doc = xml.dom.minidom.parseString(code)
>>> doc.getElementsByTagName('download-url')[0].getAttribute('value')
u'http://website.com/file.exe'

Which one you pick is entirely up to you. lxml needs to be installed, but is the fastest and most feature-rich library. xml.etree.ElementTree has a funky interface, and its XPath support is limited (depends on the version of the python standard library). xml.dom.minidom does not support xpath and tends to be slower, but implements the cross-plattform DOM.

Sign up to request clarification or add additional context in comments.

Comments

1
 import lxml
 from lxml import etree
 et = etree.parse("your xml file or url")
 value = et.xpath('//download-url/@value')
 print "".join(value)

output = 'http://website.com/file.exe'

you can also use cssselect

 f = open("your xml file",'r')
 values = f.readlines()
 values = "".join(values)
 import lxml.html
 doc = lxml.html.fromstring(values)
 elements = doc.cssselect('document program download-url') //csspath using firebug
 elements[0].get('value')

output = 'http://website.com/file.exe'

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.