I'm using the following XML:

<feed xmlns:im="http://itunes.apple.com/rss" xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<id>
https://itunes.apple.com/IN/rss/topfreeapplications/limit=200/xml
</id>
<title>iTunes Store: Top Free Apps</title>
<updated>2016-12-05T12:37:06-07:00</updated>
<link rel="alternate" type="text/html" href="https://itunes.apple.com/WebObjects/MZStore.woa/wa/viewTop?cc=in&amp;id=134581&amp;popId=27"/>
<link rel="self" href="https://itunes.apple.com/IN/rss/topfreeapplications/limit=200/xml"/>
<icon>http://itunes.apple.com/favicon.ico</icon>
<author>
    <name>iTunes Store</name>
    <uri>http://www.apple.com/uk/itunes/</uri>
</author>
<rights>Copyright 2008 Apple Inc.</rights>
<entry>
    <updated>2016-12-05T12:37:06-07:00</updated>
<id im:id="473941634" im:bundleId="com.one97.paytm">https://itunes.apple.com/in/app/recharge-bill-payment-wallet/id473941634?mt=8&amp;uo=2</id>
<title>Recharge, Bill Payment &amp; Wallet - Paytm Mobile Solutions</title>
<summary></summary>
<im:name>Recharge, Bill Payment &amp; Wallet</im:name>
<link rel="alternate" type="text/html" href="https://itunes.apple.com/in/app/recharge-bill-payment-wallet/id473941634?mt=8&amp;uo=2"/>
<im:contentType term="Application" label="Application"/>
<category im:id="6024" term="Shopping" scheme="https://itunes.apple.com/in/genre/ios-shopping/id6024?mt=8&amp;uo=2" label="Shopping"/>
<im:artist href="https://itunes.apple.com/in/developer/paytm-mobile-solutions/id473941637?mt=8&amp;uo=2">Paytm Mobile Solutions</im:artist>
<im:price amount="0.00000" currency="INR">Get</im:price>
<im:image height="53">http://is1.mzstatic.com/image/thumb/Purple71/v4/9b/37/bf/9b37bf75-6b4d-9c95-a8a4-ea369f05ae7e/pr_source.png/53x53bb-85.png</im:image>
<im:image height="75">http://is5.mzstatic.com/image/thumb/Purple71/v4/9b/37/bf/9b37bf75-6b4d-9c95-a8a4-ea369f05ae7e/pr_source.png/75x75bb-85.png</im:image>
<im:image height="100">http://is5.mzstatic.com/image/thumb/Purple71/v4/9b/37/bf/9b37bf75-6b4d-9c95-a8a4-ea369f05ae7e/pr_source.png/100x100bb-85.png</im:image>
<rights>© One97 Communications Ltd</rights>
<im:releaseDate label="24 October 2011">2011-10-24T16:18:48-07:00</im:releaseDate>
<content type="html"></content>
</entry>
</feed>

I would like to extract the "im:id" attribute from the id element of each entry.

from xml.dom import minidom
xmldoc = minidom.parse('topIN.xml')
itemlist = xmldoc.getElementsByTagName('link')
print(len(itemlist))
print(itemlist[0].attributes.keys())

This prints: 1 [u'href', u'type', u'rel']

But when I do the same for id, nothing is returned.

2 Answers

Here is a version using xml.etree.ElementTree:

import xml.etree.ElementTree as ET

tree = ET.parse('topIN.xml')
root = tree.getroot()
ns = {'im': "http://itunes.apple.com/rss", 'atom': "http://www.w3.org/2005/Atom"}
for id_ in root.findall('atom:entry/atom:id', ns):
    # Namespaced attributes are keyed as '{namespace-uri}localname'
    print(id_.attrib['{' + ns['im'] + '}id'])

Here is a version using lxml:

from lxml import etree
root = etree.parse('topIN.xml')
ns = {'im': "http://itunes.apple.com/rss", 'atom': "http://www.w3.org/2005/Atom"}
print('\n'.join(root.xpath('atom:entry/atom:id/@im:id', namespaces=ns)))
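
A possible explanation for the empty result in the question, assuming the same minidom code was run with 'id' in place of 'link': getElementsByTagName('id') returns elements in document order, so itemlist[0] would be the feed-level id, which carries no attributes. The sketch below uses a trimmed inline copy of the feed (the SAMPLE string and the helper name attribute_names_per_id are made up for illustration) and lists the attribute names found on every id element.

```python
from xml.dom import minidom

# Trimmed, hypothetical copy of the feed from the question.
SAMPLE = """<feed xmlns:im="http://itunes.apple.com/rss" xmlns="http://www.w3.org/2005/Atom">
<id>https://itunes.apple.com/IN/rss/topfreeapplications/limit=200/xml</id>
<entry>
<id im:id="473941634" im:bundleId="com.one97.paytm">https://itunes.apple.com/in/app/id473941634</id>
</entry>
</feed>"""


def attribute_names_per_id(xml_text):
    """Return the sorted attribute names of every <id> element, in document order."""
    doc = minidom.parseString(xml_text)
    return [sorted(node.attributes.keys())
            for node in doc.getElementsByTagName("id")]


# The first list belongs to the feed-level <id>, the second to the entry's <id>.
for names in attribute_names_per_id(SAMPLE):
    print(names)
```

If that is what happened, indexing itemlist[1], or starting from the entry elements, should reach the id that actually holds im:id.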


This worked:

from xml.dom import minidom
xmldoc = minidom.parse('topIN.xml')
itemlist = xmldoc.getElementsByTagName('entry')
print(len(itemlist))
for s in itemlist:
    # The first id inside each entry carries the im:id attribute
    print(s.getElementsByTagName('id')[0].attributes['im:id'].value)
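
The lookup by 'im:id' depends on the feed using the literal prefix im. As a hedged alternative, minidom also offers namespace-aware calls (getElementsByTagNameNS, hasAttributeNS, getAttributeNS) that match on the namespace URI instead. This is only a sketch: the inline SAMPLE string and the helper name entry_app_ids are assumptions for illustration, not part of the original answer.

```python
from xml.dom import minidom

ATOM_NS = "http://www.w3.org/2005/Atom"
IM_NS = "http://itunes.apple.com/rss"

# Trimmed, hypothetical copy of the feed from the question.
SAMPLE = """<feed xmlns:im="http://itunes.apple.com/rss" xmlns="http://www.w3.org/2005/Atom">
<id>https://itunes.apple.com/IN/rss/topfreeapplications/limit=200/xml</id>
<entry>
<id im:id="473941634" im:bundleId="com.one97.paytm">https://itunes.apple.com/in/app/id473941634</id>
</entry>
</feed>"""


def entry_app_ids(xml_text):
    """Collect the im:id attribute of the first <id> in each <entry>."""
    doc = minidom.parseString(xml_text)
    ids = []
    for entry in doc.getElementsByTagNameNS(ATOM_NS, "entry"):
        id_nodes = entry.getElementsByTagNameNS(ATOM_NS, "id")
        # Skip entries without an <id> or without the namespaced attribute.
        if id_nodes and id_nodes[0].hasAttributeNS(IM_NS, "id"):
            ids.append(id_nodes[0].getAttributeNS(IM_NS, "id"))
    return ids


print(entry_app_ids(SAMPLE))
```

To run it against the file from the question, minidom.parse('topIN.xml') could replace the parseString call.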
