<attribute>
  <name>Index</name>
  <values>
   <zip>
     <value>323800</value>
   </zip>
   <nation>
     <value>195300</value>
   </nation>
  </values>
</attribute>
<attribute>
 <name>Value_1</name>
 <values>
  <nation>
   <value>193800</value>
  </nation>
 </values>
</attribute>
<attribute>
 <name>Value_2</name>
 <values>
  <zip>
   <value>1000</value>
  </zip>
  <nation>
   <value>2000</value>
  </nation>
 </values>
</attribute>

Above is an extract from a larger XML tree I am working with. I want to create a dictionary where the text of each name tag is the key and the corresponding zip/value text is the value. How can I write code that grabs only the attribute names for which a zip value exists, and disregards those that have only a nation value?

My code:

   import urllib2
   import xml.etree.ElementTree as ET
   tree = ET.parse(urllib2.urlopen("http://www.sample_xml.com"))
   # creating list of names
   names = [node.text for node in tree.findall('.//attribute/name')]
   zip_values = [node.text for node in tree.findall('.//zip/value')]

From here I would combine the two lists into a dictionary. But right now the lists I get look like this, and the keys and values do not line up:

   names = ['Index', 'Value_1', 'Value_2']
   zip_values = ['323800', '1000']

Really what I need is

   my_dict = {'Index':'323800', 'Value_2':'1000'}

But what I get with my code is below. Is there a way to work around this?

   my_dict = {'Index':'323800', 'Value_1':'1000', 'Value_2':'Na'}

1 Answer

   import urllib2
   from lxml import etree

   root = etree.fromstring(urllib2.urlopen("http://www.sample_xml.com").read())

   d = {}
   # Select only the <attribute> elements that have a values/zip/value descendant
   for attribute_node in root.xpath('//attribute[./values/zip/value]'):
       # Read the name and the zip value relative to the same attribute
       name = attribute_node.xpath('./name')[0].text
       d[name] = attribute_node.xpath('./values/zip/value')[0].text

   print d  # {'Index': '323800', 'Value_2': '1000'}

2 Comments

Could you please explain how exactly this code works and what is the logic behind it? Would be very helpful. Thanks!
Everything is in the XPath query: the predicate [./values/zip/value] selects only the attribute elements that contain a ./values/zip/value descendant.
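
To spell that logic out step by step, here is a minimal sketch that uses only the standard library's xml.etree.ElementTree on Python 3 (an assumption; the answer uses lxml on Python 2). The `<root>` wrapper, the `SAMPLE` string, and the function name `zip_values_by_name` are hypothetical, introduced only so the extract from the question parses on its own. The explicit `None` check plays the role of the XPath predicate, and looking up both children relative to the same attribute element is what keeps each name paired with its own zip value.

```python
import xml.etree.ElementTree as ET

# The question's extract, wrapped in a hypothetical <root> element so that it
# parses as a single document (the real tree's root is not shown).
SAMPLE = """
<root>
  <attribute>
    <name>Index</name>
    <values>
      <zip><value>323800</value></zip>
      <nation><value>195300</value></nation>
    </values>
  </attribute>
  <attribute>
    <name>Value_1</name>
    <values>
      <nation><value>193800</value></nation>
    </values>
  </attribute>
  <attribute>
    <name>Value_2</name>
    <values>
      <zip><value>1000</value></zip>
      <nation><value>2000</value></nation>
    </values>
  </attribute>
</root>
"""


def zip_values_by_name(xml_text):
    """Map each attribute's name to its zip value, skipping attributes without one."""
    root = ET.fromstring(xml_text)
    result = {}
    # iter() visits every <attribute> element at any depth, like //attribute.
    for attribute in root.iter("attribute"):
        # Both lookups are relative to this attribute, so they stay paired.
        name = attribute.find("name")
        zip_value = attribute.find("values/zip/value")
        # find() returns None when the path is absent; this check does the
        # job of the [./values/zip/value] predicate.
        if name is not None and zip_value is not None:
            result[name.text] = zip_value.text
    return result


print(zip_values_by_name(SAMPLE))  # {'Index': '323800', 'Value_2': '1000'}
```

Value_1 is skipped because its values element has no zip child, which is exactly the case where pairing two flat lists goes wrong.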
