Many thanks for reading. I apologise for asking such a beginner question, to which I am sure there is a simple answer. Any guidance is much appreciated.

I have an XML file which I am parsing with ElementTree. Its elements look like this:

data.xml:
<?xml version="1.0" encoding="utf-8"?><listings><listing id="26496000" dateFirstListed="2012-10-13" dateLastListed="2013-10-06" market="SALE" propertyType="DETACHED" bedrooms="4" latestAskingPrice="314950"><address key="u935d·0" udprn="50812465" line1="12 Millcroft" line2="Millhouse Green" town="SHEFFIELD" postcode="S36 9AR" /><description>  SOME TEXT HERE </description></listing>

I want to access the <description> tag and the key attribute of <address>.

Using the guide set out at https://docs.python.org/2/library/xml.etree.elementtree.html I write:

import xml.etree.ElementTree
data = xml.etree.ElementTree.parse('data.xml')
root = data.getroot()

and iterate over the child elements:

for child in root:
    print child.tag, child.attrib
>
listing {'dateLastListed': '2013-10-06', 'dateFirstListed': '2012-10-13', 'propertyType': 'DETACHED', 'latestAskingPrice': '314950', 'bedrooms': '4', 'id': '26496000', 'market': 'SALE'}

This only gives me the <listing> elements and their attributes. How can I change the above expression to access the key attribute of <address> and the text of <description>?

Edit: Following guidance from this question Parsing XML with Python - accessing elements

I tried writing:

for i in root.findall("listing"):
    description = i.find('description')
    print description.text

    >
    AttributeError: 'NoneType' object has no attribute 'text'

1 Answer

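You can iterate over the listings one by one and get the description and address child elements of each. To read an element's attributes, use its .attrib dictionary. Because not every listing has a description, findtext is used for it: it returns None when the child is missing, instead of raising an error: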

import xml.etree.ElementTree as ET


data = ET.parse('data.xml')
root = data.getroot()
for listing in root.findall("listing"):
    address = listing.find('address')
    description = listing.findtext('description')

    print(description, address.attrib.get("key"))
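
For illustration, here is a minimal self-contained sketch of the same idea. The inline sample XML, the extract helper, and the choice of an empty-string default are assumptions made for the example, not part of the original data; the second listing deliberately has no description so the missing-child case is visible:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the second listing has no <description> child.
SAMPLE = """<listings>
  <listing id="1"><address key="a1" /><description> SOME TEXT HERE </description></listing>
  <listing id="2"><address key="a2" /></listing>
</listings>"""


def extract(xml_text):
    """Return (address key, description) pairs, one per <listing>."""
    root = ET.fromstring(xml_text)
    rows = []
    for listing in root.findall("listing"):
        # find() returns None when the child is absent, so test with "is not None"
        # (an element without children is falsy, so a plain truth test would mislead).
        address = listing.find("address")
        key = address.get("key") if address is not None else None
        # findtext() returns the default instead of raising when the child is absent.
        description = listing.findtext("description", default="").strip()
        rows.append((key, description))
    return rows


rows = extract(SAMPLE)
for key, description in rows:
    print(key, description)
```

Passing default="" keeps every description a string, so .strip() is safe on listings without one; leaving the default out would give None for those listings instead.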

3 Comments

Hi @alecxe, thanks for your input. I actually tried this before (I have just updated my question with this), and got the error AttributeError: 'NoneType' object has no attribute 'text'. When I delete the description line and the print part, I do get a list of address keys, but description is always returned as None, i.e. empty.
@ChuckM okay, probably not every listing has a description. Updated the answer, check it out.
Oh shoot, I should have said, yes that is the case. THANK YOU. This worked perfectly :)
