I cannot figure out why I receive the error below:

AttributeError: 'NoneType' object has no attribute 'text'

I am trying to import an XML file using Python 2.7. This is what my XML file looks like:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp SYSTEM "file.dtd">

<top>
    <blue key="2343998978">
        <animal>lion</animal>
        <animal>seal</animal>
        <state>california</state>
        <zoo>san diego</zoo>
        <year>2015</year>
    </blue>

    <red key="9383893838739">
        <elem_a>jennifer</elem_a>
        <elem_a>paul</elem_a>
        <elem_a>carl</elem_a>
        <elem_b>kansas</elem_b>
        <elem_d>australia</elem_d>
    </red>

    <yellow key="83963277272">
        <car>chevy</car>
        <car>dodge</car>
        <cap>baseball</cap>
        <cat>tabby</cat>
    </yellow>

    <red key="9383893838739">
        <elem_a>greg</elem_a>
        <elem_a>chris</elem_a>
        <elem_a>john</elem_a>
        <elem_b>arkansas</elem_b>
        <elem_c>ice cream</elem_c>
    </red>

    <yellow key="84748346734">
        <car>toyota</car>
        <car>honda</car>
        <cap>football</cap>
    </yellow>
</top>

I am new to Python. I wrote the script below to import the XML file above, and running it produces the error above. This is my code:

import xml.etree.ElementTree as ET

myfile = 'C:/Users/user1/Desktop/file.xml'

tree = ET.parse(myfile)
root = tree.getroot()

for x in root.findall('blue'):
    animal = x.find('animal').text
    key1 = x.attrib['key']
    state = x.find('state').text
    zoo = x.find('zoo').text
    year = x.find('year').text
    print animal, key1, state, zoo, year

for y in root.findall('red'):
    elem_a = y.find('elem_a').text
    key2 = y.attrib['key']
    elem_b = y.find('elem_b').text
    elem_c = y.find('elem_c').text
    elem_d = y.find('elem_d').text
    print elem_a, key2, elem_b, elem_c, elem_d

for z in root.findall('yellow'):
    car = z.find('car').text
    key3 = z.attrib['key']
    cap = z.find('cap').text
    cat = z.find('cat').text
    print car, key3, cap, cat

In the XML file there are three main element types: blue, red and yellow. One problem is that specific child elements exist for some parent elements but not for others. For example, in the sample XML above, the first "yellow" element has all three child elements ("car", "cap" and "cat"), while the second "yellow" element has no "cat" child. In the full XML file, a "yellow" element could have any one, two or all three of these children. I know this is causing the error, but I do not know how to resolve it. Does anyone have any ideas or tips on how to resolve this error? Thank you.

  • In one of the lines where you access the found element's text, no element was found at all. The traceback shows the exact line that raised the error; the element you searched for on that line does not exist. Commented Oct 31, 2016 at 19:13
  • This line probably causes the error: elem_c = y.find('elem_c').text - there is no elem_c in the first red tag. Commented Oct 31, 2016 at 19:31
  • Also the lines elem_d = y.find('elem_d').text and cat = z.find('cat').text will give you the same error. Commented Oct 31, 2016 at 21:30
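
The cause described in these comments can be reproduced with a short sketch: find() returns None when the child is missing, so reading .text from the result fails. One possible fix, not the only one, is findtext() with a default value. The SAMPLE string and the rows list are assumptions made for illustration (a trimmed copy of the question's data), and the __future__ import is there only so the same code runs on Python 2.7 and Python 3.

```python
from __future__ import print_function
import xml.etree.ElementTree as ET

# Trimmed copy of the question's data (an assumption for illustration);
# the second <yellow> element has no <cat> child.
SAMPLE = """<top>
    <yellow key="83963277272">
        <car>chevy</car>
        <cap>baseball</cap>
        <cat>tabby</cat>
    </yellow>
    <yellow key="84748346734">
        <car>toyota</car>
        <cap>football</cap>
    </yellow>
</top>"""

root = ET.fromstring(SAMPLE)

rows = []
for z in root.findall('yellow'):
    # z.find('cat') is None for the second element, so z.find('cat').text
    # would raise AttributeError. findtext() returns the default instead.
    car = z.findtext('car', default='')
    cap = z.findtext('cap', default='')
    cat = z.findtext('cat', default='')
    rows.append((car, z.attrib['key'], cap, cat))

for row in rows:
    print(*row)
```

An explicit check such as node = z.find('cat') followed by "if node is not None" works as well and makes the missing case easier to handle separately.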

1 Answer

You can walk the tree directly: for x in root: iterates over the children of the root element (blue, red and yellow), and for each of those elements you can loop again over its own children.

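  • x.tag is the tag name of an element.
  • x.attrib is a dictionary of the element's attributes.
  • x.getchildren() returns a list of the element's direct children. It is deprecated since Python 2.7 and was removed in Python 3.9; list(x), or iterating over x directly, gives the same result.
  • x.text is the text content of an element.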
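
As a quick illustration of the attributes listed above, here is a minimal sketch on a single element. The inline XML string and the variable names are assumptions for illustration only; list(elem) stands in for getchildren() so the snippet also runs on newer Python versions.

```python
import xml.etree.ElementTree as ET

# A single element taken from the question's sample, shortened for illustration.
elem = ET.fromstring(
    '<blue key="2343998978"><animal>lion</animal><state>california</state></blue>'
)

tag = elem.tag            # the tag name
attrs = elem.attrib       # dictionary of attributes
children = list(elem)     # direct children, same result as getchildren()
texts = [(child.tag, child.text) for child in children]
```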

An example:

import xml.etree.ElementTree as ET

my_file = 'C:/Users/user1/Desktop/file.xml'

tree = ET.parse(my_file)
root = tree.getroot()

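def print_subtree(subtree):
    # Print each direct child as "tag : text", indented by one tab.
    for y in subtree:
        print "\t", y.tag, ":", y.text

for x in root:
    # Print the tag and attributes of each top-level element.
    print x.tag, x.attrib
    # Iterating over an element yields its children, so getchildren() is not needed.
    print_subtree(x)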

This works for a two-level tree; for an n-level tree, recursion would be necessary.
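
A recursive version for deeper trees might look like the sketch below. The function name walk and the NESTED sample are hypothetical, introduced only for illustration; the function returns the lines rather than printing them so that the same code runs on Python 2.7 and Python 3.

```python
from __future__ import print_function
import xml.etree.ElementTree as ET

def walk(element, depth=0):
    """Return one indented "tag : text" line per element, depth-first."""
    # element.text is None when there is no text before the first child.
    text = (element.text or '').strip()
    lines = ['\t' * depth + element.tag + ' : ' + text]
    for child in element:
        lines.extend(walk(child, depth + 1))
    return lines

# Hypothetical three-level sample; the question's file has only two levels.
NESTED = '<top><red key="1"><group><elem_a>greg</elem_a></group></red></top>'

lines = walk(ET.fromstring(NESTED))
print('\n'.join(lines))
```

Because walk() never calls find(), a missing child simply produces no line instead of raising an error.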


2 Comments

You should step through the program with a debugger and examine the various elements; it is a good way to learn. PyCharm has a good debugger, and there is a free Community edition of it.
If you are satisfied with the answer, you can mark your question as resolved by accepting my answer.
