I am having trouble parsing XML in this form:

<Cars>
    <Car>
        <Color>Blue</Color>
        <Make>Ford</Make>
        <Model>Mustang</Model>
    </Car>
    <Car>
        <Color>Red</Color>
        <Make>Chevy</Make>
        <Model>Camaro</Model>
    </Car>
</Cars>

I have figured out how to parse first-level children like this:

<Car>
    <Color>Blue</Color>
    <Make>Chevy</Make>
    <Model>Camaro</Model>
</Car>

With this kind of code:

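import os
from lxml import etree

a = os.path.join(localPath, file)
element = etree.parse(a)
# Select every child node of Car that contains text
cars = element.xpath('//Root/Foo/Bar/Car/node()[text()]')
# Map each field's tag to its text, e.g. {'Color': 'Blue', ...}
parsedCars = [{field.tag: field.text for field in cars} for action in cars]
print(parsedCars[0]['Make'])  # Chevy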

How can I parse out multiple "Car" tags that are children of "Cars"?

1 Answer

Try this:

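import os
from lxml import etree

a = os.path.join(localPath, file)
element = etree.parse(a)
# Select every Car element, then query each one relative to itself
cars = element.xpath('//Root/Foo/Bar/Car')
for car in cars:
    # Each call returns a list of matching child elements
    colors = car.xpath('./Color')
    makes = car.xpath('./Make')
    models = car.xpath('./Model')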
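
To turn each Car into a dictionary, as the question's own code does for a single car, something like the following might work. This is only a sketch under a few assumptions: the XML is the sample from the question held in a string (with the model spelled "Mustang"), `<Cars>` is the root element rather than sitting under the question's longer path, and `parse_cars` is a made-up helper name. It uses `findall` instead of `xpath` so that it also runs on the standard library's ElementTree when lxml is not installed.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    # The standard library offers the same fromstring/findall subset
    import xml.etree.ElementTree as etree

# Sample data from the question; assumed here to be the whole document
XML = """
<Cars>
    <Car>
        <Color>Blue</Color>
        <Make>Ford</Make>
        <Model>Mustang</Model>
    </Car>
    <Car>
        <Color>Red</Color>
        <Make>Chevy</Make>
        <Model>Camaro</Model>
    </Car>
</Cars>
"""


def parse_cars(xml_text):
    """Return one {tag: text} dict per <Car> child of the root element."""
    root = etree.fromstring(xml_text.strip())
    # Iterating over an element yields its direct child elements
    return [{field.tag: field.text for field in car}
            for car in root.findall("Car")]


parsed_cars = parse_cars(XML)
print(parsed_cars[1]["Make"])  # Chevy
```

Each dictionary is built per car, so `parsed_cars[0]` and `parsed_cars[1]` hold different values instead of the same fields repeated.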
4 Comments

When I run this code to find Color, I get the element's address and not its text. For example, I get [<Element Color at 0x2a9f0f8>].
The calls return element objects. To get the text, use the XPath './Color/text()'.
Yeah, I actually figured it out, but used './Color/node()' instead. What is the difference between the two? They both give me the text.
node() selects all nodes; text() selects only text nodes. In this instance there are only text nodes, so the two behave the same.
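
A small illustration of where the two differ, assuming lxml is installed. The XML here is invented for the example: it is not from the question, and the comment inside `<Color>` is there only so that the element holds something other than text.

```python
from lxml import etree

# Hypothetical input: Color contains a text node followed by a comment node
car = etree.fromstring(
    "<Car><Color>Blue<!-- repainted --></Color><Make>Ford</Make></Car>"
)

text_only = car.xpath("./Color/text()")  # text nodes only
all_nodes = car.xpath("./Color/node()")  # text, comment, and element nodes

print(text_only)       # ['Blue']
print(len(all_nodes))  # 2: the text 'Blue' plus the comment
```

With the question's XML, where every field holds a single text node, both expressions return the same one-item list.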