I'm trying to get this working. I have an XML file and I need to filter the 'title' element using XPath. Afterwards I need to copy everything under the C element to an external file, but that's not the point right now. I need to get this running using xml.etree.cElementTree or xml.etree.ElementTree. I have already read a bunch of posts here on Stack Overflow and on other sites, and got stuck. First, the XML structure:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<delivery xmlns="http://url" publicationdate="2013-08-28T09:10:32Z">
    <A>
        <B>
            <C>
                <Cid>XXXXXXXXX</Cid>
                <cref>111111-2222222</cref>
                <D>
                    <E/>
                    <F/>
                    <G/>
                    <H>
                        <Href>XXXXXXXXXXXX</Href>
                        <hcont name="XXXXXX" country="EN"/>
                    </H>
                    <I/>
                    <J/>
                    <K>XXXXXXXXX</K>
                    <oldK>XXXXXXX</oldK>
                    <title>
                        <content lang="en">TITLE</content>
                    </title>
                    <L>
                        <isL>false</isL>
                    </L>
                </D>
                <M>
                    <startTime>2013-08-28T03:00:00Z</startTime>
                    <endTime>2013-08-29T00:58:00Z</endTime>
                </M>
            </C>
        </B>
    </A>
</delivery>

I can't even find the Cid element by XPath. The script keeps returning None, [], or nothing at all.

import xml.etree.ElementTree as ET

doc = ET.ElementTree(file='short.xml') 
for x in doc.findall('./A/B/C'):
  print x.get('Cid').text

This one returns nothing. How can I get this working? How do I find even the Cid element?

1 Answer

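You should pass a namespaces argument to findall() and use the prefix in every step of the path:

namespaces = {'ns': 'http://url'}  # 'ns' is an arbitrary prefix
for x in doc.findall('./ns:A/ns:B/ns:C', namespaces=namespaces):
    pass  # do something with x

Note that an unprefixed path such as './A/B/C' will not match here: the document declares a default namespace (just xmlns), so every element belongs to 'http://url' and each step has to be qualified.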
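
The prefix in the mapping is arbitrary; what matters is that each step of the path carries it. A minimal sketch, assuming a prefix named `ns` (hypothetical) and a trimmed copy of the question's XML inlined as a string instead of being read from short.xml:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question, inlined so the snippet
# is self-contained (assumption: the real data lives in short.xml).
XML = """<delivery xmlns="http://url" publicationdate="2013-08-28T09:10:32Z">
    <A><B><C>
        <Cid>XXXXXXXXX</Cid>
        <D><title><content lang="en">TITLE</content></title></D>
    </C></B></A>
</delivery>"""

root = ET.fromstring(XML)

# 'ns' is an arbitrary prefix chosen for this example; only the URI
# has to match the xmlns declared on <delivery>.
ns = {"ns": "http://url"}

# Every step of the path is prefixed, including the child lookup.
cids = [c.find("ns:Cid", ns).text
        for c in root.findall("./ns:A/ns:B/ns:C", ns)]
print(cids)
```

The same mapping can be reused for any further find() or findall() call on the matched elements.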

Alternatively, you can write the namespace URI directly into the path using the {uri}tag notation:

for x in doc.findall('.//{%(uri)s}C' % {'uri': 'http://url'}):
    print(x.find('{http://url}Cid').text)
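
As a fuller sketch of the {uri}tag approach, the snippet below looks up both Cid and the title text under each C element. It assumes a trimmed copy of the question's XML inlined as a string, and the names `URI` and `results` are introduced here only for illustration:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question, inlined so the snippet
# is self-contained (assumption: the real data lives in short.xml).
XML = """<delivery xmlns="http://url" publicationdate="2013-08-28T09:10:32Z">
    <A><B><C>
        <Cid>XXXXXXXXX</Cid>
        <D><title><content lang="en">TITLE</content></title></D>
    </C></B></A>
</delivery>"""

URI = "http://url"  # must match the xmlns value on <delivery>
root = ET.fromstring(XML)

results = []
for c in root.findall(".//{%s}C" % URI):
    # Child lookups need the namespace as well, not only the first step.
    cid = c.find("{%s}Cid" % URI)
    content = c.find(".//{%s}title/{%s}content" % (URI, URI))
    results.append((cid.text, content.text))

print(results)
```

Each matched C element is a regular Element, so it can later be serialized with ET.tostring() when copying its contents to another file.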
