Parsing XML with namespaces using ElementTree in Python

Question

I have an XML document; a small part of it looks like this:

<?xml version="1.0" ?>
<i:insert xmlns:i="urn:com:xml:insert" xmlns="urn:com:xml:data">
  <data>
    <image imageId="1"></image>
    <content>Content</content>
  </data>
</i:insert>

When I parse it using ElementTree and save it to a file, I see the following:

<ns0:insert xmlns:ns0="urn:com:xml:insert" xmlns:ns1="urn:com:xml:data">
  <ns1:data>
    <ns1:image imageId="1"></ns1:image>
    <ns1:content>Content</ns1:content>
  </ns1:data>
</ns0:insert>

Why does it change the prefixes and put them everywhere? With minidom I don't have this problem. Can this be configured? The documentation for ElementTree is very poor. The problem is that I can't find any node after parsing this way. For example, I can't find image with or without a namespace, whether I use {namespace}image or just image. Why is that? Any suggestions are appreciated.

What I have already tried:

import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
for a in root.findall('ns1:image'):
    print(a.attrib)

This raises an error, and the following variant returns nothing:

for a in root.findall('{urn:com:xml:data}image'):
    print(a.attrib)

I also tried defining a namespace mapping and passing it to findall:

namespaces = {'ns1': 'urn:com:xml:data'}
for a in root.findall('ns1:image', namespaces):
    print(a.attrib)

It returns nothing. What am I doing wrong?

Can you add the Python code which you are using to parse the XML?
– Kayce Basques, Commented Jan 10, 2015 at 0:25

mzjn · Accepted Answer · 2015-01-10 10:05:54Z

7

+50

This snippet from your question,

for a in root.findall('{urn:com:xml:data}image'):
    print(a.attrib)

does not output anything because it only looks for direct {urn:com:xml:data}image children of the root of the tree.

This slightly modified code,

for a in root.findall('.//{urn:com:xml:data}image'):
    print(a.attrib)

will print {'imageId': '1'} because it uses .//, which selects matching subelements on all levels.

Reference: https://docs.python.org/2/library/xml.etree.elementtree.html#supported-xpath-syntax.
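A minimal sketch of the difference, with the XML from the question inlined so it runs on its own. It assumes Python 3; the `ns1` key in the prefix map is an arbitrary name chosen for illustration and does not need to match any prefix in the document.

```python
import xml.etree.ElementTree as ET

XML = """<?xml version="1.0" ?>
<i:insert xmlns:i="urn:com:xml:insert" xmlns="urn:com:xml:data">
  <data>
    <image imageId="1"></image>
    <content>Content</content>
  </data>
</i:insert>"""

root = ET.fromstring(XML)

# Only direct children of the root are examined; image is a grandchild.
direct = root.findall('{urn:com:xml:data}image')
print(direct)  # []

# './/' searches the whole subtree below the root.
deep = root.findall('.//{urn:com:xml:data}image')
print([a.attrib for a in deep])  # [{'imageId': '1'}]

# The same search with a prefix map instead of the {uri}tag form.
namespaces = {'ns1': 'urn:com:xml:data'}  # 'ns1' is an arbitrary label
mapped = root.findall('.//ns1:image', namespaces)
print([a.attrib for a in mapped])  # [{'imageId': '1'}]
```

This also suggests why the third attempt in the question returned nothing: the prefix map was fine, but the path lacked `.//`.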

It is a bit annoying that ElementTree does not just retain the original namespace prefixes by default, but keep in mind that the prefixes themselves do not matter; only the namespace URIs they are bound to are significant. The register_namespace() function can be used to set the desired prefix when serializing the XML. It has no effect on parsing or searching.
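A small sketch of that last point, assuming Python 3 (for `encoding="unicode"`). The prefix `d` is a hypothetical name picked for this example. Note that the registration is global to the ElementTree module.

```python
import xml.etree.ElementTree as ET

DATA_NS = "urn:com:xml:data"
root = ET.fromstring(
    '<i:insert xmlns:i="urn:com:xml:insert" xmlns="urn:com:xml:data">'
    '<data><image imageId="1"/></data></i:insert>'
)

# Registering a prefix only influences how the tree is written out.
ET.register_namespace("d", DATA_NS)  # "d" is an illustrative prefix
serialized = ET.tostring(root, encoding="unicode")
print(serialized)  # elements in DATA_NS are written with the d: prefix

# Searching is unaffected: findall() does not consult the registered prefix.
try:
    root.findall(".//d:image")
    prefix_resolved = True
except SyntaxError:
    prefix_resolved = False
print(prefix_resolved)  # False

# An explicit prefix map (or the {uri}tag form) is still required.
images = root.findall(".//d:image", {"d": DATA_NS})
print([img.attrib for img in images])  # [{'imageId': '1'}]
```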

answered Jan 10, 2015 at 10:05

mzjn

Sophie · Answer · 2015-01-08 08:07:18Z

0

From what I gather, it has something to do with the namespace recognition in ET.

From http://effbot.org/zone/element-namespaces.htm:

When you save an Element tree to XML, the standard Element serializer generates unique prefixes for all URI:s that appear in the tree. The prefixes usually have the form “ns” followed by a number. For example, the above elements might be serialized with the prefix ns0 for “http://www.w3.org/1999/xhtml” and ns1 for “http://effbot.org/namespace/letters”.

If you want to use specific prefixes, you can add prefix/uri mappings to a global table in the ElementTree module. In 1.3 and later, you do this by calling the register_namespace function. In earlier versions, you can access the internal table directly:

ElementTree 1.3

ET.register_namespace(prefix, uri)

ElementTree 1.2 (Python 2.5)

ET._namespace_map[uri] = prefix

Note the argument order; the function takes the prefix first, while the raw dictionary maps from URI:s to prefixes.
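As a rough illustration of the quoted passage, the sketch below serializes a cut-down version of the XML from the question before and after registration. It assumes Python 3 (for `encoding="unicode"`) and a Python version where an empty prefix is accepted as a request for a default namespace; the registration applies module-wide.

```python
import xml.etree.ElementTree as ET

XML = (
    '<i:insert xmlns:i="urn:com:xml:insert" xmlns="urn:com:xml:data">'
    '<data><content>Content</content></data></i:insert>'
)
root = ET.fromstring(XML)

# Without any registration the serializer invents ns0, ns1, ...
before = ET.tostring(root, encoding="unicode")
print(before)

# Prefix first, URI second. An empty prefix asks for a default namespace.
ET.register_namespace("i", "urn:com:xml:insert")
ET.register_namespace("", "urn:com:xml:data")

# The same tree now serializes with the registered prefixes.
after = ET.tostring(root, encoding="unicode")
print(after)
```

The parsed tree itself is identical in both cases; only the text produced by the serializer changes.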

answered Jan 8, 2015 at 8:07

Sophie

1 Comment

midori Over a year ago

I already read it and tried this namespace registration, but it didn't help.
