Suppose I have an XML string:

<A>
    <B foo="123">
        <C>thing</C>
        <D>stuff</D>
    </B>
</A>

and I want to insert a namespace of the type used by XML Schema, putting a prefix in front of all the element names.

<A xmlns:ns1="www.example.com">
    <ns1:B foo="123">
        <ns1:C>thing</ns1:C>
        <ns1:D>stuff</ns1:D>
    </ns1:B>
</A>

Is there a way to do this (aside from brute-force find-replace or regex) using lxml.etree or a similar library?

  • There is no prefix on the A element in the wanted output. Typo? Commented Aug 6, 2015 at 20:19
  • @mzjn: Is there supposed to be a namespace prefix on the root element too? The code I'm working on has none (and does not complain) but I could definitely believe it. Commented Aug 6, 2015 at 20:32
  • You said "putting a prefix in front of all the element names", so I had to ask. If you don't want a prefix on A, that's fine. Commented Aug 6, 2015 at 20:36

3 Answers

I don't think this can be done with just ElementTree.

Manipulating namespaces is sometimes surprisingly tricky. There are many questions about it here on SO. Even with the more advanced lxml library, it can be really hard.

Below is a solution that uses XSLT.

Code:

from lxml import etree

XML = '''
<A>
    <B foo="123">
        <C>thing</C>
        <D>stuff</D>
    </B>
</A>'''

XSLT = '''
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:ns1="www.example.com">
 <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <xsl:template match="*">
   <xsl:element name="ns1:{name()}">
    <xsl:apply-templates select="node()|@*"/>
   </xsl:element>
  </xsl:template>

  <!-- No prefix on the A element -->
  <xsl:template match="A">
   <A xmlns:ns1="www.example.com">
    <xsl:apply-templates select="node()|@*"/>
   </A>
  </xsl:template>
</xsl:stylesheet>'''

xml_doc = etree.fromstring(XML)
xslt_doc = etree.fromstring(XSLT)
transform = etree.XSLT(xslt_doc)
print(transform(xml_doc))

Output:

<A xmlns:ns1="www.example.com">
    <ns1:B foo="123">
        <ns1:C>thing</ns1:C>
        <ns1:D>stuff</ns1:D>
    </ns1:B>
</A>

Use ET.register_namespace('ns1', 'www.example.com') to register the namespace with ElementTree. This is needed so that write() uses the registered prefix. (I have code that uses a prefix of '' (an empty string) for the default namespace.)

Then prefix each element name with {www.example.com}. For example: root.find('{www.example.com}B').
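
A minimal sketch of that approach with the standard-library ElementTree, walking the tree and rewriting each tag by hand. Two things here are assumptions rather than part of the answer above: the prefix is `ex` instead of `ns1`, because as far as I can tell `register_namespace` rejects prefixes of the form `ns` followed by digits (it reserves them for auto-generated prefixes), and the root is left unqualified to match the wanted output.

```python
import xml.etree.ElementTree as ET

NS_URI = "www.example.com"

# "ex" is an illustrative prefix; "ns1" appears to be reserved by ElementTree.
ET.register_namespace("ex", NS_URI)

root = ET.fromstring('<A><B foo="123"><C>thing</C><D>stuff</D></B></A>')

# Qualify every descendant; the root itself is skipped so it stays unprefixed.
for elem in root.iter():
    if elem is not root:
        elem.tag = "{%s}%s" % (NS_URI, elem.tag)

result = ET.tostring(root, encoding="unicode")
print(result)
```

The serialized result should declare xmlns:ex on A and write the children as ex:B, ex:C and ex:D. Attributes such as foo are untouched because only elem.tag is rewritten.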

2 Comments

  • I think the OP wants to convert a bare <B> element into an <ns1:B> element. I don't think this is going to be possible without iteratively creating new elements and copying over content and attributes from the original tree.
  • The first part sounds promising, but is there a way to prefix each element name with {www.example.com} without walking the tree and adding it manually?

import xml.etree.ElementTree as ET

name_space = {
    # namespace declaration for the ns1 prefix, passed as a plain attribute
    "xmlns:ns1": "www.example.com"
}

A = ET.Element('A', name_space)
B = ET.SubElement(A, 'ns1:B')
C = ET.SubElement(B, 'ns1:C')
C.text = 'thing'

You can pass the namespace declarations to the Element constructor; from that point on you can reference the prefix in any child element. Note that you can define more than one namespace this way. This solution worked for me.
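
One way to check what this produces is to serialize the tree and parse it back. The sketch below assumes the same structure as the question's example (the D element and the foo attribute are added to match it). Note that in memory the tags are just literal strings such as "ns1:B"; ElementTree only resolves the prefix to the namespace URI when the serialized text is parsed again.

```python
import xml.etree.ElementTree as ET

# Declare the prefix as an ordinary attribute and use literal "ns1:" tag names.
root = ET.Element("A", {"xmlns:ns1": "www.example.com"})
b = ET.SubElement(root, "ns1:B", {"foo": "123"})
c = ET.SubElement(b, "ns1:C")
c.text = "thing"
d = ET.SubElement(b, "ns1:D")
d.text = "stuff"

serialized = ET.tostring(root, encoding="unicode")
print(serialized)

# After reparsing, the parser has resolved the prefix, so the element can be
# found by its fully qualified name.
reparsed = ET.fromstring(serialized)
qualified_b = reparsed.find("{www.example.com}B")
print(qualified_b.tag)
```

Because the in-memory tags are unresolved strings, a lookup such as root.find('{www.example.com}B') on the original tree would not match; it only works on the reparsed tree.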
