0

I have an input XML coming with some wrong namespaces. I tried to fix them with ElementTree but without success

Example input: (here ns0: can be ns:, p:, n: etc etc... )

<ns0:Invoice xmlns:ns0="http://invoices.com/docs/xsd/invoices/v1.2" version="FPR12">

  <InvoiceHeader>
    <DataH>data header</DataH>
  </InvoiceHeader>

  <InvoiceBody>
    <DataB>data body</DataB>
  </InvoiceBody>

</ns0:Invoice>

Output file needed: (namespace in the root must be without prefix and some inner tags declared as xmlns="")

<Invoice xmlns:"http://invoices.com/docs/xsd/invoices/v1.2" version="FPR12">

  <InvoiceHeader xmlns="">
    <DataH>data header</DataH>
  </InvoiceHeader>

  <InvoiceBody xmlns="">
    <DataB>data body</DataB>
  </InvoiceBody>

</Invoice>

I tried to change root namespace as below, but the resulting file is unchanged

import xml.etree.ElementTree as ET

tree = ET.parse('./cache/test.xml')
root = tree.getroot()

root.tag = '{http://invoices.com/docs/xsd/invoices/v1.2}Invoice'
xml = ET.tostring(root, encoding="unicode")
with open('./cache/output.xml', 'wt') as f:
    f.write(xml)

Instead when trying with

changing root.tag  = 'Invoice'

it produces a tag without namespace at all

Please let me know whether I'm making any mistake or I should switch to another library or try with a string replace with regex

Thanks in advance

1 Answer 1

1

Don't now if it can be useful to anyone but I managed to fix namespaces using lxml and the following code.

from lxml import etree
from copy import deepcopy

tree = etree.parse('./cache/test.xml')

# create a new root without prefix in the namespace
NSMAP = {None : "http://invoices.com/docs/xsd/invoices/v1.2"}
root = etree.Element("{http://invoices.com/docs/xsd/invoices/v1.2}Invoice", nsmap = NSMAP)

# copy attributes from original root
for attr, value in tree.getroot().items():
    root.set(attr,value)

# deep copy of children (adding empty namespace in some tags)
for child in tree.getroot().getchildren():
    if child.tag in( 'InvoiceHeader', 'InvoiceBody'):
        child.set("xmlns","")
    root.append( deepcopy(child) )

xml = etree.tostring(root, pretty_print=True)
with open('./cache/output.xml', 'wb') as f:
    f.write(xml)
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.