I am new to Python and I am trying to sort an XML document according to some rules.
My example:

<?xml version="1.0"?>
<data>
    <e2 id="3" name="name3">
        <e12 num="num12" desc="desc12"/>
        <e12 num="num12" desc="desc11"/>
        <e11 num="num1" desc="desc1"/>
    </e2>
    <e2 id="2" name="name2">
        <e11 num="num1" desc="desc1"/>
    </e2>
    <e1 id="1" name="name1">
        <e12 num="num12" desc="desc12"/>
        <e11 num="num4" desc="desc4"/>
    </e1>
</data>

My rules are:
1) sort the attributes of every element by attribute name
2) sort the elements
* by tag name (if they have no attributes)
* by their attribute values, in attribute-name order, if the tag names are the same

In my case e1 has to come before e2.
Since I have two e2 elements, they need to be ordered by their attribute values: one has id=2 and the other has id=3, so the order is decided by the id value.
The desired output XML would look like this:

<?xml version="1.0"?>
<data>
    <e1 id="1" name="name1">
        <e11 desc="desc4" num="num4"/>
        <e12 desc="desc12" num="num12"/>
    </e1>
    <e2 id="2" name="name2">
        <e11 desc="desc1" num="num1"/>
    </e2>
    <e2 id="3" name="name3">
        <e11 desc="desc1" num="num1"/>
        <e12 desc="desc11" num="num12"/>
        <e12 desc="desc12" num="num12"/>
    </e2>
</data>

Any advice or ideas on how to do this?
Thank you.

2 Answers

You can sort your XML with ElementTree. In my example I sort the first layer by tag name and then by the value of the attribute 'name', and the child elements by tag name and then by the value of the attribute 'desc'.

import xml.etree.ElementTree as ET

# xmlstr is assumed to hold the XML document from the question as a string
tree = ET.ElementTree(ET.fromstring(xmlstr))
root = tree.getroot()

# sort the first layer by tag name, then by the 'name' attribute
root[:] = sorted(root, key=lambda child: (child.tag, child.get('name')))

# sort the second layer by tag name, then by the 'desc' attribute
for c in root:
    c[:] = sorted(c, key=lambda child: (child.tag, child.get('desc')))

xmlstr = ET.tostring(root, encoding="utf-8", method="xml")
print(xmlstr.decode("utf-8"))

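This prints the following (the attributes come out in name order on Python versions before 3.8, where ElementTree sorts them when serializing):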

<data>
<e1 id="1" name="name1">
    <e11 desc="desc4" num="num4" />
    <e12 desc="desc12" num="num12" />
</e1>
<e2 id="2" name="name2">
    <e11 desc="desc1" num="num1" />
</e2>
<e2 id="3" name="name3">
    <e11 desc="desc1" num="num1" />
    <e12 desc="desc11" num="num12" />
    <e12 desc="desc12" num="num12" />
</e2>
</data>
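
The two-layer sort above hard-codes the attributes 'name' and 'desc'. As a rough sketch of how the rules from the question could be applied at any depth and without naming attributes, something like the following might work. The helper name `sort_element` and the inline sample string are my own assumptions for illustration, not part of the original code. The attribute dict is rebuilt in name order because Python 3.8 and later serialize attributes in insertion order rather than sorting them.

```python
import xml.etree.ElementTree as ET


def sort_element(el):
    """Recursively sort attributes by name and children by (tag, attributes).

    Hypothetical helper, written only to illustrate the idea.
    """
    # Rebuild the attribute dict in name order.
    items = sorted(el.attrib.items())
    el.attrib.clear()
    el.attrib.update(items)

    # Sort the descendants first so each child's attributes are in name order.
    for child in el:
        sort_element(child)

    # Order children by tag name, then by their (name, value) attribute pairs.
    el[:] = sorted(el, key=lambda c: (c.tag, list(c.attrib.items())))


# Sample data from the question, written without whitespace between elements.
xml_text = (
    '<data>'
    '<e2 id="3" name="name3">'
    '<e12 num="num12" desc="desc12"/>'
    '<e12 num="num12" desc="desc11"/>'
    '<e11 num="num1" desc="desc1"/>'
    '</e2>'
    '<e2 id="2" name="name2"><e11 num="num1" desc="desc1"/></e2>'
    '<e1 id="1" name="name1">'
    '<e12 num="num12" desc="desc12"/>'
    '<e11 num="num4" desc="desc4"/>'
    '</e1>'
    '</data>'
)

root = ET.fromstring(xml_text)
sort_element(root)
print(ET.tostring(root, encoding="unicode"))
```

Note that attribute values are compared as strings here, so id="10" would sort before id="2". If numeric order matters, the key would need to convert those values (for example with int) first.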

1 Comment

Thank you. I can build on this example for dynamic sorting, because the attributes can differ between elements of the same type.

A solution using xml.etree.ElementTree:

import xml.etree.ElementTree as ET

tree = ET.parse('input.xml')
data = tree.getroot()
els = data.findall("*[@id]")   # all e<number> elements having `id` attribute
new_els = sorted(els, key=lambda el: (el.tag, el.attrib['id']))
for el in new_els:
    el[:] = sorted(el, key=lambda e: (e.tag, e.attrib['desc']))
data[:] = new_els

tree.write('result.xml', xml_declaration=True, encoding='utf-8')

The final result.xml contents:

<?xml version='1.0' encoding='utf-8'?>
<data>
    <e1 id="1" name="name1">
        <e11 desc="desc4" num="num4" />
    <e12 desc="desc12" num="num12" />
        </e1>
<e2 id="2" name="name2">
        <e11 desc="desc1" num="num1" />
    </e2>
    <e2 id="3" name="name3">
        <e11 desc="desc1" num="num1" />
    <e12 desc="desc11" num="num12" />
        <e12 desc="desc12" num="num12" />
        </e2>
    </data>
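
The uneven indentation in that result comes from the whitespace that ElementTree stores in each element's text and tail: when elements are reordered, their old tails move with them. As a small sketch of one way to tidy this up, `ET.indent` (available from Python 3.9) can recompute the whitespace before writing. The tiny inline document here is an assumption made up for the example, not the input.xml used above.

```python
import xml.etree.ElementTree as ET

# A small made-up document, indented with four spaces like the original input.
root = ET.fromstring('<data>\n    <e2 id="2"/>\n    <e1 id="1"/>\n</data>')

# Reordering moves each element together with its trailing whitespace (tail),
# which is what produces the ragged layout.
root[:] = sorted(root, key=lambda el: (el.tag, el.get("id")))

# ET.indent rewrites text/tail whitespace in place; it exists on Python 3.9+,
# so the call is guarded for older interpreters.
if hasattr(ET, "indent"):
    ET.indent(root, space="    ")

result = ET.tostring(root, encoding="unicode")
print(result)
```

In the file-based code above, the equivalent step would be calling `ET.indent(tree, space="    ")` just before `tree.write(...)`, again assuming Python 3.9 or newer.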

1 Comment

Thank you, this also works fine; both answers are correct.
