I need to traverse an XML tree and add a sub-element to each element whose value is less than 5. For example, this XML

<?xml version="1.0" encoding="UTF-8"?>
<A value="45">
    <B value="30">
        <C value="10"/>
        <C value ="20"/>
    </B>
    <B value="15">
        <C value = "5" />
        <C value = "10" />
    </B>
</A>

can be modified into this XML:

<?xml version="1.0" encoding="UTF-8"?>
<A value="45">
    <B value="30">
        <C value="10"/>
        <C value ="20"/>
    </B>
    <B value="15">
        <C value = "5"><D name="error"/></C>
        <C value = "10" />
    </B>
</A>

How can I do that with Python's ElementTree?

  • related: stackoverflow.com/questions/4788958/… Commented Jan 25, 2011 at 3:12
  • Can there be more than one <D> child? Have you considered the option of adding an "error" attribute to the element with a problem? Commented Jan 25, 2011 at 4:06

2 Answers


You probably made a typo, because in the example the error element is appended as the child of an element whose value is 5, which is not less than 5. But I think this is the idea:

#!/usr/bin/env python3

from xml.etree.ElementTree import fromstring, ElementTree, Element

def validate_node(elem):
    # Iterate over the direct children (getchildren() was removed in Python 3.9)
    for child in elem:
        validate_node(child)
        value = child.attrib.get('value', '')
        # Flag missing or non-numeric values as well as values below 5
        if not value.isdigit() or int(value) < 5:
            child.append(Element('D', {'name': 'error'}))

if __name__ == '__main__':
    import sys
    xml = sys.stdin.read()  # read XML from standard input
    root = fromstring(xml)  # parse into XML element tree
    validate_node(root)
    # write resulting XML to standard output as text
    ElementTree(root).write(sys.stdout, encoding='unicode')

Given this input:

<?xml version="1.0" encoding="UTF-8"?>
<A value="45">
    <B value="30">
        <C value="1"/>
        <C value="20"/>
    </B>
    <B value="15">
        <C value="5" />
        <C value="10" />
        <C value="foo" />
    </B>
</A>

This is the output:

<A value="45">
    <B value="30">
        <C value="1"><D name="error" /></C>
        <C value="20" />
    </B>
    <B value="15">
        <C value="5" />
        <C value="10" />
        <C value="foo"><D name="error" /></C>
    </B>
</A>

2 Comments

What I'm concerned about is whether a full-depth for loop would also iterate over the newly added child elements here, e.g. if the loop is written as for node in list(tree.getroot()) and a node is added somewhere while iterating.
So the way to run this is cat file.xml | python script.py? I did and it works but I wonder if there is another way.
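One other way, as a rough sketch: parse the file by path and write the result to a second file instead of piping through standard input and output. This assumes Python 3. The helper name validate_file and the file names in the usage note are made up for illustration; validate_node repeats the logic of the answer above.

```python
import xml.etree.ElementTree as ET

def validate_node(elem):
    # Same check as in the answer: flag children whose value is
    # missing, non-numeric, or below 5.
    for child in elem:
        validate_node(child)
        value = child.get('value', '')
        if not value.isdigit() or int(value) < 5:
            child.append(ET.Element('D', {'name': 'error'}))

def validate_file(src_path, dst_path):
    # Hypothetical helper: read src_path, mark errors, write dst_path.
    tree = ET.parse(src_path)
    validate_node(tree.getroot())
    tree.write(dst_path, encoding='utf-8', xml_declaration=True)
```

With that in place, calling validate_file('file.xml', 'out.xml') would read file.xml and write the marked-up tree to out.xml; both names are placeholders.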

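ElementTree's iter (or getiterator for Python < 2.7) will recursively return all the nodes in a tree; then just test for your condition and create the SubElement. Wrapping the iterator in list() takes a snapshot first, so the elements added inside the loop are not visited:

from xml.etree import ElementTree as ET

tree = ET.parse('input.xml')  # placeholder path to the XML file
for e in list(tree.iter()):   # snapshot of the existing elements
    if int(e.get('value')) < 5:
        ET.SubElement(e, 'D', dict(name='error'))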

1 Comment

will the added element be yielded by the iterator? If so, how could I distinguish between new element and already existing ones?
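On that question: the ElementTree documentation says the result of iter() is undefined if the tree is modified during iteration, so it seems safer not to rely on either behaviour. A minimal sketch, assuming Python 3, that collects the existing elements into a list before touching the tree, so anything added afterwards is by construction not in the list. The function name add_error_markers and its threshold parameter are illustrative only, and skipping elements without a numeric value is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

def add_error_markers(root, threshold=5):
    # Snapshot the elements that exist right now; the <D> elements
    # appended below are not part of this list and are never visited.
    existing = list(root.iter())
    for elem in existing:
        value = elem.get('value')
        # Assumption: elements without a numeric 'value' are skipped.
        if value is None or not value.isdigit():
            continue
        if int(value) < threshold:
            ET.SubElement(elem, 'D', {'name': 'error'})
    return root

root = ET.fromstring(
    '<A value="45"><B value="3"/><B value="15"><C value="1"/><C value="5"/></B></A>'
)
add_error_markers(root)
print(ET.tostring(root, encoding='unicode'))
```

Because the new elements are absent from the snapshot, there is no need to tell them apart from the original ones inside the loop.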
