I have the following code in my root_compare function to remove all elements named sysId from an XML document:

    # removing sysId from comparison
    for rm1 in xml_root1.findall('.//sysId'):
        xml_root1.remove(rm1)

The code gives me this error:

 File "/tmp/dev_uac_api2/uac_api_lib.py", line 105, in root_compare
    xml_root1.remove(rm1)
 File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 337, in remove
    self._children.remove(element)
  ValueError: list.remove(x): x not in list

I need to go through all elements in the XML, including children and grandchildren, and remove the ones named sysId. How can I solve this problem?

The XML structure is something like:

<root>
    <sysId></sysId>
    <b></b>
    <c>
        <sysId></sysId>
    </c>
    <d>
        <e>
            <sysId></sysId>
        </e>
    </d>
</root>

1 Answer

Removing elements is a little more work in ElementTree than it is in lxml because lxml has the getparent() function.

In ElementTree, you first need to match the parent of the element to remove.
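To illustrate why the parent matters, here is a minimal sketch (the tiny XML string is made up for the demonstration). Element.remove() only looks at the direct children of the element it is called on, so calling it on the root with a deeper descendant raises the ValueError from the question:

```python
import xml.etree.ElementTree as ET

# Illustrative document: sysId is a grandchild of root, not a child.
root = ET.fromstring("<root><c><sysId/></c></root>")
nested = root.find(".//sysId")

error = None
try:
    # root is not the direct parent of `nested`, so this fails.
    root.remove(nested)
except ValueError as exc:
    error = exc

# Calling remove() on the actual parent works.
c = root.find("c")
c.remove(nested)
```

After this runs, error holds the ValueError and the tree no longer contains a sysId element.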

ElementTree's XPath support is also limited. .//*[sysId] matches elements that have a sysId child, but it only searches below the root, so the root element itself is never matched and its direct sysId child is left in place. You'll have to remove those separately.
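As a possible alternative to handling the root separately, every element (root included) can be visited with iter() and treated as a potential parent. This is only a sketch: remove_all is a hypothetical helper name, not part of ElementTree, and the XML string is a compact version of the structure from the question.

```python
import xml.etree.ElementTree as ET

def remove_all(root, tag):
    """Remove every element named `tag` under `root`, at any depth (hypothetical helper)."""
    # Take a snapshot with list(): modifying the tree while iter() is
    # still running gives undefined results.
    for parent in list(root.iter()):
        # findall() with a bare tag name returns only direct children.
        for child in parent.findall(tag):
            parent.remove(child)
    return root

root = ET.fromstring(
    "<root><sysId/><b/><c><sysId/></c><d><e><sysId/></e></d></root>"
)
remove_all(root, "sysId")
remaining = root.findall(".//sysId")
```

Because iter() yields the root itself first, the direct sysId children of the root are covered by the same loop.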

Example...

import xml.etree.ElementTree as ET

xml = """<root>
    <sysId></sysId>
    <b></b>
    <c>
        <sysId></sysId>
    </c>
    <d>
        <e>
            <sysId></sysId>
        </e>
    </d>
</root>"""

root = ET.fromstring(xml)

# find/remove direct "sysId" children of root
for child in root.findall("sysId"):
    root.remove(child)

# find elements that contain a "sysId" child element
for parent in root.findall(".//*[sysId]"):
    # find/remove direct "sysId" children of parent
    for child in parent.findall("sysId"):
        parent.remove(child)

# print() with decode() works on both Python 2 and Python 3
print(ET.tostring(root).decode())

Printed output...

<root>
    <b />
    <c>
        </c>
    <d>
        <e>
            </e>
    </d>
</root>

Here's an example of lxml to show the difference (same printed output as above)...

from lxml import etree

xml = """<root>
    <sysId></sysId>
    <b></b>
    <c>
        <sysId></sysId>
    </c>
    <d>
        <e>
            <sysId></sysId>
        </e>
    </d>
</root>"""

root = etree.fromstring(xml)

for elem in root.xpath(".//sysId"):
    elem.getparent().remove(elem)

print etree.tostring(root)
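
If lxml is not available, getparent() can be approximated in plain ElementTree with a child-to-parent dictionary. The following is a sketch under that assumption; parent_of is just an illustrative variable name, and the XML string is a compact version of the structure from the question.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root><sysId/><b/><c><sysId/></c><d><e><sysId/></e></d></root>"
)

# Map every child element to its parent (root has no entry).
parent_of = {child: parent for parent in root.iter() for child in parent}

# findall() returns a list, so removing while looping over it is safe.
for elem in root.findall(".//sysId"):
    parent_of[elem].remove(elem)

tags_left = [e.tag for e in root.iter()]
```

The map is built once before any removal, so it stays valid for the whole loop as long as no sysId element is nested inside another sysId.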

1 Comment

Great! helped a lot. Thank you @Daniel Haley
