
I'm trying to remove the SOAP envelope tags from this XML document:

<S:Envelope xmlns:S="http://www.w3.org/2003/05/soap-envelope">
  <S:Body>
    <ns2:promediosSipsaCityResponse xmlns:ns2="http://servicios.sipsa.co.gov.dane/">
      <return>
        <city>BARRANQUILLA</city>
        <codProduct>1</codProduct>
        <send>0</send>
        <releaseDate>2020-03-18T00:00:00-05:00</releaseDate>
        <creationDate>2020-03-18T14:00:01-05:00</creationDate>
        <price>632</price>
        <product>Ahuyama</product>
        <regId>316989</regId>
      </return>
      <return>
        <city>BARRANQUILLA</city>
        <codProduct>2</codProduct>
        <send>0</send>
        <releaseDate>2020-03-18T00:00:00-05:00</releaseDate>
        <creationDate>2020-03-18T14:00:01-05:00</creationDate>
        <price>7733</price>
        <product>Arveja verde en vaina</product>
        <regId>316990</regId>
      </return>
    </ns2:promediosSipsaCiudadResponse>
  </S:Body>
</S:Envelope>

So it would look like this:

<return>
 <city>BARRANQUILLA</city>
 <codProduct>1</codProduct>
 <send>0</send>
 <releaseDate>2020-03-18T00:00:00-05:00</releaseDate>
 <creationDate>2020-03-18T14:00:01-05:00</creationDate>
 <price>632</price>
 <product>Ahuyama</product>
 <regId>316989</regId>
</return>
<return>
 <city>BARRANQUILLA</city>
 <codProduct>2</codProduct>
 <send>0</send>
 <releaseDate>2020-03-18T00:00:00-05:00</releaseDate>
 <creationDate>2020-03-18T14:00:01-05:00</creationDate>
 <price>7733</price>
 <product>Arveja verde en vaina</product>
 <regId>316990</regId>
</return>
    

I tried to use the ElementTree library to navigate through the elements and keep only the return children, but it's not working:

doc = etree.parse('result.xml')
for ele in doc.findall('//return'):
    parent = ele.getparent()
    print(parent)
    parent.remove()
doc.write('result2.xml', pretty_print=True)

Any feedback is welcome, thanks!

  • XSLT is made for such tasks. Are you open to it? Commented Sep 7, 2022 at 3:45
  • Well-formed XML is supposed to have a root tag. Commented Sep 7, 2022 at 3:47

1 Answer


Instead of modifying your original file, I would just create a new one and copy the relevant portions into it.

Notes: as mentioned in the comments, well-formed XML needs a single root element. Also, the XML in your question is not well formed: the opening ns2:promediosSipsaCityResponse tag doesn't match its closing tag. Assuming both are fixed, you can do what you want with either ElementTree or lxml:

old = """<S:Envelope xmlns:S="http://www.w3.org/2003/05/soap-envelope">
  <S:Body>
    <ns2:promediosSipsaCityResponse 
    [.... rest of your xml above...]
    </ns2:promediosSipsaCityResponse>
  </S:Body>
</S:Envelope>
"""
new = """<someroot></someroot>"""
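
Both notes above can be checked quickly with the standard library parser. This is only a minimal sketch: the helper name is_well_formed and the three short sample strings are made up for illustration and are not part of the original response.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as a single well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Two sibling <return> elements with no enclosing root (hypothetical sample)
no_root = "<return><city>A</city></return><return><city>B</city></return>"
# The same elements wrapped in a placeholder root
with_root = "<someroot>" + no_root + "</someroot>"
# Opening and closing tag names differ, as in the XML from the question
mismatched = "<cityResponse><return/></ciudadResponse>"

print(is_well_formed(no_root))     # False
print(is_well_formed(with_root))   # True
print(is_well_formed(mismatched))  # False
```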

With ElementTree:

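import xml.etree.ElementTree as ET

old_doc = ET.fromstring(old)
new_doc = ET.fromstring(new)

# Attach every <return> element to the new root.
# append() keeps document order; insert(1, ...) would scramble it
# once there are more than two elements.
for ret in old_doc.findall('.//return'):
    new_doc.append(ret)
print(ET.tostring(new_doc).decode())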

Similarly, with lxml:

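from lxml import etree

old_doc = etree.XML(old)
new_doc = etree.XML(new)

# lxml moves each <return> element out of old_doc and under the new root.
# append() keeps document order; insert(1, ...) would scramble it
# once there are more than two elements.
for ret in old_doc.xpath('//return'):
    new_doc.append(ret)
print(etree.tostring(new_doc).decode())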

The output should be your expected output, wrapped in the someroot element.
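
For reference, here is a self-contained sketch of the same idea using only the standard library. It assumes the mismatched closing tag has been corrected, and it uses a shortened copy of the XML from the question (a few fields per record) to keep the example small. The names soap and extract_returns are hypothetical, and someroot is just a placeholder root tag.

```python
import xml.etree.ElementTree as ET

# Shortened, corrected copy of the SOAP response from the question
soap = """<S:Envelope xmlns:S="http://www.w3.org/2003/05/soap-envelope">
  <S:Body>
    <ns2:promediosSipsaCityResponse xmlns:ns2="http://servicios.sipsa.co.gov.dane/">
      <return>
        <city>BARRANQUILLA</city>
        <codProduct>1</codProduct>
        <price>632</price>
        <product>Ahuyama</product>
      </return>
      <return>
        <city>BARRANQUILLA</city>
        <codProduct>2</codProduct>
        <price>7733</price>
        <product>Arveja verde en vaina</product>
      </return>
    </ns2:promediosSipsaCityResponse>
  </S:Body>
</S:Envelope>"""

def extract_returns(soap_text, root_tag="someroot"):
    """Collect every <return> element under a fresh root element."""
    new_root = ET.Element(root_tag)
    # <return> carries no prefix and the document declares no default
    # namespace, so the plain tag name matches.
    new_root.extend(ET.fromstring(soap_text).findall(".//return"))
    return new_root

result = extract_returns(soap)
print(ET.tostring(result, encoding="unicode"))
```

To write the result to a file instead of printing it, something like ET.ElementTree(result).write('result2.xml') should work.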


2 Comments

Hey Jack, thanks for the answer. The XML root node is missing because this is the response to a SOAP request, so I had to add it manually.
@CZarate29 Glad it worked for you.
