0

I'm trying to load an XML file in Python and remove every row where the entityNo is 1111111111.

Here is a text copy of the xml data:

<memberBasedResearchDataImport>
   <surveyDescr>D520</surveyDescr>
   <surveyType>MEG</surveyType>
   <surveyRequester>1543588274</surveyRequester>
   <product>DISC</product>
   <externalRef>PKG_RPTA88425_4</externalRef>
   <DateTimeCreated>20191019 05:10:33</DateTimeCreated>
   <identifierSettings>
       <identifierType id="1" database="DARE" schema="dp_da_crm" table="ratings" column="object_cd" columnType="number"></identifierType>
       <identifierType id="2" database="DARE" schema="dp_da_ent" table="entity" column="full_name" columnType="varchar2"></identifierType>
       <identifierType id="3" database="dual" schema="dual" table="dual" column="dual" columnType="varchar2"></identifierType>
   </identifierSettings>
   <row id="1" entityNo="1054354679" entityRole="KP" policyNo="0" agentEntityNo="1103354880">
       <templateValue name="INTERACTION_DAY" value="Friday"></templateValue>
       <identifierType id="1" value="671535634817"></identifierType>
       <identifierType id="2" value="CUSTOMER SERVICES: SALES"></identifierType>
   </row>
   <row id="2" entityNo="1111111111" entityRole="AP" policyNo="0" agentEntityNo="11351512571">
       <templateValue name="INTERACTION_DAY" value="Friday"></templateValue>
       <identifierType id="1" value="6715354549"></identifierType>
       <identifierType id="2" value="CUSTOMER SERVICES: ADMIN"></identifierType>
   </row>
   <row id="3" entityNo="100000571" entityRole="LP" policyNo="0" agentEntityNo="112355274">
       <templateValue name="INTERACTION_DAY" value="Friday"></templateValue>
       <identifierType id="1" value="671546864"></identifierType>
       <identifierType id="2" value="CUSTOMER SERVICES: SALES"></identifierType>
   </row>
   <row id="4" entityNo="1111111111" entityRole="HP" policyNo="0" agentEntityNo="112456466850">
       <templateValue name="INTERACTION_DAY" value="Friday"></templateValue>
       <identifierType id="1" value="6793437110"></identifierType>
       <identifierType id="2" value="CUSTOMER SERVICES: RETURNS"></identifierType>
   </row>
</memberBasedResearchDataImport>

So far I have tried a few solutions I found online, with no success. The code below comes from another post, but it doesn't remove the data I need removed. Any help would be highly appreciated. Again, I need to delete the rows where entityNo = 1111111111 and then export the result in XML format.

from xml.etree.ElementTree import ElementTree

path_to_xml_file = "C:\Users\username\Documents\Data_File.xml"

tree = ElementTree()
tree.parse(path_to_xml_file)

foos = tree.findall("entityNo")
for foo in foos:
  bars = foo.find("1111111111")
  for bar in bars:
    foo.remove(bar)

tree.write("C:\Users\username\Documents\Data_File.xml")

3 Answers

1

Instead of trying to find all "entityNo" elements, loop through the rows and check whether the entityNo attribute is "1111111111"; if it is, remove the row. Something like this:

root = tree.getroot()
for row in root.findall('row'):
    if row.attrib['entityNo'] == "1111111111":
        root.remove(row)
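
To try the loop without touching the real file, here is a minimal sketch that runs it on an inline sample shaped like the question's data. The function name `drop_rows` and the shortened `SAMPLE` string are made up for illustration; they are not part of the original code.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the question's file (hypothetical sample data).
SAMPLE = """<memberBasedResearchDataImport>
   <row id="1" entityNo="1054354679" entityRole="KP"/>
   <row id="2" entityNo="1111111111" entityRole="AP"/>
   <row id="3" entityNo="100000571" entityRole="LP"/>
   <row id="4" entityNo="1111111111" entityRole="HP"/>
</memberBasedResearchDataImport>"""


def drop_rows(root, entity_no):
    """Remove every direct <row> child whose entityNo equals entity_no."""
    # findall() returns a list, so removing while looping over it is safe.
    for row in root.findall("row"):
        if row.get("entityNo") == entity_no:
            root.remove(row)
    return root


root = drop_rows(ET.fromstring(SAMPLE), "1111111111")
remaining_ids = [row.get("id") for row in root.findall("row")]
print(remaining_ids)
```

Only the rows with a different entityNo should be left afterwards. Writing the result back is a separate step (`tree.write(...)` in the question's code).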

2 Comments

That's not gonna work without actually saving changes back to the file
@Madi7, he already has the code for that....I just modified his iteration code for him.
1

Here you go

import xml.etree.ElementTree as ET

# Raw string so the backslashes in the Windows path are not treated as escapes
path_to_xml_file = r"C:\Users\username\Documents\Data_File.xml"

tree = ET.parse(path_to_xml_file)
root = tree.getroot()  # remove() is a method of elements, not of the ElementTree

for row in root.findall('row'):
    # Attribute values are strings, so compare against a string
    if row.attrib['entityNo'] == "1111111111":
        root.remove(row)

tree.write(path_to_xml_file)

There are a few mistakes in your original code:

  1. Your import works, but it only brings in the ElementTree class. Importing the module as ET, as in my code, is the usual convention and gives you ET.parse directly.
  2. entityNo is an attribute of each row, not a tag, so findall("entityNo") finds nothing. Find the row elements instead and read the attribute with .attrib[], as in my snippet.
  3. Most importantly, remove() has to be called on the parent of the element you want to delete. In your code, foo.remove(bar) tries to remove a child of foo, but the rows you want gone are children of the root element, so the call has to be made on the root.
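
These points can be checked on a tiny document. The following is only a sketch: the two-row `<data>` string and the variable names are invented for the demonstration.

```python
import xml.etree.ElementTree as ET

# Tiny made-up document with the same row/entityNo shape as the question.
root = ET.fromstring(
    '<data>'
    '<row id="1" entityNo="1054354679"/>'
    '<row id="2" entityNo="1111111111"/>'
    '</data>'
)

# entityNo is an attribute, not a tag, so a tag search finds nothing.
by_tag = root.findall("entityNo")

# Attribute values are always strings; comparing to an int never matches.
second = root.findall("row")[1]
matches_int = second.attrib["entityNo"] == 1111111111
matches_str = second.attrib["entityNo"] == "1111111111"

# remove() is called on the parent (root), not on the row being deleted.
root.remove(second)
left = [row.get("id") for row in root]
```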

Hope this helps..


1

Try this one:

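import xml.etree.ElementTree as ET


# Raw string keeps the backslashes in the Windows path literal
file = r'C:\Users\username\Documents\Data_File.xml'
case = '1111111111'

tree = ET.parse(file)
root = tree.getroot()

# Iterate over a copy: removing from root while looping over it can skip siblings
for child in list(root):
    if child.attrib.get('entityNo') == case:
        root.remove(child)

tree.write(file)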
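
If you would rather not overwrite the input, the same idea can be wrapped in a function that writes to a separate output path. This is a sketch under assumptions: `strip_entity`, the temporary file names and the two-row test document are hypothetical. The `row[@entityNo='...']` predicate is part of the limited XPath support in ElementTree's findall().

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def strip_entity(src_path, dst_path, entity_no):
    """Write src_path to dst_path without the rows matching entity_no."""
    tree = ET.parse(src_path)
    root = tree.getroot()
    # Predicate selects only direct <row> children with that attribute value.
    for row in root.findall(f"row[@entityNo='{entity_no}']"):
        root.remove(row)
    tree.write(dst_path, encoding="utf-8", xml_declaration=True)
    # Number of rows that survived the filter.
    return len(root.findall("row"))


# Usage with throwaway files (placeholder paths, not the asker's).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.xml")
    dst = os.path.join(tmp, "out.xml")
    with open(src, "w", encoding="utf-8") as fh:
        fh.write('<d><row entityNo="1111111111"/><row entityNo="5"/></d>')
    kept = strip_entity(src, dst, "1111111111")
    with open(dst, encoding="utf-8") as fh:
        written = fh.read()
```

Keeping the source file untouched makes it easy to rerun the filter if the first result is not what you expected.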
