I have written code that removes from tes.xml every country whose rank is not in the list lis, and writes the updated XML to output.xml. However, the output still contains countries whose ranks are not in the list.
XML (tes.xml):
<?xml version="1.0"?>
<data>
    <continents>
        <country>
            <state>
                <rank updated="yes">123456</rank>
                <year>2008</year>
                <gdppc>141100</gdppc>
                <neighbor name="Austria" direction="E"/>
                <neighbor name="Switzerland" direction="W"/>
            </state>
            <zones>
                <pretty>yes</pretty>
            </zones>
        </country>
        <country>
            <state>
                <rank updated="yes">789045</rank>
                <year>2011</year>
                <gdppc>59900</gdppc>
                <gpc>59900</gpc>
                <neighbor name="Malaysia" direction="N"/>
            </state>
            <zones>
                <pretty>No</pretty>
            </zones>
            <market>
                <pretty>cool</pretty>
            </market>
        </country>
        <country>
            <state>
                <rank updated="yes">67846464</rank>
                <year>2011</year>
                <gdppc>59900</gdppc>
                <gpc>59900</gpc>
                <neighbor name="Malaysia" direction="N"/>
            </state>
            <zones>
                <pretty>No</pretty>
            </zones>
            <market>
                <pretty>cool</pretty>
            </market>
        </country>
    </continents>
</data>
Code:
import xml.etree.ElementTree as ET

tree = ET.parse('tes.xml')
lis = ["123456"]  # ranks to keep
root = tree.getroot()
print('root is', root)
print(type(root))

for continent in root.findall('.//continents'):
    for country in continent:
        rank = country.find('state/rank').text
        print(rank)
        # Drop the country if its rank is not in the list
        if rank not in lis:
            continent.remove(country)

tree.write('output.xml')
Console output: not all ranks from the XML are printed. 67846464 is skipped, so that country also ends up in output.xml even though its rank is not in the list.
root is <Element 'data' at 0x7f5929a9d8b0>
<class 'xml.etree.ElementTree.Element'>
123456
789045
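The skipped rank looks consistent with removing children from an element while a for loop is still iterating over it. Below is a minimal sketch that reproduces the effect on a trimmed-down, hypothetical stand-in for tes.xml (the inline XML and the names `parent`, `visited` and `remaining` are assumptions made for illustration, not part of the original code):

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for tes.xml: three countries, same ranks.
parent = ET.fromstring(
    "<continents>"
    "<country><rank>123456</rank></country>"
    "<country><rank>789045</rank></country>"
    "<country><rank>67846464</rank></country>"
    "</continents>"
)

visited = []
for country in parent:              # iterates over the live child list
    rank = country.find("rank").text
    visited.append(rank)
    if rank != "123456":
        parent.remove(country)      # later children shift left by one position

remaining = [c.find("rank").text for c in parent]
print(visited)    # ['123456', '789045']
print(remaining)  # ['123456', '67846464']
```

After the second child is removed, the third child moves into its position, so the loop ends without ever visiting it.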
Current output: contains two ranks, 123456 and 67846464
<data>
    <continents>
        <country>
            <state>
                <rank updated="yes">123456</rank>
                <year>2008</year>
                <gdppc>141100</gdppc>
                <neighbor name="Austria" direction="E" />
                <neighbor name="Switzerland" direction="W" />
            </state>
            <zones>
                <pretty>yes</pretty>
            </zones>
        </country>
        <country>
            <state>
                <rank updated="yes">67846464</rank>
                <year>2011</year>
                <gdppc>59900</gdppc>
                <gpc>59900</gpc>
                <neighbor name="Malaysia" direction="N" />
            </state>
            <zones>
                <pretty>No</pretty>
            </zones>
            <market>
                <pretty>cool</pretty>
            </market>
        </country>
    </continents>
</data>
Expected output: only 123456 should appear, since 67846464 is not in the list
<data>
    <continents>
        <country>
            <state>
                <rank updated="yes">123456</rank>
                <year>2008</year>
                <gdppc>141100</gdppc>
                <neighbor name="Austria" direction="E" />
                <neighbor name="Switzerland" direction="W" />
            </state>
            <zones>
                <pretty>yes</pretty>
            </zones>
        </country>
    </continents>
</data>
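One way to get the expected output might be to iterate over a snapshot of the children, so that removals do not disturb the loop. The following is only a sketch under that assumption: the helper name `keep_ranked_countries` and the inline sample XML are hypothetical, and the real script would still parse tes.xml and call tree.write('output.xml') afterwards.

```python
import xml.etree.ElementTree as ET

def keep_ranked_countries(root, wanted):
    """Remove every <country> whose state/rank text is not in `wanted`.

    Hypothetical helper for illustration. Countries with no state/rank
    element are removed too, because findtext() returns None for them.
    """
    for continent in root.findall(".//continents"):
        # list(...) takes a snapshot, so remove() cannot shift the iteration
        for country in list(continent):
            rank = country.findtext("state/rank")
            if rank not in wanted:
                continent.remove(country)
    return root

# Hypothetical, trimmed-down stand-in for tes.xml.
sample = ET.fromstring(
    "<data><continents>"
    "<country><state><rank>123456</rank></state></country>"
    "<country><state><rank>789045</rank></state></country>"
    "<country><state><rank>67846464</rank></state></country>"
    "</continents></data>"
)

keep_ranked_countries(sample, {"123456"})
kept = [rank.text for rank in sample.iter("rank")]
print(kept)  # ['123456']
```

Using a set for the wanted ranks is a design choice for faster membership checks; the original list lis would work the same way.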