data.xml
<?xml version="1.0" encoding="UTF-8"?>
<ArticleSet>
<Article>
<LastName>Bojarski</LastName>
<ForeName>-</ForeName>
<Affiliation>-</Affiliation>
</Article>
<Article>
<LastName>Genç</LastName>
<ForeName>Yasemin</ForeName>
<Affiliation>fgjfgnfgn</Affiliation>
</Article>
</ArticleSet>
SAMPLE CODE
from lxml import etree
dom = etree.parse('data.xml')
root = dom.getroot()
for article in dom.xpath('Article[Affiliation="-"]'):
    root.remove(article)
dom.write('output.xml')
This code deletes articles whose Affiliation is equal to -, i.e. whose affiliation tag looks like <Affiliation>-</Affiliation>.
When I store the remaining output in output.xml, the Unicode character ç in Genç is written as the character reference &#231; (Gen&#231;). I want to store it as it is.
Code's output
<ArticleSet>
<Article>
<LastName>Gen&#231;</LastName>
<ForeName>Yasemin</ForeName>
<Affiliation>fgjfgnfgn</Affiliation>
</Article>
</ArticleSet>
Required output
<ArticleSet>
<Article>
<LastName>Genç</LastName>
<ForeName>Yasemin</ForeName>
<Affiliation>fgjfgnfgn</Affiliation>
</Article>
</ArticleSet>
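For reference, a minimal sketch of the difference between the two outputs, assuming the cause is that the serializer falls back to ASCII when no encoding is given. The helper name drop_placeholder_articles and the in-memory SAMPLE are made up for illustration; the sketch prefers lxml and falls back to the standard library's ElementTree, which accepts the same calls used here.

```python
# Prefer lxml as in the question; fall back to the standard library,
# which supports the same calls used below.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# In-memory copy of data.xml (illustrative; the question reads it from disk).
SAMPLE = (
    '<ArticleSet>'
    '<Article><LastName>Bojarski</LastName><ForeName>-</ForeName>'
    '<Affiliation>-</Affiliation></Article>'
    '<Article><LastName>Gen\u00e7</LastName><ForeName>Yasemin</ForeName>'
    '<Affiliation>fgjfgnfgn</Affiliation></Article>'
    '</ArticleSet>'
).encode('utf-8')


def drop_placeholder_articles(xml_bytes):
    """Hypothetical helper: remove articles whose Affiliation is "-"."""
    root = etree.fromstring(xml_bytes)
    # findall returns a list, so removing while looping is safe.
    for article in root.findall('Article[Affiliation="-"]'):
        root.remove(article)
    return root


root = drop_placeholder_articles(SAMPLE)

# No encoding given: ASCII output, non-ASCII characters become references.
ascii_out = etree.tostring(root)

# Explicit UTF-8: the character is written as-is (as UTF-8 bytes).
utf8_out = etree.tostring(root, encoding='UTF-8')
print(utf8_out.decode('utf-8'))

# File-based equivalent for the script in the question:
# dom.write('output.xml', encoding='UTF-8', xml_declaration=True)
```

If this assumption holds, passing encoding='UTF-8' to dom.write should be enough to keep Genç unchanged in output.xml; xml_declaration=True only adds the <?xml ...?> header line.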