I am trying to read an XML file with the following content:

<tu creationdate="20100624T160543Z" creationid="SYSTEM" usagecount="0">
    <prop type="x-source-tags">1=A,2=B</prop>
    <prop type="x-target-tags">1=A,2=B</prop>
    <tuv xml:lang="EN">
      <seg>Modified <ut x="1"/>Denver<ut x="2"/> Score</seg>
    </tuv>
    <tuv xml:lang="DE">
      <seg>Modifizierter <ut x="1"/>Denver<ut x="2"/>-Score</seg>
    </tuv>
  </tu>

using the following code:

tree = ET.parse(tmx)
root = tree.getroot()
seg = root.findall('.//seg')
for n in seg:
    print(n.text)

It gave the following output:

Modified
Modifizierter

What I expected was:

Modified Denver Score
Modifizierter Denver -Score

Can someone explain why only part of each seg element is printed?

2 Answers

You need to be aware of the tail property, which is the text that follows an element's end tag. It is explained well here: http://infohost.nmt.edu/tcc/help/pubs/pylxml/web/etree-view.html.

"Denver" is the tail of the first <ut> element and " Score" is the tail of the second <ut> element. These strings are not part of the text of the <seg> element.
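To make the text/tail split concrete, here is a minimal sketch that parses only the English seg line from the question (inlined as a string, which is an assumption made for illustration) and prints each piece with the standard library's ElementTree:

```python
import xml.etree.ElementTree as ET

# One <seg> element copied from the question, inlined for illustration.
seg = ET.fromstring('<seg>Modified <ut x="1"/>Denver<ut x="2"/> Score</seg>')

# .text holds only what comes before the first child element.
print(repr(seg.text))

# Each child's .tail holds the text between its end tag and the next tag.
for ut in seg:
    print(ut.tag, ut.get("x"), repr(ut.text), repr(ut.tail))
```

The two ut elements are empty, so their own text is None; everything after "Modified " lives in their tails.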

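In addition to the solution provided by kgbplus (which works with both ElementTree and lxml), you can also use the following methods to get the wanted output. Note that xpath() is available only in lxml, while itertext() also exists in the standard library's xml.etree.ElementTree: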

  1. xpath()

    for n in seg:
        print("".join(n.xpath("text()")))
    
  2. itertext()

    for n in seg:
        print("".join(n.itertext()))
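As a self-contained sketch of the itertext() approach, the following uses only the standard library and a trimmed copy of the question's XML (inlined as the hypothetical constant SAMPLE rather than read from the tmx file):

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question; SAMPLE is a name chosen for this sketch.
SAMPLE = """<tu creationdate="20100624T160543Z" creationid="SYSTEM" usagecount="0">
  <tuv xml:lang="EN">
    <seg>Modified <ut x="1"/>Denver<ut x="2"/> Score</seg>
  </tuv>
  <tuv xml:lang="DE">
    <seg>Modifizierter <ut x="1"/>Denver<ut x="2"/>-Score</seg>
  </tuv>
</tu>"""

root = ET.fromstring(SAMPLE)

# itertext() yields the element's text plus the text and tails of all
# descendants in document order, but not the element's own tail.
segments = ["".join(seg.itertext()) for seg in root.findall(".//seg")]
for s in segments:
    print(s)
```

Because the tail of seg itself is skipped, no stripping of trailing whitespace is needed with this approach.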
    
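You can use the tostring function with method="text":

tree = ET.parse(tmx)
root = tree.getroot()
seg = root.findall('.//seg')
for n in seg:
    # encoding="unicode" makes tostring return str rather than bytes on Python 3
    print(ET.tostring(n, method="text", encoding="unicode"))

The resulting string also includes the element's own tail (here, the newline and indentation after </seg>), so you may want to strip it by changing the last line to:

print(ET.tostring(n, method="text", encoding="unicode").strip())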
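A small sketch of why the strip() call helps, assuming the standard library's ElementTree on Python 3 and a cut-down tuv element inlined as a string for illustration:

```python
import xml.etree.ElementTree as ET

# A cut-down <tuv> with a newline after </seg>, inlined for illustration.
root = ET.fromstring(
    '<tuv>\n  <seg>Modified <ut x="1"/>Denver<ut x="2"/> Score</seg>\n</tuv>'
)
seg = root.find("seg")

# method="text" drops the markup; the serialized result also carries seg's
# own tail, which here is the newline that follows </seg>.
raw = ET.tostring(seg, method="text", encoding="unicode")
print(repr(raw))

# strip() removes that trailing whitespace.
print(raw.strip())
```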
