Finding XML text content from tag name in Python

Question

I'm certain this type of question has been asked before, but I can't seem to find the right set of words to find the answer myself...

I've got an XML file, for example

<document>
   <page>
      <title>title1</title>
      <id>1</id>
      <text>this is text1</text>
   </page>
   <page>
      <title>title2</title>
      <id>2</id>
      <text>this is text2</text>
   </page>
   <page>
      <title>title3</title>
      <id>3</id>
      <comment>random comment</comment>
      <text>this is text3</text>
   </page>
</document>

I am trying to find a way to, ideally, store the values within each tag in an array.

I had originally tried just printing everything with the code below, but that only worked until it reached the page with the extra <comment> tag, which throws off the indexing. So, is there a way to simply get the text from a tag by name? Or is there an absolute need to know the array index?

import xml.etree.ElementTree as ET
tree = ET.parse('./xml_file.xml')
root = tree.getroot()

for child in root:
    print(child[2].text)

I apologize if this is a common question; I really couldn't find any answers online.

Benjamin Loison · Accepted Answer · 2025-05-29 13:44:22Z

7

import xml.etree.ElementTree as ET
tree = ET.parse('./xml_file.xml')
my_text = [item.text for item in tree.iter()]

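This will give you a list of the text of every element in the document. Note that container elements such as <document> and <page> contribute whitespace-only strings. If you want some specific text you can use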

my_tags = [item.text for item in tree.iter() if item.text == "title1"]
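
As a minimal sketch of selecting by tag name rather than by text value: iter() accepts a tag name, so it can collect, for example, every <id> value. The snippet below parses an inline copy of the question's sample so that it runs without a file; the names XML, ids and all_text are illustrative and not part of the original answer.

```python
import xml.etree.ElementTree as ET

# Inline copy of the question's sample document.
XML = """<document>
   <page>
      <title>title1</title>
      <id>1</id>
      <text>this is text1</text>
   </page>
   <page>
      <title>title2</title>
      <id>2</id>
      <text>this is text2</text>
   </page>
   <page>
      <title>title3</title>
      <id>3</id>
      <comment>random comment</comment>
      <text>this is text3</text>
   </page>
</document>"""

root = ET.fromstring(XML)

# iter("id") visits only elements whose tag is "id", at any depth.
ids = [elem.text for elem in root.iter("id")]

# Without a tag filter, <document> and <page> yield whitespace-only text,
# so drop anything that is empty after stripping.
all_text = [
    elem.text.strip()
    for elem in root.iter()
    if elem.text and elem.text.strip()
]

print(ids)
print(all_text)
```

Because the lookup is by tag name, the extra <comment> element in the third page does not shift anything.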

edited May 29 at 13:44

Benjamin Loison

answered Dec 11, 2016 at 18:37

nick_gabpe

1 Comment

Sachin Sridhar Over a year ago

You saved my life! How do I find a value using the tag? For example, for id, I need the id value.

malat · Accepted Answer · 2025-04-07 15:32:46Z

4

Since it sounds from your question like you're looking to get a specific tag, you can simply use find(<key_name>).text to get the text content of the child element with that name:

import xml.etree.ElementTree as ET
tree = ET.parse('./xml_file.xml')
root = tree.getroot()
for x in root:
    print(x.find("title").text)

>>>
   title1
   title2
   title3
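
To get closer to the question's goal of storing the values in an array, one possible sketch is to build one dictionary per <page>, keyed by tag name. It uses a trimmed inline copy of the sample (the first and third pages); the names XML, pages, titles and comments are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of the question's sample: one plain page and the
# page that carries the extra <comment> element.
XML = """<document>
   <page>
      <title>title1</title>
      <id>1</id>
      <text>this is text1</text>
   </page>
   <page>
      <title>title3</title>
      <id>3</id>
      <comment>random comment</comment>
      <text>this is text3</text>
   </page>
</document>"""

root = ET.fromstring(XML)

# One dict per <page>, mapping each child's tag name to its text.
pages = [
    {child.tag: child.text for child in page}
    for page in root.findall("page")
]

titles = [p["title"] for p in pages]
# dict.get() returns None for pages that have no <comment>.
comments = [p.get("comment") for p in pages]

print(titles)
print(comments)
```

This assumes each tag appears at most once per page; a repeated tag would overwrite the earlier value in the dictionary.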

edited Apr 7 at 15:32

malat

answered Dec 11, 2016 at 18:36

yampelo

3 Comments

Ronan Over a year ago

I'm not sure why, but this is giving me "TypeError: 'ElementTree' object is not iterable." Regardless, thanks for the help; I will look into this find function. In the meantime I got an alternate answer from nick_gabpe which works. I do appreciate this, however; more to learn is always great.

yampelo Over a year ago

Sorry, it should be "for x in root", not "for x in tree". It'll work if you change that.

ch4rl1e97 Over a year ago

Does this still work? I appear to have an issue where x.find(string) returns None for items that don't contain a matching tag, which then causes an error when trying to get the .text attribute, so it just crashes. I can't think of a nice way around this without a massive sequence of try/except blocks (I need to access several differently named tags in a single file).
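
One way around the None problem raised in the comment above, sketched with the standard library only: Element.findtext() takes a default that is returned when no matching child exists, so no try/except is needed. The snippet uses a trimmed inline copy of the question's sample; XML, rows and the "(none)" placeholder are illustrative choices.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of the question's sample: only the second page
# has a <comment> element.
XML = """<document>
   <page>
      <title>title1</title>
      <id>1</id>
      <text>this is text1</text>
   </page>
   <page>
      <title>title3</title>
      <id>3</id>
      <comment>random comment</comment>
      <text>this is text3</text>
   </page>
</document>"""

root = ET.fromstring(XML)

rows = []
for page in root.findall("page"):
    # findtext() returns the default instead of raising when the tag is missing.
    title = page.findtext("title", default="")
    comment = page.findtext("comment", default="(none)")
    rows.append((title, comment))

    # Equivalent explicit check with find(), if the element itself is needed.
    elem = page.find("comment")
    comment_via_find = elem.text if elem is not None else "(none)"
    assert comment_via_find == comment

print(rows)
```

The same call can be repeated for each tag name of interest, which keeps the per-tag handling to one line.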

Benjamin Loison · Accepted Answer · 2025-05-29 13:45:36Z

0

You can also use pandas read_xml():

from io import StringIO

import pandas as pd

xml_="""<document>
   <page>
      <title>title1</title>
      <id>1</id>
      <text>this is text1</text>
   </page>
   <page>
      <title>title2</title>
      <id>2</id>
      <text>this is text2</text>
   </page>
   <page>
      <title>title3</title>
      <id>3</id>
      <comment>random comment</comment>
      <text>this is text3</text>
   </page>
</document>"""

# Wrap the literal in StringIO: passing a raw XML string to read_xml
# is deprecated as of pandas 2.1.
df = pd.read_xml(StringIO(xml_), xpath="page")
print(df.to_string())

Output:

    title  id           text         comment
0  title1   1  this is text1            None
1  title2   2  this is text2            None
2  title3   3  this is text3  random comment

edited May 29 at 13:45

Benjamin Loison

answered Apr 9 at 16:47

Hermann12
