I have the following XML.

I am using the ElementTree library to extract the values.

<?xml version="1.0" encoding="UTF-8"?>

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
 <url>
  <loc>Test 1</loc>
 </url>
 <url>
  <loc>Test 2</loc>
 </url>
 <url>
  <loc>Test 3</loc>
 </url>
</urlset>

I need to get the values out of the <loc> tags.

Desired Output:

Test 1
Test 2
Test 3

Tried Code:

tree = ET.parse('sitemap.xml')
root = tree.getroot()
for atype in root.findall('url'):
 rank = atype.find('loc').text
print (rank)

Any suggestions on where I am going wrong?

2 Answers

Your XML has a default namespace (http://www.sitemaps.org/schemas/sitemap/0.9), so an unqualified lookup such as findall('url') matches nothing. You have to either address every tag by its fully qualified name:

import xml.etree.ElementTree as ET

tree = ET.parse('sitemap.xml')
root = tree.getroot()
# Each tag name is prefixed with the namespace URI in curly braces
for atype in root.findall('{http://www.sitemaps.org/schemas/sitemap/0.9}url'):
    rank = atype.find('{http://www.sitemaps.org/schemas/sitemap/0.9}loc').text
    print(rank)

Or to define a namespace map:

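import xml.etree.ElementTree as ET

# The prefix "ns" is arbitrary; it only has to match the prefix used in the paths below
nsmap = {"ns": "http://www.sitemaps.org/schemas/sitemap/0.9"}

tree = ET.parse('sitemap.xml')
root = tree.getroot()
for atype in root.findall('ns:url', nsmap):
    rank = atype.find('ns:loc', nsmap).text
    print(rank)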
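
To try this without a file on disk, here is a minimal self-contained sketch. It assumes an inline copy of the sitemap from the question in place of sitemap.xml; the names SITEMAP, NSMAP and extract_locs are made up for illustration and are not part of ElementTree.

```python
import xml.etree.ElementTree as ET

# Inline stand-in for sitemap.xml (assumption: same structure as in the question)
SITEMAP = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
 <url>
  <loc>Test 1</loc>
 </url>
 <url>
  <loc>Test 2</loc>
 </url>
 <url>
  <loc>Test 3</loc>
 </url>
</urlset>"""

# The prefix "ns" is arbitrary; only the URI has to match the document
NSMAP = {"ns": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_locs(xml_bytes):
    """Return the stripped text of every <loc> nested under a <url>."""
    root = ET.fromstring(xml_bytes)
    return [loc.text.strip() for loc in root.findall("ns:url/ns:loc", NSMAP)]


if __name__ == "__main__":
    # The unqualified lookup from the question matches nothing
    print(ET.fromstring(SITEMAP).findall("url"))
    # The namespace-aware lookup finds all three values
    for value in extract_locs(SITEMAP):
        print(value)
```

The path 'ns:url/ns:loc' does both lookups in one step; the two-level loop above is equivalent if you also need the <url> element itself.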

1 Comment

Yeah, I nearly forgot about it. I have addressed that now. Thanks for pointing it out.

from lxml import etree

tree = etree.parse('sitemap.xml')
# iter('*') visits every element in the document, whatever its namespace
for element in tree.iter('*'):
    # element.text is None for empty elements, so check it before searching
    if element.text and 'Test' in element.text:
        print(element.text)

Probably isn't the most beautiful solution, but it works :)

2 Comments

Are we searching for the text inside the <loc> tag?
It checks every element in sitemap.xml in document order: the <urlset> element, then each <url> element followed by its <loc> child. Whenever an element's text contains 'Test' (here, only the <loc> elements), that text is printed.
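
If matching on the text feels fragile, a variation is to walk the tree the same way but filter on the tag name instead. The sketch below swaps in the standard library's ElementTree for lxml and uses an inline copy of the sitemap; NS, SITEMAP and loc_texts are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Inline stand-in for sitemap.xml (assumption: same structure as in the question)
SITEMAP = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
 <url>
  <loc>Test 1</loc>
 </url>
 <url>
  <loc>Test 2</loc>
 </url>
 <url>
  <loc>Test 3</loc>
 </url>
</urlset>"""


def loc_texts(xml_bytes):
    """Collect the text of every <loc> element, at any depth."""
    root = ET.fromstring(xml_bytes)
    # iter() walks the root and all descendants; the qualified tag
    # "{namespace}loc" keeps only the <loc> elements
    return [el.text.strip() for el in root.iter(f"{{{NS}}}loc") if el.text]


if __name__ == "__main__":
    for value in loc_texts(SITEMAP):
        print(value)
```

Filtering by tag means the result no longer depends on the values happening to contain 'Test'.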
