Parsing an XML file using ElementTree in Python 3

Question

I want to parse the XML fragment below to extract each viewpoint tag and the value of its name attribute. I also want to tabulate the extracted data.

My XML file fragment:

 <windows source-height='51'>
        <window class='dashboard' maximized='true' name='Figure 8-59'>
          <viewpoints>
            <viewpoint name='Good Filter Design'>
              <zoom type='entire-view' />
              <geo-search-visibility value='1' />
            </viewpoint>
            <viewpoint name='Poor Filter Design'>
              <zoom type='entire-view' />
              <geo-search-visibility value='1' />
            </viewpoint>
          </viewpoints>
          <active id='-1' />
        </window>
        <window class='dashboard' name='Figure 8-60 thought 8-65'>
          <viewpoints>
            <viewpoint name='Heat Map'>
              <zoom type='entire-view' />
              <geo-search-visibility value='1' />
            </viewpoint>
            <viewpoint name='Lightbulb'>
              <zoom type='entire-view' />
              <geo-search-visibility value='1' />
            </viewpoint>
            <viewpoint name='Sales Histogram'>
              <zoom type='entire-view' />
              <geo-search-visibility value='1' />
            </viewpoint>
          </viewpoints>
          <active id='-1' />
        </window>
</windows>

I want to keep "Good Filter Design" and "Poor Filter Design" in one row and the remaining three viewpoint names in a second row.

My attempt:

root = getroot('example.xml')
for i in root.findall('windows/window/viewpoints/viewpoint'):
    print(i.get('name'))

Use Beautifulsoup.

– Rahul, Commented Feb 17, 2018 at 11:19

har07 · Accepted Answer · 2018-02-17 13:55:17Z

1

Using ElementTree should be just as easy. I don't know what getroot() does exactly, but if it really returns the root element of the XML document, then you shouldn't mention windows in the findall path, because the root already is the windows element:

>>> from xml.etree import ElementTree as ET
>>> raw = '''your XML string'''
>>> root = ET.fromstring(raw)
>>> for v in root.findall('window/viewpoints'):
...     print([a.get('name') for a in v.findall('viewpoint')])
... 
['Good Filter Design', 'Poor Filter Design']
['Heat Map', 'Lightbulb', 'Sales Histogram']

3 Comments

frank hk Over a year ago

How can I put the first two values in one list and the next three values ('Heat Map', 'Lightbulb', 'Sales Histogram') in another list?

frank hk Over a year ago

Thanks @har07 for your quick response.

frank hk Over a year ago

The list you defined is static, but in my problem the order of the names as well as their number may differ. Any idea how to generate the lists dynamically? @har07
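
On the question in the comments about building the lists dynamically: nothing in the answer's loop is tied to two windows or to particular names, so collecting each per-window list into an outer list should adapt to whatever the file contains. A minimal sketch, assuming the same windows root element; group_viewpoints and the two sample strings are hypothetical and not taken from the question:

```python
from xml.etree import ElementTree as ET


def group_viewpoints(xml_text):
    """Return one inner list of viewpoint names per <viewpoints> block."""
    root = ET.fromstring(xml_text)
    return [
        [v.get('name') for v in block.findall('viewpoint')]
        for block in root.findall('window/viewpoints')
    ]


# Two made-up inputs with different counts and ordering.
sample_a = (
    "<windows>"
    "<window><viewpoints><viewpoint name='A'/><viewpoint name='B'/></viewpoints></window>"
    "</windows>"
)
sample_b = (
    "<windows>"
    "<window><viewpoints><viewpoint name='X'/></viewpoints></window>"
    "<window><viewpoints><viewpoint name='Z'/><viewpoint name='Y'/></viewpoints></window>"
    "<window><viewpoints/></window>"
    "</windows>"
)

print(group_viewpoints(sample_a))
print(group_viewpoints(sample_b))
```

Each inner list keeps document order, and a window with no viewpoints simply yields an empty list.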

Rahul · Accepted Answer · 2018-02-17 12:05:20Z

0

If you can use BeautifulSoup, it is this easy:

from bs4 import BeautifulSoup
xml = """your xml"""  # replace with the XML text to parse
soup = BeautifulSoup(xml, 'lxml')
names = [viewpt["name"] for viewpt in soup.find_all('viewpoint')]

This collects the name attribute of every tag named 'viewpoint', wherever it appears in the document.

If you only want the ones nested under windows/window/viewpoints, use this:

names = [viewpoint["name"]
        for windows in soup.find_all('windows')
            for window in windows.find_all("window")
                for viewpoints in window.find_all("viewpoints")
                    for viewpoint in viewpoints.find_all("viewpoint")]

In your case both will give:

Out[18]: 
['Good Filter Design',
 'Poor Filter Design',
 'Heat Map',
 'Lightbulb',
 'Sales Histogram']

edited Feb 17, 2018 at 12:05

answered Feb 17, 2018 at 11:24
