I want to parse the XML fragment below to extract each viewpoint tag and the value of its name attribute. I also want to build a table from the extracted data.

My XML file fragment:

 <windows source-height='51'>
        <window class='dashboard' maximized='true' name='Figure 8-59'>
          <viewpoints>
            <viewpoint name='Good Filter Design'>
              <zoom type='entire-view' />
              <geo-search-visibility value='1' />
            </viewpoint>
            <viewpoint name='Poor Filter Design'>
              <zoom type='entire-view' />
              <geo-search-visibility value='1' />
            </viewpoint>
          </viewpoints>
          <active id='-1' />
        </window>
        <window class='dashboard' name='Figure 8-60 thought 8-65'>
          <viewpoints>
            <viewpoint name='Heat Map'>
              <zoom type='entire-view' />
              <geo-search-visibility value='1' />
            </viewpoint>
            <viewpoint name='Lightbulb'>
              <zoom type='entire-view' />
              <geo-search-visibility value='1' />
            </viewpoint>
            <viewpoint name='Sales Histogram'>
              <zoom type='entire-view' />
              <geo-search-visibility value='1' />
            </viewpoint>
          </viewpoints>
          <active id='-1' />
        </window>
</windows>

I want to keep "Good Filter Design" and "Poor Filter Design" in one row, and the remaining three viewpoint names in a second row.

My attempt:

root = getroot('example.xml')
for i in root.findall('windows/window/viewpoints/viewpoint'):
    print(i.get('name'))
  • Use BeautifulSoup. Commented Feb 17, 2018 at 11:19

2 Answers

1

Using ElementTree should be just as easy. I don't know what getroot() does exactly, but if it really returns the root element of the XML document, then you shouldn't mention windows in the findall path, because windows is the root itself and paths are resolved relative to it:

>>> from xml.etree import ElementTree as ET
>>> raw = '''your XML string'''
>>> root = ET.fromstring(raw)
>>> for v in root.findall('window/viewpoints'):
...     print([a.get('name') for a in v.findall('viewpoint')])
... 
['Good Filter Design', 'Poor Filter Design']
['Heat Map', 'Lightbulb', 'Sales Histogram']
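
Since the question also asks for a table, here is a minimal sketch that builds the rows dynamically, one row per window, so nothing about the number or order of viewpoints is hard-coded. The helper name `viewpoint_rows` and the shortened XML string are my own assumptions for illustration; the child elements of each viewpoint are left out to keep it short.

```python
from xml.etree import ElementTree as ET

# Shortened copy of the XML from the question (assumption: same nesting).
raw = """
<windows source-height='51'>
  <window class='dashboard' maximized='true' name='Figure 8-59'>
    <viewpoints>
      <viewpoint name='Good Filter Design' />
      <viewpoint name='Poor Filter Design' />
    </viewpoints>
  </window>
  <window class='dashboard' name='Figure 8-60 thought 8-65'>
    <viewpoints>
      <viewpoint name='Heat Map' />
      <viewpoint name='Lightbulb' />
      <viewpoint name='Sales Histogram' />
    </viewpoints>
  </window>
</windows>
"""

def viewpoint_rows(xml_text):
    """Return one (window name, [viewpoint names]) tuple per window."""
    root = ET.fromstring(xml_text)  # root is the <windows> element
    rows = []
    for window in root.findall('window'):
        # Collect the viewpoint names that belong to this window only.
        names = [v.get('name') for v in window.findall('viewpoints/viewpoint')]
        rows.append((window.get('name'), names))
    return rows

rows = viewpoint_rows(raw)

# Print a simple two-column text table: window name | viewpoint names.
for window_name, names in rows:
    print(f"{window_name:<26} | {', '.join(names)}")
```

Each tuple in `rows` is one table row, so the same list could be passed to whatever table library you prefer instead of being printed.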

3 Comments

How can I put the first two values in one list and the next three values ('Heat Map', 'Lightbulb', 'Sales Histogram') in another list?
Thank you @har07 for your quick response.
The list you defined is static, but in my problem the order and the number of entries may differ. Any idea how to generate the list dynamically? @har07
0

If you can use BeautifulSoup, it is this easy:

from bs4 import BeautifulSoup
#xml = """your xml"""
soup = BeautifulSoup(xml, 'lxml')
names = [viewpt["name"] for viewpt in soup.find_all('viewpoint')]

This will give the name attribute of every tag named 'viewpoint', wherever it appears in the document.

If you only want the ones nested under windows/window/viewpoints, use this:

names = [viewpoint["name"]
        for windows in soup.find_all('windows')
            for window in windows.find_all("window")
                for viewpoints in window.find_all("viewpoints")
                    for viewpoint in viewpoints.find_all("viewpoint")]

In your case both will give:

Out[18]: 
['Good Filter Design',
 'Poor Filter Design',
 'Heat Map',
 'Lightbulb',
 'Sales Histogram']
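
Both versions above return one flat list, which loses the per-window grouping the question asks for. A rough sketch of a grouped variant follows, assuming the same structure as the question's XML. I use the built-in 'html.parser' here only so the example does not depend on lxml being installed; the shortened XML string and the variable name `grouped` are my own.

```python
from bs4 import BeautifulSoup

# Shortened copy of the XML from the question (assumption: same nesting).
xml = """
<windows source-height='51'>
  <window class='dashboard' maximized='true' name='Figure 8-59'>
    <viewpoints>
      <viewpoint name='Good Filter Design' />
      <viewpoint name='Poor Filter Design' />
    </viewpoints>
  </window>
  <window class='dashboard' name='Figure 8-60 thought 8-65'>
    <viewpoints>
      <viewpoint name='Heat Map' />
      <viewpoint name='Lightbulb' />
      <viewpoint name='Sales Histogram' />
    </viewpoints>
  </window>
</windows>
"""

soup = BeautifulSoup(xml, 'html.parser')

# One inner list per <window>, holding the names of its own viewpoints.
grouped = [
    [viewpoint['name'] for viewpoint in window.find_all('viewpoint')]
    for window in soup.find_all('window')
]

print(grouped)
```

Because the outer loop runs over whatever window tags exist, the number of rows and the number of names per row follow the file rather than being fixed in the code.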
