I'm having trouble extracting some values from an XML file and saving them in lists. Here's part of the XML file:

<CoverRequirements>
  <DayOfWeekCover>
    <Day>Monday</Day>
    <Cover>
      <Shift>E</Shift>
      <Preferred>2</Preferred>
    </Cover>
    <Cover>
      <Shift>L</Shift>
      <Preferred>2</Preferred>
    </Cover>
  </DayOfWeekCover>
  <DayOfWeekCover>
    <Day>Tuesday</Day>
    <Cover>
      <Shift>E</Shift>
      <Preferred>2</Preferred>
    </Cover>
    <Cover>
      <Shift>L</Shift>
      <Preferred>2</Preferred>
    </Cover>
  </DayOfWeekCover>
</CoverRequirements>

Here's my code:

import xml.etree.ElementTree as ET

xmlfile = 'sprint01.xml'

tree = ET.parse(xmlfile)
root = tree.getroot()

days_week_cover = []
shift_cover = []
preferred_cover = []

for cover_data in root.find('CoverRequirements'):
    
    #level 2
    days = cover_data.find('Day')
    days_week_cover.append(days.text)
    #level 3
    cover = cover_data.find('Cover')
    
    shift = cover.find('Shift')
    shift_cover.append(shift.text)
    
    preferred1 = cover.find('Preferred')
    preferred_cover.append(preferred1.text)
    
print(days_week_cover) #I get: ['Monday', 'Tuesday']
print(shift_cover) # I get ['E','E'] instead of ['E','L','E','L']
print(preferred_cover) # I get ['2','2'] instead of ['2','2','2','2'] 

For shift_cover and preferred_cover I expect ['E','L','E','L'] and ['2','2','2','2'], but I get ['E','E'] and ['2','2']. It looks like only the first level-3 element (the first Cover under each DayOfWeekCover) is saved to the lists. I tried adding a nested for loop in the level-3 code to iterate over all of the level-3 elements, but I got an error. Any help would be appreciated, thank you! Also, in terms of running time, is this approach optimal?

2 Answers

1

You are probably better off using lxml with XPath for this type of thing:

from lxml import etree
cover = """[your xml above]"""
doc = etree.XML(cover)

days_week_cover = []
shift_cover = []
preferred_cover = []

for period in doc.xpath('//DayOfWeekCover'):
    days_week_cover.append(period.xpath('.//Day')[0].text)
    for shift in period.xpath('.//Shift'):
        shift_cover.append(shift.text)
    for pref in period.xpath('.//Preferred'):
        preferred_cover.append(pref.text)

print(days_week_cover)
print(shift_cover)
print(preferred_cover)

Output:

['Monday', 'Tuesday']
['E', 'L', 'E', 'L']
['2', '2', '2', '2']
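
If you would rather stay with the standard library, the likely cause of the original result is that `find()` returns only the first matching child, whereas `findall()` returns all of them. Below is a minimal sketch of the same loop using `xml.etree.ElementTree`. The `<Root>` wrapper element is an assumption standing in for whatever encloses `<CoverRequirements>` in the real file, and the XML is inlined here only so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

# Sample from the question. <Root> is a hypothetical wrapper element,
# assumed only because the question shows just part of the file.
XML = """<Root>
  <CoverRequirements>
    <DayOfWeekCover>
      <Day>Monday</Day>
      <Cover><Shift>E</Shift><Preferred>2</Preferred></Cover>
      <Cover><Shift>L</Shift><Preferred>2</Preferred></Cover>
    </DayOfWeekCover>
    <DayOfWeekCover>
      <Day>Tuesday</Day>
      <Cover><Shift>E</Shift><Preferred>2</Preferred></Cover>
      <Cover><Shift>L</Shift><Preferred>2</Preferred></Cover>
    </DayOfWeekCover>
  </CoverRequirements>
</Root>"""

root = ET.fromstring(XML)

days_week_cover = []
shift_cover = []
preferred_cover = []

# Level 2: one iteration per <DayOfWeekCover>
for day_cover in root.findall('./CoverRequirements/DayOfWeekCover'):
    days_week_cover.append(day_cover.findtext('Day'))
    # Level 3: findall() yields every <Cover> child, not just the first
    for cover in day_cover.findall('Cover'):
        shift_cover.append(cover.findtext('Shift'))
        preferred_cover.append(cover.findtext('Preferred'))

print(days_week_cover)
print(shift_cover)
print(preferred_cover)
```

With a real file you would presumably swap `ET.fromstring(XML)` for `ET.parse(xmlfile).getroot()`, as in the question, and adjust the path to match the actual root element.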

0

I suggest using the xmltodict library to have good control over your object.

To install xmltodict, you can use the following command:

pip install xmltodict

And then you can use it in your code like this

import xmltodict

obj = xmltodict.parse("""
<employees>
    <employee>
        <name>Dave</name>
        <role>Sale Assistant</role>
        <age>34</age>
    </employee>
</employees>
""")

print(obj)

P.S.: you can read your file into a string using the following snippet:

with open('data.xml', 'r') as file:
    data = file.read()
