I'm trying to convert the XML response from my request (see the example below) into a pandas DataFrame, but it doesn't work the way it should, and I'm not sure why.

Example XML response

<workingTimes>
<day>
    <date>2015-09-21</date>
    <dayOfWeek>Mon</dayOfWeek>
    <employee>
        <firstName>Albert</firstName>
        <lastName>Grimaldi</lastName>
        <login xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <personnelNumber xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <duration>00:00:00</duration>
        <rest mandatory="00:00:00">00:00:00</rest>
        <costCenter>AB-1234</costCenter>
    </employee>
    <employee>
        <firstName>Max</firstName>
        <lastName>Mustermann</lastName>
        <login xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <personnelNumber>12346</personnelNumber>
        <duration>00:00:00</duration>
        <rest mandatory="00:00:00">00:00:00</rest>
        <costCenter>AB-1234</costCenter>
    </employee>
</day>
<day>
    <date>2015-09-22</date>
    <dayOfWeek>Tue</dayOfWeek>
    <employee>
        <firstName>Albert</firstName>
        <lastName>Grimaldi</lastName>
        <login xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <personnelNumber xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <duration>00:00:00</duration>
        <rest mandatory="00:00:00">00:00:00</rest>
        <costCenter>AB-1234</costCenter>
    </employee>
    <employee>
        <firstName>Max</firstName>
        <lastName>Mustermann</lastName>
        <login xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <personnelNumber>12346</personnelNumber>
        <duration>00:00:00</duration>
        <rest mandatory="00:00:00">00:00:00</rest>
        <costCenter>AB-1234</costCenter>
    </employee>
</day>
</workingTimes>

Code:

import pandas as pd
from xml.etree import ElementTree as et
...
r = requests.get(api_url, headers=headers)

root = et.fromstring(r.content)

df_cols, rows = ['date', 'dayOfWeek', 'firstName', 'lastName', 'duration', 'costCenter'], []
for child in root:
    s_date = child.attrib.get("date")
    s_dayOfWeek = child.attrib.get("dayOfWeek")
    s_firstName = child.find("firstName").text if child is not None else None
    s_lastName = child.find("lastName").text if child is not None else None
    s_duration= child.find("duration").duration if child is not None else None
    s_costCenter= child.find("costCenter").text if child is not None else None

    rows.append({'date': s_date, 'dayOfWeek': s_dayOfWeek, 'firstName': s_firstName, 'lastName': 
        s_lastName, 'duration': s_duration, 's_costCenter': costCenter})

df_xml = pd.DataFrame(rows, columns=df_cols)

And this is part of the documentation: API Documentation

Can anyone tell me what I'm doing wrong?

  • What is the problem? Note that date and dayOfWeek are not at the same level as the rest of the properties you are looking for. Commented Dec 14, 2020 at 13:47
  • @balderman I'm getting an empty DataFrame, even though I know there are entries in my XML! It's as if I'm unable to get into the subelement (is that what it's called?). Commented Dec 14, 2020 at 13:54
  • Start by testing the code I have posted and populate the df. If it works, extend it. Commented Dec 14, 2020 at 14:03

1 Answer

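date and dayOfWeek are child elements of <day>, not attributes, so child.attrib.get(...) returns None for them. firstName, lastName and the other fields sit one level deeper, inside <employee>, so calling find("firstName") on a <day> element also returns None. Loop over the days first, then over the employees of each day. See below (just extend the code to collect more elements):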

import xml.etree.ElementTree as ET

XML = '''<workingTimes>
<day>
    <date>2015-09-21</date>
    <dayOfWeek>Mon</dayOfWeek>
    <employee>
        <firstName>Albert</firstName>
        <lastName>Grimaldi</lastName>
        <login xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <personnelNumber xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <duration>00:00:00</duration>
        <rest mandatory="00:00:00">00:00:00</rest>
        <costCenter>AB-1234</costCenter>
    </employee>
    <employee>
        <firstName>Max</firstName>
        <lastName>Mustermann</lastName>
        <login xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <personnelNumber>12346</personnelNumber>
        <duration>00:00:00</duration>
        <rest mandatory="00:00:00">00:00:00</rest>
        <costCenter>AB-1234</costCenter>
    </employee>
</day>
<day>
    <date>2015-09-22</date>
    <dayOfWeek>Tue</dayOfWeek>
    <employee>
        <firstName>Albert</firstName>
        <lastName>Grimaldi</lastName>
        <login xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <personnelNumber xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <duration>00:00:00</duration>
        <rest mandatory="00:00:00">00:00:00</rest>
        <costCenter>AB-1234</costCenter>
    </employee>
    <employee>
        <firstName>Max</firstName>
        <lastName>Mustermann</lastName>
        <login xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <personnelNumber>12346</personnelNumber>
        <duration>00:00:00</duration>
        <rest mandatory="00:00:00">00:00:00</rest>
        <costCenter>AB-1234</costCenter>
    </employee>
</day>
</workingTimes>'''
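data = []
root = ET.fromstring(XML)
# find every <day> element below the <workingTimes> root
days = root.findall('.//day')
for d in days:
    # <date> is a child element of <day>, so read it with find(), not attrib
    day = d.find('date').text
    # each <day> holds one <employee> element per person
    emp_lst = d.findall('employee')
    for e in emp_lst:
        # TODO collect more data
        data.append(
            {'day': day, 'first_name': e.find('firstName').text, 'last_name': e.find('lastName').text})
for entry in data:
    print(entry)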

output

{'day': '2015-09-21', 'first_name': 'Albert', 'last_name': 'Grimaldi'}
{'day': '2015-09-21', 'first_name': 'Max', 'last_name': 'Mustermann'}
{'day': '2015-09-22', 'first_name': 'Albert', 'last_name': 'Grimaldi'}
{'day': '2015-09-22', 'first_name': 'Max', 'last_name': 'Mustermann'}
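
To get from the list of dicts above to the DataFrame the question asks for, the same nested loop can collect every column and hand the rows to pandas. The following is only a sketch under a few assumptions: the XML is a shortened copy of the sample from the question, the helper name parse_working_times and the constants DAY_FIELDS and EMPLOYEE_FIELDS are made up for illustration, and elements marked xsi:nil (which carry no text) are mapped to None. With the real API, the string XML would presumably be replaced by r.content.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Shortened copy of the sample XML from the question (one day, two employees).
XML = """<workingTimes>
<day>
    <date>2015-09-21</date>
    <dayOfWeek>Mon</dayOfWeek>
    <employee>
        <firstName>Albert</firstName>
        <lastName>Grimaldi</lastName>
        <personnelNumber xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
        <duration>00:00:00</duration>
        <costCenter>AB-1234</costCenter>
    </employee>
    <employee>
        <firstName>Max</firstName>
        <lastName>Mustermann</lastName>
        <personnelNumber>12346</personnelNumber>
        <duration>00:00:00</duration>
        <costCenter>AB-1234</costCenter>
    </employee>
</day>
</workingTimes>"""

# Hypothetical names: fields read from <day> and from each nested <employee>.
DAY_FIELDS = ["date", "dayOfWeek"]
EMPLOYEE_FIELDS = ["firstName", "lastName", "personnelNumber", "duration", "costCenter"]


def parse_working_times(xml_text):
    """Return one dict per (day, employee) pair."""
    root = ET.fromstring(xml_text)
    rows = []
    for day in root.findall("day"):
        # date and dayOfWeek are child elements of <day>, not attributes
        day_values = {name: day.findtext(name) for name in DAY_FIELDS}
        for employee in day.findall("employee"):
            row = dict(day_values)
            for name in EMPLOYEE_FIELDS:
                # findtext() gives "" for an empty element; store None instead
                row[name] = employee.findtext(name) or None
            rows.append(row)
    return rows


rows = parse_working_times(XML)
df_xml = pd.DataFrame(rows, columns=DAY_FIELDS + EMPLOYEE_FIELDS)
print(df_xml)
```

Repeating the day values on every employee row keeps the frame flat, which tends to be the easiest shape to group or filter by date later on.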