I cannot wrap my head around how to extract data from the following XML document.

I've downloaded an XML document through the ECB API.

import urllib.request

access_url = 'https://sdw-wsrest.ecb.europa.eu/service/data/EXR/D.USD.EUR.SP00.A?startPeriod=2000-01-01&endPeriod=2015-12-10'
response = urllib.request.urlretrieve(access_url, 'trial_savename.xml')

This retrieves and saves an XML document that looks like this (first 37 lines shown):

<?xml version="1.0" encoding="UTF-8"?><message:GenericData xmlns:message="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message" xmlns:common="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:generic="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic" xsi:schemaLocation="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message https://sdw-wsrest.ecb.europa.eu:443/vocabulary/sdmx/2_1/SDMXMessage.xsd http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common https://sdw-wsrest.ecb.europa.eu:443/vocabulary/sdmx/2_1/SDMXCommon.xsd http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic https://sdw-wsrest.ecb.europa.eu:443/vocabulary/sdmx/2_1/SDMXDataGeneric.xsd">
<message:Header>
<message:ID>781631bf-c21e-4c88-9657-ae03c858b917</message:ID>
<message:Test>false</message:Test>
<message:Prepared>2015-12-11T16:56:20.723+01:00</message:Prepared>
<message:Sender id="ECB"/>
<message:Structure structureID="ECB_EXR1" dimensionAtObservation="TIME_PERIOD">
<common:Structure>
<URN>urn:sdmx:org.sdmx.infomodel.datastructure.DataStructure=ECB:ECB_EXR1(1.0)</URN>
</common:Structure>
</message:Structure>
</message:Header>
<message:DataSet action="Replace" validFromDate="2015-12-11T16:56:20.723+01:00" structureRef="ECB_EXR1">
<generic:Series>
<generic:SeriesKey>
<generic:Value id="FREQ" value="D"/>
<generic:Value id="CURRENCY" value="USD"/>
<generic:Value id="CURRENCY_DENOM" value="EUR"/>
<generic:Value id="EXR_TYPE" value="SP00"/>
<generic:Value id="EXR_SUFFIX" value="A"/>
</generic:SeriesKey>
<generic:Attributes>
<generic:Value id="SOURCE_AGENCY" value="4F0"/>
<generic:Value id="COLLECTION" value="A"/>
<generic:Value id="DECIMALS" value="4"/>
<generic:Value id="TITLE_COMPL" value="ECB reference exchange rate, US dollar/Euro, 2:15 pm (C.E.T.)"/>
<generic:Value id="UNIT" value="USD"/>
<generic:Value id="TITLE" value="US dollar/Euro"/>
<generic:Value id="UNIT_MULT" value="0"/>
</generic:Attributes>
<generic:Obs>
<generic:ObsDimension value="2000-01-03"/>
<generic:ObsValue value="1.009"/>
<generic:Attributes>
<generic:Value id="OBS_STATUS" value="A"/>
</generic:Attributes>
</generic:Obs>

I want to extract the ObsValue value for every ObsDimension value and keep working with those.

I've tried to use ElementTree in the following way:

import xml.etree.ElementTree as ET
tree = ET.parse('trial_savename.xml')
e = tree.findall('message:GenericData')

This returns an empty list []. I thought I could access the data with something like e = tree.findall('message:GenericData/message:DataSet/generic:Series/generic:Obs/generic:ObsDimension value'), but that doesn't seem to be the way to do it.

What am I getting wrong?

1 Answer

You need to pass the namespaces argument, a dict that maps each prefix used in the path (such as generic) to its namespace URI.

>>> import xml.etree.ElementTree as ET
>>> tree = ET.parse('trial_savename.xml')
>>> ns = {'generic': "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic" }
>>> dimensions = tree.findall('.//generic:ObsDimension', namespaces=ns)
>>> values = [dim.get('value') for dim in dimensions]
>>> values[:5]
['2000-01-03', '2000-01-04', '2000-01-05', '2000-01-06', '2000-01-07']
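Since the question asks for the ObsValue that belongs to each ObsDimension, here is a minimal sketch that pairs the two by iterating over the generic:Obs elements. To stay self-contained it parses an inline sample trimmed from the document above rather than the saved file; the names SAMPLE and observations are made up for illustration, and it assumes every Obs has both children with a numeric value.

```python
import xml.etree.ElementTree as ET

# Trimmed sample of the downloaded document (illustrative; one observation only).
SAMPLE = """<message:GenericData
    xmlns:message="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message"
    xmlns:generic="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic">
<message:DataSet>
<generic:Series>
<generic:Obs>
<generic:ObsDimension value="2000-01-03"/>
<generic:ObsValue value="1.009"/>
</generic:Obs>
</generic:Series>
</message:DataSet>
</message:GenericData>"""

# Prefix-to-URI mapping used to resolve 'generic:' in the paths below.
ns = {'generic': 'http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic'}

root = ET.fromstring(SAMPLE)

observations = []
for obs in root.iterfind('.//generic:Obs', namespaces=ns):
    # Both lookups are relative to the current Obs element.
    date = obs.find('generic:ObsDimension', namespaces=ns).get('value')
    value = obs.find('generic:ObsValue', namespaces=ns).get('value')
    observations.append((date, float(value)))

print(observations)  # [('2000-01-03', 1.009)]
```

To run this against the saved file, replacing ET.fromstring(SAMPLE) with ET.parse('trial_savename.xml').getroot() should be enough.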

If you use lxml, you can pass the root element's nsmap attribute to the xpath method:

>>> import lxml.etree as ET
>>> tree = ET.parse('trial_savename.xml')
>>> values = tree.xpath('.//generic:ObsDimension/@value', namespaces=tree.getroot().nsmap)
>>> values[:5]
['2000-01-03', '2000-01-04', '2000-01-05', '2000-01-06', '2000-01-07']
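The standard library's ElementTree has no nsmap attribute, but a similar mapping can be collected while parsing. The sketch below is one possible approach, not part of the original answer: it gathers the declared prefixes with iterparse and the start-ns event, then reuses them in findall. SAMPLE is an illustrative inline document trimmed from the one above.

```python
import io
import xml.etree.ElementTree as ET

# Trimmed sample of the downloaded document (illustrative; one observation only).
SAMPLE = b"""<message:GenericData
    xmlns:message="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message"
    xmlns:generic="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic">
<message:DataSet>
<generic:Series>
<generic:Obs>
<generic:ObsDimension value="2000-01-03"/>
<generic:ObsValue value="1.009"/>
</generic:Obs>
</generic:Series>
</message:DataSet>
</message:GenericData>"""

# Each 'start-ns' event carries a (prefix, uri) pair for a namespace declaration.
ns = {}
for _event, (prefix, uri) in ET.iterparse(io.BytesIO(SAMPLE), events=('start-ns',)):
    ns[prefix] = uri

root = ET.fromstring(SAMPLE)
values = [dim.get('value')
          for dim in root.findall('.//generic:ObsDimension', namespaces=ns)]

print(values)  # ['2000-01-03']
```

If a document declared a default namespace, it would show up in this dict under an empty-string prefix; the sample here declares none.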