
I'm trying to parse an XML file.
I can parse the top-level tags, but now I have a tag nested within another tag and I'm not getting the correct output.

XML FILE:

<?xml version="1.0" encoding="UTF-8"?>
    <Stations>
    <Station>
    <Code>HT</Code>
    <Type>knooppuntIntercitystation</Type>
    <Namen>
    <Kort>Den Bosch</Kort>
    <Middel>'s-Hertogenbosch</Middel>
    <Lang>'s-Hertogenbosch</Lang>
    </Namen>
    <Land>NL</Land>
    <Synoniemen>
    <Synoniem>Hertogenbosch ('s)</Synoniem>
    <Synoniem>Den Bosch</Synoniem>
    </Synoniemen>
    </Station>
    <Station>
    <Code>ALMO</Code>
    <Type>stoptreinstation</Type>
    <Namen>
    <Kort>Oostvaard</Kort>
    <Middel>Oostvaarders</Middel>
    <Lang>Almere Oostvaarders</Lang>
    </Namen>
    <Land>NL</Land>
    <Synoniemen>
    </Synoniemen>
    </Station>
    <Station>
    <Code>ATN</Code>
    <Type>stoptreinstation</Type>
    <Namen>
    <Kort>Aalten</Kort>
    <Middel>Aalten</Middel>
    <Lang>Aalten</Lang>
    </Namen>
    <Land>NL</Land>
    <Synoniemen>
    </Synoniemen>
    </Station>
    <Station>
    <Code>ASA</Code>
    <Type>intercitystation</Type>
    <Namen>
    <Kort>Amstel</Kort>
    <Middel>Amsterdam Amstel</Middel>
    <Lang>Amsterdam Amstel</Lang>
    </Namen>
    <Land>NL</Land>
    <Synoniemen>
    </Synoniemen>
    </Station>
    </Stations>

My Python code:

import xml.etree.ElementTree

e = xml.etree.ElementTree.parse('info.xml').getroot()

for stationsnamens in e.findall('Station'):
    try:
        syn = stationsnamens.find('Synoniemen/Synoniem').text
        print(syn)
    except:
        print(Exception)

I'm trying to print every Synoniemen field there is, but only if it exists. Also, the 'Code' needs to be printed.

Output Format:

{Code}: {Synoniemen}
  • I've read it, but it only says how to work with the upper layer. I can't find out how I can go Station/Synoniemen/Synoniem. If it would be just Synoniemen for example, I know how to do it. Commented Oct 5, 2016 at 10:07

1 Answer

Something like this (note: I used .fromstring() in this example, but you can adapt it to parse from a file):

import xml.etree.ElementTree
xmlstring = "<root><synoniemen><synoniem>A</synoniem><synoniem>B</synoniem></synoniemen></root>"
e = xml.etree.ElementTree.fromstring(xmlstring)
syn = e.find('synoniemen')
for synoniem in syn:
    print(synoniem.text)

The point is that syn is iterable: it contains multiple child elements, so you can loop over it with a for loop.

So your code will look something like this:

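for stationsnamens in e.findall('Station'):
    code = stationsnamens.find('Code')
    syn = stationsnamens.find('Synoniemen')
    if syn is None:
        continue  # this station has no <Synoniemen> element
    for synoniem in syn:
        # matches the requested "{Code}: {Synoniemen}" format
        print('{}: {}'.format(code.text, synoniem.text))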
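
As an alternative, here is a minimal, self-contained sketch that uses the path 'Synoniemen/Synoniem' with findall() instead of find(). findall() returns an empty list when nothing matches, so stations without synonyms are skipped without a try/except. The XML string is a trimmed sample assumed from the question's info.xml, and station_synonyms is a hypothetical helper name used only for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed sample, assumed to mirror the structure of the question's info.xml
XML = """<Stations>
  <Station>
    <Code>HT</Code>
    <Synoniemen>
      <Synoniem>Hertogenbosch ('s)</Synoniem>
      <Synoniem>Den Bosch</Synoniem>
    </Synoniemen>
  </Station>
  <Station>
    <Code>ALMO</Code>
    <Synoniemen>
    </Synoniemen>
  </Station>
</Stations>"""


def station_synonyms(root):
    """Return one 'Code: Synoniem' string per synonym found (hypothetical helper)."""
    lines = []
    for station in root.findall('Station'):
        code = station.findtext('Code')
        # findall() yields an empty list for stations without <Synoniem> children
        for synoniem in station.findall('Synoniemen/Synoniem'):
            lines.append('{}: {}'.format(code, synoniem.text))
    return lines


root = ET.fromstring(XML)
for line in station_synonyms(root):
    print(line)
```

For a file, ET.parse('info.xml').getroot() can replace ET.fromstring(XML); the rest should stay the same.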

4 Comments

TypeError: 'NoneType' object is not iterable Is what I get
did you change from syn = stationsnamens.find('Synoniemen/Synoniem').text to syn = stationsnamens.find('Synoniemen') ??
Thank you, this is perfect. Do you also know how I can add the <Code> in front of the synoniem?
@SomeName, updated answer with 'Code'... if you want more, I advise you to 'play around with it' ... its what I did when I learned this (not too long ago) ... good luck :-)
