I know there are several answered questions about XML parsing with Python 3, but I can't find answers to the two I have. I am trying to parse and extract information from a BoardGameGeek XML file, available at the following URL (it's too long to paste here):
https://www.boardgamegeek.com/xmlapi/boardgame/10
1) I am having trouble extracting the primary game name from these two lines:
<name sortindex="1" primary="true">Elfenland</name>
<name sortindex="1">Elfenland (Волшебное Путешествие)</name>
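One way the primary name might be selected is to filter on the `primary` attribute rather than on the tag name alone. The sketch below uses the standard library's `xml.etree.ElementTree` instead of BeautifulSoup so that it runs without extra packages. The `SAMPLE` string and its wrapping `<boardgame>` element are assumptions made for illustration; only the two `<name>` lines come from the real file.

```python
import xml.etree.ElementTree as ET

# Illustrative sample: the two <name> lines are from the question,
# the wrapping <boardgame> element is an assumption.
SAMPLE = """<boardgame>
<name sortindex="1" primary="true">Elfenland</name>
<name sortindex="1">Elfenland (Волшебное Путешествие)</name>
</boardgame>"""

root = ET.fromstring(SAMPLE)

# The predicate [@primary='true'] keeps only <name> elements that
# carry that attribute with that exact value.
primary = root.find(".//name[@primary='true']")
primary_name = primary.text if primary is not None else None
print(primary_name)
```

With a BeautifulSoup object, an attribute filter such as `soup.find('name', primary='true')` should play the same role.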
2) I am also having trouble extracting lists of data, such as in this XML:
<poll title="User Suggested Number of Players" totalvotes="96" name="suggested_numplayers">
<results numplayers="1">
<result numvotes="0" value="Best"/>
<result numvotes="0" value="Recommended"/>
<result numvotes="58" value="Not Recommended"/>
</results>
<results numplayers="2">
<result numvotes="2" value="Best"/>
<result numvotes="21" value="Recommended"/>
<result numvotes="53" value="Not Recommended"/>
</results>
<results numplayers="3">
<result numvotes="10" value="Best"/>
<result numvotes="46" value="Recommended"/>
<result numvotes="17" value="Not Recommended"/>
</results>
<results numplayers="4">
<result numvotes="47" value="Best"/>
<result numvotes="36" value="Recommended"/>
<result numvotes="1" value="Not Recommended"/>
</results>
<results numplayers="5">
<result numvotes="35" value="Best"/>
<result numvotes="44" value="Recommended"/>
<result numvotes="2" value="Not Recommended"/>
</results>
<results numplayers="6">
<result numvotes="23" value="Best"/>
<result numvotes="48" value="Recommended"/>
<result numvotes="11" value="Not Recommended"/>
</results>
<results numplayers="6+">
<result numvotes="0" value="Best"/>
<result numvotes="1" value="Recommended"/>
<result numvotes="46" value="Not Recommended"/>
</results>
</poll>
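A nested structure like this poll could be collected by looping over each `<results>` group and then over its `<result>` children. The following is a minimal sketch using the standard library's `xml.etree.ElementTree` rather than BeautifulSoup. The `POLL_SAMPLE` string is a shortened copy of the poll above (only the 1-player and 4-player groups), and the wrapping `<boardgame>` element and the `parse_poll` helper are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the poll from the question; the <boardgame>
# wrapper is an assumption.
POLL_SAMPLE = """<boardgame>
<poll title="User Suggested Number of Players" totalvotes="96" name="suggested_numplayers">
<results numplayers="1">
<result numvotes="0" value="Best"/>
<result numvotes="0" value="Recommended"/>
<result numvotes="58" value="Not Recommended"/>
</results>
<results numplayers="4">
<result numvotes="47" value="Best"/>
<result numvotes="36" value="Recommended"/>
<result numvotes="1" value="Not Recommended"/>
</results>
</poll>
</boardgame>"""


def parse_poll(poll):
    """Hypothetical helper: map numplayers -> {value: numvotes}."""
    votes = {}
    for results in poll.findall("results"):
        # numplayers stays a string because values like "6+" are not integers.
        players = results.get("numplayers")
        votes[players] = {
            result.get("value"): int(result.get("numvotes"))
            for result in results.findall("result")
        }
    return votes


root = ET.fromstring(POLL_SAMPLE)
poll = root.find(".//poll[@name='suggested_numplayers']")
votes = parse_poll(poll)
print(votes["4"])
```

Keying the outer dictionary by the `numplayers` string keeps entries such as `6+` usable without special handling.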
Currently, my code is very simple and looks like this. It only extracts simple, single-value XML elements. Any help on how to extract the more complex information would be great. Thank you.
import urllib.request
from bs4 import BeautifulSoup  # the 'xml' parser also needs lxml installed

url = 'https://www.boardgamegeek.com/xmlapi/boardgame/10'
response = urllib.request.urlopen(url)
data = response.read()  # a `bytes` object
text = data.decode('utf-8')  # a `str`
soup = BeautifulSoup(text, 'xml')
yearpublished = soup.find_all('yearpublished')  # list of matching tags