I get the error below when parsing the XML from the URL in my code. I won't post the XML because it's huge; the link is in the code below.

ERROR:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-70-77e5e1b79ccc> in <module>()
     11 
     12 for child in root.iter('Materia'):
---> 13     if not child.find('EmentaMateria').text is None:
     14             ementa = child.find('EmentaMateria').text
     15 

AttributeError: 'NoneType' object has no attribute 'text'

MY CODE:

url = 'http://legis.senado.leg.br/dadosabertos/senador/4988/autorias'
import requests
from xml.etree import ElementTree

response = requests.get(url, stream=True)
response.raw.decode_content = True

tree = ElementTree.parse(response.raw)

root = tree.getroot()

for child in root.iter('Materia'):
    if child.find('EmentaMateria').text is not None:
            ementa = child.find('EmentaMateria').text

    for child_IdMateria in child.findall('IdentificacaoMateria'):
        anoMateria = child_IdMateria.find('AnoMateria').text
        materia = child_IdMateria.find('NumeroMateria').text
        siglaMateria = child_IdMateria.find('SiglaSubtipoMateria').text



    print('Ano = '+anoMateria+' | Numero Materia = '+materia+' | tipo = '+siglaMateria+' | '+ementa)

What am I overlooking here? Thanks.

2 Answers

Instead of checking whether child.find('EmentaMateria').text is not None, first make sure that child.find('EmentaMateria') itself is not None. find returns None when no matching child element exists, and None has no text attribute, which is exactly what the AttributeError says.

Also, store the return value of child.find('EmentaMateria') so that you don't call it twice.

Lastly, assign ementa a default value when child.find('EmentaMateria') is None; otherwise the print call below will reference a variable that is either undefined or left over from a previous iteration.

Change:

if child.find('EmentaMateria').text is not None:
    ementa = child.find('EmentaMateria').text

to:

node = child.find('EmentaMateria')
if node is not None:
    ementa = node.text
else:
    ementa = None
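
For context, here is a minimal sketch of how the whole loop might look with that check in place. The inline XML is a made-up sample (only the element names come from the question; the real feed's layout and values are assumed), the helper name describe is hypothetical, and the default is an empty string rather than None so that string concatenation keeps working.

```python
from xml.etree import ElementTree

# Hypothetical sample: element names are from the question, values are made up.
# The second Materia deliberately has no EmentaMateria child.
SAMPLE = """<Root>
  <Materia>
    <IdentificacaoMateria>
      <AnoMateria>2015</AnoMateria>
      <NumeroMateria>00001</NumeroMateria>
      <SiglaSubtipoMateria>PLS</SiglaSubtipoMateria>
    </IdentificacaoMateria>
    <EmentaMateria>Example summary</EmentaMateria>
  </Materia>
  <Materia>
    <IdentificacaoMateria>
      <AnoMateria>2016</AnoMateria>
      <NumeroMateria>00002</NumeroMateria>
      <SiglaSubtipoMateria>PEC</SiglaSubtipoMateria>
    </IdentificacaoMateria>
  </Materia>
</Root>"""


def describe(root):
    """Return one formatted line per IdentificacaoMateria found under each Materia."""
    lines = []
    for child in root.iter('Materia'):
        node = child.find('EmentaMateria')
        # Guard against both a missing element and an element without text.
        if node is not None and node.text is not None:
            ementa = node.text
        else:
            ementa = ''
        for ident in child.findall('IdentificacaoMateria'):
            # These three children are assumed to be present, as in the question.
            ano = ident.find('AnoMateria').text
            numero = ident.find('NumeroMateria').text
            sigla = ident.find('SiglaSubtipoMateria').text
            lines.append('Ano = ' + ano + ' | Numero Materia = ' + numero
                         + ' | tipo = ' + sigla + ' | ' + ementa)
    return lines


for line in describe(ElementTree.fromstring(SAMPLE)):
    print(line)
```

Because ementa is reset on every iteration, a Materia without an EmentaMateria no longer inherits the text of the previous one.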

Alternatively, you can use the built-in function getattr to do the same without a temporary variable:

ementa = getattr(child.find('EmentaMateria'), 'text', None)
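
Another option, not mentioned above, is Element.findtext from the standard library, which takes a default for the missing-element case directly. A small sketch on a made-up fragment (the XML here is hypothetical, only the tag names come from the question):

```python
from xml.etree import ElementTree

# Hypothetical fragment: the second Materia has no EmentaMateria child.
root = ElementTree.fromstring(
    "<Root>"
    "<Materia><EmentaMateria>Example summary</EmentaMateria></Materia>"
    "<Materia/>"
    "</Root>"
)

# findtext returns the matching child's text, or `default` when no child matches.
ementas = [m.findtext('EmentaMateria', default='') for m in root.iter('Materia')]
print(ementas)
```

Using an empty string as the default keeps the result a str in every case, so it can be concatenated safely later on.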
2 Comments

Great, thanks! I just removed the .text in the if clause and it worked perfectly. I tried to use getattr but it didn't work: it stops in the middle of the output and shows an error: TypeError: must be str, not NoneType
Thanks, I'm currently working on some scraping and faced the same issue. I used the getattr function and it worked perfectly for me. I'm scraping multiple data points, some of which don't have values, and that gave me an error about not finding .text. Now the code works fine.
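
The TypeError in the first comment is consistent with the question's print line concatenating ementa to a str while ementa is None, which is what the getattr version yields for a missing element. A short sketch of two ways around that, with made-up values:

```python
ementa = None  # what getattr(..., 'text', None) gives when the element is missing

# 'text' + None raises TypeError, so substitute an empty string before concatenating...
line_concat = 'tipo = PLS | ' + (ementa or '')

# ...or use an f-string, which converts the value and renders None as 'None'.
line_fstring = f'tipo = PLS | {ementa}'

print(line_concat)
print(line_fstring)
```

Which one fits depends on whether a missing ementa should appear as blank or as the literal text None in the output.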
0

If you are using the code to parse an XML file, open the file in a text editor and inspect the tags. In my case there were some rogue tags at the end. Once I removed those, the code worked as expected.
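
For a large file, inspecting by eye can be tedious, so one possible shortcut is to let the parser report where it fails. This is only a sketch: the helper name check_well_formed and both sample strings are hypothetical.

```python
from xml.etree import ElementTree


def check_well_formed(xml_text):
    """Return None if xml_text parses, otherwise the parser's error message."""
    try:
        ElementTree.fromstring(xml_text)
    except ElementTree.ParseError as exc:
        return str(exc)
    return None


good = "<Root><Materia/></Root>"
bad = "<Root><Materia/></Root></Stray>"  # stray closing tag after the root

print(check_well_formed(good))
print(check_well_formed(bad))
```

The message from ParseError typically includes a line and column, which narrows down where to look in the file.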
