I'm trying to parse the summary and results entities from the following xml file:

XML file

A small snippet:

<result>
            <resultType>Potential Problem</resultType>
            <lineNum>296</lineNum>
            <columnNum>29</columnNum>
            <errorMsg>&lt;a href=&quot;https://achecker.ca/checker/suggestion.php?id=43&quot;
               onclick=&quot;AChecker.popup('https://achecker.ca/checker/suggestion.php?id=43'); return false;&quot;
               title=&quot;Suggest improvements on this error message&quot; target=&quot;_new&quot;&gt;&lt;code&gt;h2&lt;/code&gt; may be used for formatting.&lt;/a&gt;
            </errorMsg>
            <errorSourceCode>&lt;h2&gt;O portal netemprego.gov.pt foi substitu&iacute;do pelo iefponline.&lt;/h2&gt;</errorSourceCode>
            <sequenceID>296_29_43</sequenceID>
            <decisionPass>This &lt;code&gt;h2&lt;/code&gt; element is really a section header.</decisionPass>
            <decisionFail>This &lt;code&gt;h2&lt;/code&gt; element is used to format text (not really a section header).</decisionFail>
        </result>

I'm getting this error message: xml.etree.ElementTree.ParseError: undefined entity: line 55, column 51. I know that this error is related to the encoding. The file declares UTF-8 in its header, which seems to be the right encoding for the characters contained in the XML. After reading about this and trying multiple workarounds, I'm still not able to avoid the error. What can I do in Python to get past it and parse the summary and results entities?

1 Answer

No, it has nothing to do with encoding. It's because you have an entity reference, &iacute;, that is not defined anywhere. If this were HTML, the entity name would be built in, but that's not the case for XML. Apart from the five predefined entities (amp, lt, gt, apos and quot), entity references in XML are not recognised unless they are declared in the DTD.
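
One possible workaround, if you cannot get the producer of the file to fix it, is to rewrite the HTML-only entity names as numeric character references before handing the text to ElementTree. The following is a minimal sketch, not a definitive fix: the helper name replace_html_entities and the sample string are made up for illustration, and it assumes every undefined name in the file is a standard HTML entity listed in Python's html.entities.name2codepoint.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entity names XML predefines; these must be left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# Matches named entity references such as &iacute;
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_html_entities(text):
    """Hypothetical helper: turn HTML-only named entities into numeric
    character references (e.g. &iacute; -> &#237;), which any XML parser
    accepts without a DTD."""
    def _sub(match):
        name = match.group(1)
        # Leave XML's own entities and unknown names exactly as they are.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#{};".format(name2codepoint[name])
    return _ENTITY_RE.sub(_sub, text)


# Shortened, illustrative version of the <result> snippet from the question.
sample = (
    "<result><errorSourceCode>"
    "&lt;h2&gt;substitu&iacute;do&lt;/h2&gt;"
    "</errorSourceCode></result>"
)

root = ET.fromstring(replace_html_entities(sample))
# Holds the decoded markup, with the accented character in place of &iacute;
source_code = root.find("errorSourceCode").text
```

For the real file you would read its contents as text first and pass the cleaned string to ET.fromstring in the same way. Names that are neither predefined in XML nor known to html.entities are left alone, so they would still raise the same ParseError.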
