I am reading/parsing an XML file with javax.xml.stream.XMLStreamReader.
The file contains the XML fragment shown below.
<Row>
<AccountName value="Paving 101" />
<AccountNumber value="20205" />
<AccountId value="15012" />
<TimePeriod value="2019-08-20" />
<CampaignName value="CMP Paving 101" />
<CampaignId value="34283" />
<AdGroupName value="residential paving" />
<AdGroupId value="1001035" />
<AdId value="790008" />
<AdType value="Expanded text ad" />
<DestinationUrl value="" />
<BidMatchType value="Broad" />
<Impressions value="1" />
<Clicks value="1" />
<Ctr value="100.00%" />
<AverageCpc value="1.05" />
<Spend value="1.05" />
<AveragePosition value="2.00" />
<SearchQuery value="concretedrivewayrepairmethods" />
</Row>
Unfortunately I am getting this error and I am not sure how to resolve it.
Error in downloadXML:
com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x19
at [row,col {unknown-source}]: [674,40]
at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:606)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:479)
at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2448)
at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java:2395)
at com.ctc.wstx.sr.StreamScanner.resolveSimpleEntity(StreamScanner.java:1218)
at com.ctc.wstx.sr.BasicStreamReader.parseAttrValue(BasicStreamReader.java:1929)
at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:3063)
at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2961)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2837)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1072)
The problem seems to be with this character (code 0x19, according to the error message).
Of course I can first read the file simply as a text file, and replace this bad character, and only then parse it with XMLStreamReader but:
1) that approach seems really clumsy to me;
2) it will be a bit difficult to do as the code is quite involved there,
so I am not sure if I want to change it just for this character.
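For reference, the pre-processing workaround described above could look roughly like the sketch below. It is only an illustration under assumptions: the class name `CharRefSanitizer`, the method `stripIllegalCharRefs`, and the sample input (including where the `&#x19;` reference sits) are hypothetical and not taken from the real code or file. It assumes the bad character arrives as a numeric character reference, which is what `resolveSimpleEntity` in the stack trace suggests, and it drops every such reference whose code point falls outside the XML 1.0 character ranges.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class CharRefSanitizer {
    // Matches numeric character references such as &#25; or &#x19;
    private static final Pattern CHAR_REF =
            Pattern.compile("&#(?:x([0-9A-Fa-f]{1,8})|([0-9]{1,10}));");

    // Removes references to code points that XML 1.0 does not allow.
    static String stripIllegalCharRefs(String xml) {
        Matcher m = CHAR_REF.matcher(xml);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            long code = m.group(1) != null
                    ? Long.parseLong(m.group(1), 16)
                    : Long.parseLong(m.group(2));
            // XML 1.0 "Char" ranges
            boolean legal = code == 0x9 || code == 0xA || code == 0xD
                    || (code >= 0x20 && code <= 0xD7FF)
                    || (code >= 0xE000 && code <= 0xFFFD)
                    || (code >= 0x10000 && code <= 0x10FFFF);
            out.append(xml, last, m.start());
            if (legal) {
                out.append(m.group()); // keep valid references untouched
            }
            last = m.end();
        }
        out.append(xml, last, xml.length());
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input: the position of the bad reference is assumed.
        String raw = "<Row><SearchQuery value=\"concrete&#x19;driveway\" /></Row>";
        String clean = stripIllegalCharRefs(raw);

        XMLStreamReader reader = XMLInputFactory.newFactory()
                .createXMLStreamReader(new StringReader(clean));
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "SearchQuery".equals(reader.getLocalName())) {
                System.out.println(reader.getAttributeValue(null, "value"));
            }
        }
        reader.close();
    }
}
```

Note that this reads the whole document into memory and applies the regex blindly, so it would also alter text that merely looks like a character reference inside a CDATA section or a comment.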
Why is the XMLStreamReader unable to handle this character?
Is the XML invalid, or does the parser have a bug that prevents it from handling the character?
That character is not allowed in XML 1.0 (w3.org/TR/xml/#charsets). I'm not sure if it helps you, but the character is allowed in XML 1.1 (w3.org/TR/xml11/#charsets).
The file begins with <?xml version="1.0" encoding="utf-8"?>, so it declares that it's 1.0. Right?
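To make the difference between the two specifications concrete, here is a small sketch that encodes the character ranges from the two linked sections as plain predicates. The class and method names are made up for illustration. It shows that 0x19 is outside the XML 1.0 `Char` production, while XML 1.1 accepts it but lists it as a `RestrictedChar`, meaning it may appear only as a character reference such as `&#x19;` and never as a raw character.

```java
class XmlCharRules {
    // XML 1.0 "Char" production
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // XML 1.1 "Char" production
    static boolean isXml11Char(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // XML 1.1 "RestrictedChar": legal only when written as a character reference
    static boolean isXml11RestrictedChar(int cp) {
        return (cp >= 0x1 && cp <= 0x8)
                || (cp >= 0xB && cp <= 0xC)
                || (cp >= 0xE && cp <= 0x1F)
                || (cp >= 0x7F && cp <= 0x84)
                || (cp >= 0x86 && cp <= 0x9F);
    }

    public static void main(String[] args) {
        int cp = 0x19; // the code reported in the error message
        System.out.println(isXml10Char(cp));           // false
        System.out.println(isXml11Char(cp));           // true
        System.out.println(isXml11RestrictedChar(cp)); // true
    }
}
```

So a parser that rejects `&#x19;` in a document declared as version 1.0 is following the specification rather than misbehaving. Whether switching the declaration to 1.1 helps depends on the parser's XML 1.1 support, which is not verified here.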