
I want to parse 'invalid' XML using a streaming XML parser. I have two options. The first:

XmlReader.Create(...,
    new XmlReaderSettings
    {
        CheckCharacters = false,
        ConformanceLevel = ConformanceLevel.Fragment,
        ValidationFlags = System.Xml.Schema.XmlSchemaValidationFlags.None,
        ValidationType = ValidationType.None
    })

The second:

new XmlTextReader(...) { Namespaces = false, Normalization = false }

The first fails on undeclared namespace prefixes that are present in the XML: '...' is an undeclared prefix.

The second fails on invalid characters: XmlException: '', hexadecimal value 0x13, is an invalid character. Line ...

Is there a way to combine both behaviors (Namespaces = false && CheckCharacters = false) so that parsing does not fail on either undeclared namespace prefixes or invalid characters?

The input "XML" cannot be changed; it is provided as is. It is also huge and cannot be loaded into memory.

Update: XML example

<?xml version="1.0" encoding="UTF-8"?>
<x xmlns="http://www.w3.org/2005/Atom">
    <item>
        <my_ns:id>123 _0x13_here_ dd</my_ns:id>
        <other_ns:value>ABC</other_ns:value>
    </item>
</x>

Here _0x13_here_ stands for the raw character (char)'\x13'. I was wrong: CheckCharacters = false does not help in this case. It only avoids exceptions on character references such as &#x13;.

  • XML with undeclared namespace prefixes is not 'invalid'; it's not well-formed. I don't think you'll have much luck with any parser reading it, really. Disabling namespace support in XmlTextReader will just mean you get errors that : isn't allowed in names. Commented Jun 9, 2016 at 8:15
  • @CharlesMager I've added an example of the XML. XmlTextReader { Namespaces = false } is able to read XML with that kind of namespace prefix without exceptions. Commented Jun 9, 2016 at 20:14
  • Please look at my answer to the similar question here Commented Oct 31, 2017 at 20:25

1 Answer

Here is a solution that combines:
- multiple root elements (ConformanceLevel.Fragment)
- undeclared prefixes (AddNamespace)

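// Fragment conformance allows more than one root element.
var settings = new XmlReaderSettings
{
    NameTable = new NameTable(),
    ConformanceLevel = ConformanceLevel.Fragment
};
// Pre-register each prefix the document uses but never declares.
var nsmgr = new XmlNamespaceManager(settings.NameTable);
nsmgr.AddNamespace("MyNamespace", "http://exemple.com");
// The parser context hands those prefix bindings to the reader.
var context = new XmlParserContext(null, nsmgr, null, XmlSpace.Default);
var reader = XmlReader.Create(stream, settings, context);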
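The snippet above deals with undeclared prefixes only. As the question's update notes, a raw control character such as 0x13 still throws, because CheckCharacters = false covers only character references. One possible workaround, which is not part of the original answer, is to filter the text before the parser sees it. The sketch below assumes that replacing each forbidden character with a space is acceptable for your data. XmlSanitizingReader and SanitizedXmlDemo are hypothetical names made up for this illustration. It relies on XmlTextReader { Namespaces = false }, which the question reports as reading undeclared prefixes without exceptions.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper (not from the answer above): replaces characters that
// XML 1.0 forbids with a space while the text streams through, so the input
// is never buffered as a whole.
public sealed class XmlSanitizingReader : TextReader
{
    private readonly TextReader _inner;

    public XmlSanitizingReader(TextReader inner) { _inner = inner; }

    // XML 1.0 allows #x9, #xA, #xD, #x20-#xD7FF and #xE000-#xFFFD.
    // Surrogate halves are passed through so valid pairs survive.
    private static bool IsAllowed(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= '\u0020' && c <= '\uD7FF')
            || char.IsSurrogate(c)
            || (c >= '\uE000' && c <= '\uFFFD');
    }

    // Leaves -1 (end of input) and allowed characters untouched.
    private static int Clean(int c)
    {
        return (c < 0 || IsAllowed((char)c)) ? c : ' ';
    }

    public override int Peek() { return Clean(_inner.Peek()); }

    public override int Read() { return Clean(_inner.Read()); }

    public override int Read(char[] buffer, int index, int count)
    {
        int n = _inner.Read(buffer, index, count);
        for (int i = index; i < index + n; i++)
        {
            if (!IsAllowed(buffer[i])) buffer[i] = ' ';
        }
        return n;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}

public static class SanitizedXmlDemo
{
    // Collects element names to show that parsing gets past both problems.
    public static List<string> ElementNames(TextReader input)
    {
        var names = new List<string>();
        using (var reader = new XmlTextReader(new XmlSanitizingReader(input))
               { Namespaces = false, Normalization = false })
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element) names.Add(reader.Name);
            }
        }
        return names;
    }
}
```

For a large file, pass something like new StreamReader(path) to SanitizedXmlDemo.ElementNames and replace the loop body with your own processing. Replacing forbidden characters rather than dropping them keeps the character count unchanged, which keeps the buffered Read override simple, at the cost of altering the text content slightly.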