I have to parse an XML document in which the xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" namespace declaration is missing and the type attribute has no xsi: prefix, so the XML looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<program>
  <scriptList>
    <script type="StartScript">
      <isUserScript>false</isUserScript>
    </script>
  </scriptList>
</program>

but should look like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<program xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
  <scriptList>
    <script xsi:type="StartScript">
      <isUserScript>false</isUserScript>
    </script>
  </scriptList>
</program>

The xsi:type attribute selects the correct subclass during deserialization, e.g.:

class StartScript : script
{...}

The parser is auto-generated from a handwritten XSD via $> xsd.exe a.xsd /classes (.NET). Here is the XSD:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
elementFormDefault="qualified" attributeFormDefault="qualified">

  <!-- Main element -->
  <xs:element name="program">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="scriptList">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="script" type="script" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:complexType name="script" />

  <xs:complexType name="StartScript">
    <xs:complexContent>
      <xs:extension base="script">
        <xs:all>
          <xs:element name="isUserScript" type="xs:boolean"></xs:element>
        </xs:all>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>

A simple solution is to run a string replace (" type=\"" to " xsi:type=\"") on the input XML, but this is pretty ugly. Is there a better solution?

  • How are you parsing it? Are you using XmlSerializer, or something else? Are there other types of script or only StartScript? Commented Mar 21, 2016 at 18:48
  • Parsing is simple in C#: _program = (new XmlSerializer(typeof(program))).Deserialize(f) as program; There are a lot of scripts; StartScript is just one example. Commented Mar 21, 2016 at 19:21

1 Answer

You can load your XML into an intermediate LINQ to XML XDocument, fix the attribute namespaces on the <script> elements, then deserialize directly to your final class:

// Load to intermediate XDocument
XDocument xDoc;
using (var reader = XmlReader.Create(f))
    xDoc = XDocument.Load(reader);

// Fix namespace of "type" attributes
XNamespace xsi = "http://www.w3.org/2001/XMLSchema-instance";
foreach (var element in xDoc.Descendants("script"))
{
    var attr = element.Attribute("type");
    if (attr == null)
        continue;
    var newAttr = new XAttribute(xsi + attr.Name.LocalName, attr.Value);
    attr.Remove();
    element.Add(newAttr);
}

// Deserialize directly to final class.
var program = xDoc.Deserialize<program>();

Using the extension method:

public static class XObjectExtensions
{
    public static T Deserialize<T>(this XContainer element, XmlSerializer serializer = null)
    {
        if (element == null)
            throw new ArgumentNullException(nameof(element));
        using (var reader = element.CreateReader())
            return (T)(serializer ?? new XmlSerializer(typeof(T))).Deserialize(reader);
    }
}
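
To try the approach without the generated classes at hand, here is a minimal end-to-end sketch. The program, script and StartScript classes below are hand-written stand-ins that only approximate what xsd.exe would emit for the schema in the question (they are assumptions, not the real generated code), and the XML is inlined as a string instead of being read from f:

```csharp
using System;
using System.Xml.Linq;
using System.Xml.Serialization;

// Hypothetical stand-ins for the xsd.exe-generated classes (assumption).
// XmlInclude tells XmlSerializer that StartScript may appear where script is expected.
[XmlInclude(typeof(StartScript))]
public class script { }

public class StartScript : script
{
    public bool isUserScript { get; set; }
}

public class program
{
    [XmlArrayItem("script")]
    public script[] scriptList { get; set; }
}

public static class Demo
{
    public static void Main()
    {
        // Input as described in the question: "type" carries no xsi: prefix.
        const string xml =
            @"<program>
                <scriptList>
                  <script type=""StartScript"">
                    <isUserScript>false</isUserScript>
                  </script>
                </scriptList>
              </program>";

        var xDoc = XDocument.Parse(xml);

        // Move each unqualified "type" attribute into the xsi namespace.
        XNamespace xsi = "http://www.w3.org/2001/XMLSchema-instance";
        foreach (var element in xDoc.Descendants("script"))
        {
            var attr = element.Attribute("type");
            if (attr == null)
                continue;
            var newAttr = new XAttribute(xsi + "type", attr.Value);
            attr.Remove();
            element.Add(newAttr);
        }

        // Deserialize straight from the in-memory document.
        using (var reader = xDoc.CreateReader())
        {
            var result = (program)new XmlSerializer(typeof(program)).Deserialize(reader);
            // Print the runtime type of the first script to see which class was chosen.
            Console.WriteLine(result.scriptList[0].GetType().Name);
        }
    }
}
```

If the subclass is resolved as intended, the printed type name should be StartScript rather than script. With the real generated classes, only the fix-up loop and the CreateReader call are needed.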