Is it possible to define an XML Schema which only defines attribute datatypes and treats all node values as strings?

I am dealing with a very large XML file which has the following structure:

<A>
    <A_x1 label="xyz" id="1234">string data</A_x1>
    <A_x2 label="xzy" id="1235">string data</A_x2>
    <A_x...>string data</A_x...>
    ...
</A>
<B>
    <B_x1 label="yzx" id="1236">string data</B_x1>
    <B_x2 label="zyx" id="1237">string data</B_x2>
    <B_x...>string data</B_x...>
    ...
</B>
<C>
    ...
</C>
...

The number of subnodes of A, or B, or ... is variable!

And sometimes there is a node A_x1 and sometimes there is not.

All I know for sure is that every leaf node has two attributes, where 'label' is of datatype string and 'id' is of datatype int.

I think this can be defined by:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:attribute name="label" type="xs:string"/>
  <xs:attribute name="id"    type="xs:short"/>
</xs:schema>

But how do I define a schema which states that every node named "A_x..." or "B_x..." or ... holds data of type string? Or that every leaf node holds data of type string?

I could not find a way to use regular expressions for node names in an XML schema. Is this even possible? If not, what is the solution to this problem, if there is one?

1 Answer

Unfortunately, even in XSD 1.1, the only way to allow elements that have any name is to permit them with an xs:any wildcard, and xs:any does not allow you to constrain the types of the elements that it matches.

You could define all the constraints with XSD 1.1 assertions:

every $e in child::* satisfies if exists($e/@id) then $e/@id castable as xs:short

but frankly, if you're doing this, then you're getting so little value out of XSD that you might as well use a different technology for your validation, e.g. XSLT or Schematron.
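For reference, here is a rough sketch of where such an assertion could sit in an XSD 1.1 schema. The type name groupType is made up for illustration, the declarations for A and B assume those are the group element names, and the extra checks (no child elements, label and id both present) are my reading of the question rather than something XSD requires. Any wrapper element around A, B, C is not shown in the question, so it is left out here too.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Needs an XSD 1.1 processor: xs:assert does not exist in XSD 1.0. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <!-- Hypothetical type shared by the group elements A, B, C, ... -->
  <xs:complexType name="groupType">
    <xs:sequence>
      <!-- Any number of children with arbitrary names.
           lax: a child is validated only if a declaration for it exists. -->
      <xs:any minOccurs="0" maxOccurs="unbounded" processContents="lax"/>
    </xs:sequence>
    <!-- Every child must be a leaf carrying a label and an id that fits xs:short. -->
    <xs:assert test="every $e in child::* satisfies
                     (empty($e/*) and exists($e/@label) and exists($e/@id)
                      and $e/@id castable as xs:short)"/>
  </xs:complexType>

  <xs:element name="A" type="groupType"/>
  <xs:element name="B" type="groupType"/>
</xs:schema>
```

Note that the wildcard does all the structural work here and the assertion does all the checking, which is exactly why XSD adds so little in this setup.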

Another possibility (often overlooked) is to validate using a pipeline that first transforms (using XSLT), and then validates (using XSD). In this case the transformation part would convert all the element names A_x1 into a standard element name, say AA.
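A minimal sketch of the transformation step, in XSLT 1.0. It assumes that the elements are in no namespace and that every leaf name has the form prefix, then "_x", then a suffix, with a non-empty prefix (A_x1, B_x2, ...). The renaming rule (doubling the prefix to get AA, BB) is just one way to arrive at the AA name used above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Leaf elements named like A_x1: rename to the doubled prefix (AA),
       keeping their attributes and text content unchanged. -->
  <xsl:template match="*[not(*)][contains(name(), '_x')]">
    <xsl:variable name="prefix" select="substring-before(name(), '_x')"/>
    <xsl:element name="{concat($prefix, $prefix)}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

The output then contains only the fixed names AA, BB, ..., so an ordinary XSD 1.0 schema can declare them with a string type plus the label and id attributes.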

A third possibility is to generate a schema for your specific instance document (again using XSLT) and then validate against that schema. In this case your generated schema could define all the A_x1 elements as members of the substitution group of some abstract element AA, and these elements would therefore be validated against the type defined for AA.
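To give an idea of what the generated schema might look like, here is a hand-written sketch for an instance that happens to contain A_x1 and A_x2. The type name leafType is invented for the example, and the declarations for A_x1 and A_x2 stand for whatever your generator emits for the names it finds. I have kept xs:short for id as in the question's schema; use xs:int if the ids can exceed 32767.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- String content plus the two attributes every leaf is said to carry. -->
  <xs:complexType name="leafType">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="label" type="xs:string" use="required"/>
        <xs:attribute name="id" type="xs:short" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <!-- Abstract head of the substitution group: never appears in an instance. -->
  <xs:element name="AA" type="leafType" abstract="true"/>

  <!-- Generated part: one declaration per leaf name found in the instance.
       With no type of their own, these take the type of AA. -->
  <xs:element name="A_x1" substitutionGroup="AA"/>
  <xs:element name="A_x2" substitutionGroup="AA"/>

  <!-- A allows any number of members of the AA substitution group. -->
  <xs:element name="A">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="AA" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Only the middle block changes from one instance document to the next; the type, the abstract head and the content model of A stay fixed.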
