My (simplified) input XML file contains the following:

<?xml version="1.0" encoding="UTF-8"?>
<main>
    <DATA_RECORD>
        <MESSAGE>&#60;pd&#62;&#10;    &#60;cdhead version&#61;&#34;13&#34;/&#62;&#10;&#60;/pd&#62;</MESSAGE>
    </DATA_RECORD>
</main>

The MESSAGE element value is a character-escaped XML instance. It represents the following XML:

<pd>
    <cdhead version="13"/>
</pd>

I would like to apply an XSL transformation to the input XML, parse the MESSAGE contents into a variable, and use XPath expressions to access its details.
I tried adding a JScript function as shown below, but the object returned by the script is apparently of an incorrect DOM subclass (see the result underneath). For completeness, I added an extra function that returns the DOM contents as a string.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ms="urn:schemas-microsoft-com:xslt"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="ms my">

    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <ms:script language="JScript" implements-prefix="my">
        <![CDATA[
        function parseToDOM (input) {
            var doc = new ActiveXObject('Msxml2.DOMDocument.6.0');
            doc.loadXML (input);
            return doc.documentElement;
        };
        function parseToXMLString (input) {
            var doc = new ActiveXObject('Msxml2.DOMDocument.6.0');
            doc.loadXML (input);
            return doc.documentElement.xml;
        };
        ]]>
    </ms:script>

    <xsl:template match="/">
        <root>
            <xsl:apply-templates/>
        </root>
    </xsl:template>

    <xsl:template match="DATA_RECORD">
            <xsl:variable name="DOM"><xsl:copy-of select="my:parseToDOM (MESSAGE)"/></xsl:variable>
            <xsl:variable name="XML"><xsl:copy-of select="my:parseToXMLString (MESSAGE)"/></xsl:variable>

            <msg1><xsl:value-of select="$XML"/></msg1>
            <msg2><xsl:value-of select="$XML" disable-output-escaping="yes"/></msg2>
            <dom><xsl:copy-of select="$DOM"/></dom>
            <version><xsl:value-of select="$DOM/pd/cdhead/@version"/></version>
    </xsl:template>

    <xsl:template match="text()"/>
</xsl:stylesheet>

Result:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <msg1>&lt;pd&gt;
    &lt;cdhead version="13"/&gt;
&lt;/pd&gt;</msg1>
    <msg2><pd>
    <cdhead version="13"/>
</pd></msg2>
    <dom/>
    <version></version>
</root>

How can I make the JScript function return a result that allows the use of XPath?
By the way, is there an XSLT 1.0 function that parses the escaped XML string into a result that allows the use of XPath?

ADDITION

I have been trying some variations and got closer to a solution. First, Altova XMLSpy allows choosing the XSLT processor, and the result above was produced by the built-in one. I actually need MSXML 6.0, and when I chose that processor, errors occurred because I had to parse input.text instead. Even then, I only managed to use XPath expressions on the result after doing extra work in the JScript code. It transpired that while &#60; and the like are parsed into &lt; etcetera, this is not enough to arrive at the proper DOM result. So I resorted to unescaping the input string first.
But I hit another snag: the version below works fine with the literal, but it fails when I use input.text instead.

The XSLT is shown below.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ms="urn:schemas-microsoft-com:xslt"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="ms my">

    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <ms:script language="JScript" implements-prefix="my">
        <![CDATA[
        function parseToDOM (input) {
            var doc = new ActiveXObject('Msxml2.DOMDocument.6.0');
            doc.loadXML (unescapeXML ('&#60;pd&#62;&#10;    &#60;cdhead version&#61;&#34;13&#34;/&#62;&#10;&#60;/pd&#62;'));
            //doc.loadXML (unescapeXML (input.text));
            return doc;
        };
        function unescapeXML (str) {
            var ostr = str;
            ostr = ostr.replace (/&#34;/g, '"');
            ostr = ostr.replace (/&#60;/g, '<');
            ostr = ostr.replace (/&#61;/g, '=');
            ostr = ostr.replace (/&#62;/g, '>');
            return ostr;
        };
        ]]>
    </ms:script>

    <xsl:template match="/">
        <root>
            <xsl:apply-templates/>
        </root>
    </xsl:template>

    <xsl:template match="DATA_RECORD">
        <xsl:variable name="msg" select="my:parseToDOM (MESSAGE)"/>
        <tst><xsl:value-of select="$msg/pd/cdhead/@version"/></tst>
   </xsl:template>

</xsl:stylesheet>

This now results in

<?xml version="1.0" encoding="UTF-8"?>
<root>
<tst>13</tst>
</root>

Which is exactly what I want.
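
A note on why the literal needs unescapeXML at all: inside the CDATA section of the script block, &#60; and the like are plain characters, whereas in the MESSAGE element the XML parser resolves them when the input file is read. The four replace calls also cover only the references that occur in this sample. As a rough sketch (decodeCharRefs is a hypothetical name, not part of the stylesheet above), a single regular expression could decode any decimal or hexadecimal numeric character reference:

```javascript
// Decode decimal (&#60;) and hexadecimal (&#x3C;) numeric character references.
// Written in plain ES3 so that it could also be pasted into a JScript block.
// Limitation: String.fromCharCode only covers code points up to 0xFFFF.
function decodeCharRefs(str) {
    return str.replace(/&#(x[0-9a-fA-F]+|[0-9]+);/g, function (match, ref) {
        var code = ref.charAt(0) === 'x'
            ? parseInt(ref.substring(1), 16)   // hexadecimal form
            : parseInt(ref, 10);               // decimal form
        return String.fromCharCode(code);
    });
}

// The literal used in parseToDOM above.
var escaped = '&#60;pd&#62;&#10;    &#60;cdhead version&#61;&#34;13&#34;/&#62;&#10;&#60;/pd&#62;';
var xml = decodeCharRefs(escaped);
```

Here xml holds the three-line pd document shown at the top of the question, so it could be handed to loadXML directly.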

But as remarked above, when I comment out the parsing of the literal and use the input instead, like so:

//doc.loadXML (unescapeXML ('&#60;pd&#62;&#10;    &#60;cdhead version&#61;&#34;13&#34;/&#62;&#10;&#60;/pd&#62;'));
doc.loadXML (unescapeXML (input.text));

I get the following error (in Altova XMLSpy with MSXML 6.0 as the XSLT processor):

XSL transformation failed due to following error:

Microsoft JScript runtime error
'undefined' is null or not an object
line = 10, col = 3 (line is offset from the start of the script block).
Error returned from property or method call.

This points at the first replace statement in the JScript code.

And also, IE9 cannot process the following properly:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="test.xslt"?>
<main>
  <DATA_RECORD>
    <MESSAGE>&#60;pd&#62;&#10;    &#60;cdhead version&#61;&#34;13&#34;/&#62;&#10;&#60;/pd&#62;</MESSAGE>
  </DATA_RECORD>
 </main>

When I open this file in IE9 (where test.xslt is the version of the transformation that ignores the input and processes a literal instead, i.e. the one that works in XMLSpy), I get a processing error:

XML5001: Applying Integrated XSLT Handling. 
XSLT8690: XSLT processing failed. 

Why does all this happen, and how can I correct it?

  • Where are you trying to use this? Is it in .NET code, or in IE? Commented Feb 14, 2013 at 8:54
  • It's an XMLSpy transformation, to be run manually or referenced as an XSLT from the XML file so that the XML file can be opened in IE9. The input comes from a saved Toad query result in which one of the queried columns contains an XML string. Commented Feb 14, 2013 at 8:57

1 Answer

Starting from the ADDITION above, I reached a solution by fine-tuning it a little.
To pass plain input to the script instead of using input.text, the XSLT has to convert the element to a string with the XPath string() function (I thought it already was a string, but apparently that is not the case). With that change, the replace statements are no longer necessary, because the XML parser has already resolved the character references in MESSAGE by the time string() is applied.
Thus

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ms="urn:schemas-microsoft-com:xslt"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="ms my">

    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <ms:script language="JScript" implements-prefix="my">
        <![CDATA[
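        // input arrives as a plain string because the caller wraps MESSAGE in string().
        // Returning the whole document lets XPath start at /pd in the result.
        function parseToDOM (input) {
            var doc = new ActiveXObject('Msxml2.DOMDocument.6.0');
            doc.loadXML (input);
            return doc;
        };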
        ]]>
    </ms:script>

    <xsl:template match="/">
        <root>
            <xsl:apply-templates/>
        </root>
    </xsl:template>

    <xsl:template match="DATA_RECORD">
        <xsl:variable name="msg" select="my:parseToDOM (string(MESSAGE))"/>
        <tst><xsl:value-of select="$msg/pd/cdhead/@version"/></tst>
   </xsl:template>

</xsl:stylesheet>

works. When applied to

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="test.xslt"?>
<main>
  <DATA_RECORD>
    <MESSAGE>&#60;pd&#62;&#10;    &#60;cdhead version&#61;&#34;13&#34;/&#62;&#10;&#60;/pd&#62;</MESSAGE>
  </DATA_RECORD>
 </main>

the result is

<?xml version="1.0" encoding="UTF-8"?>
<root>
<tst>13</tst>
</root>

Unfortunately, IE9 still failed to load the XML with the referenced XSLT, and I discovered why.
I had to tick the box under Internet Options/Advanced/Security/Allow active content to run in files on My Computer and then restart IE; after that, IE9 processes the file correctly. Of course, because the result is not HTML, it can only be viewed in the F12/Script tab, but this was just an example and I will incorporate it in an XSLT that generates proper HTML.
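
As a possible refinement, the script could accept either form of argument and report malformed input. The sketch below rests on an assumption I have not verified beyond the error seen in the question: that MSXML hands a node-set argument to the script as a node list (with length and item()) rather than as a single node, which would explain why input.text was undefined. argToString is a hypothetical helper name, and the parseError check is an addition that is not in the stylesheet above.

```javascript
// Sketch for the body of <ms:script language="JScript">; plain ES3 only.

// Return the string value of an XSLT argument.
// Assumption: a node-set arrives as a node list exposing length and item(),
// while string(...) on the XSLT side delivers a plain string.
function argToString(input) {
    if (typeof input === 'string') {
        return input;
    }
    if (input && typeof input.length === 'number' && input.length > 0) {
        return input.item(0).text;   // text of the first selected node
    }
    return '';
}

// Parse the escaped XML and fail loudly if it is not well-formed.
// ActiveXObject only exists in Windows script hosts (MSXML, IE).
function parseToDOM(input) {
    var doc = new ActiveXObject('Msxml2.DOMDocument.6.0');
    doc.async = false;
    if (!doc.loadXML(argToString(input))) {
        throw new Error('MESSAGE is not well-formed XML: ' + doc.parseError.reason);
    }
    return doc;
}
```

With this variant, both my:parseToDOM (MESSAGE) and my:parseToDOM (string(MESSAGE)) should be usable from the template, provided the assumption about node lists holds for the processor in use.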
