I have a couple of huge XML files and I need to sort only some small portions of them. The output should be the same XML, but with the substructures sorted. Here is an example:

<testStructure>
<parentStruct>
    <firstpPreChild>some value here</firstpPreChild>
    <secondPreChild>some other value</secondPreChild>
    <thirdPreChild>third value here</thirdPreChild>
    <fourtPreChild>fourth value here</fourtPreChild>
    <struct id="5">
        <num>5</num>
    </struct>
    <struct id="4">
        <num>4</num>
    </struct>
    <struct id="1">
        <num>1</num>
    </struct>
    <struct id="2">
        <num>2</num>
    </struct>
    <struct id="3">
        <num>3</num>
    </struct>
     <firstAdditionalChild>some value here</firstAdditionalChild>
    <secondAdditionalChild>some other value</secondAdditionalChild>
    <thirdAdditionalChild>third value here</thirdAdditionalChild>
    <fourtAdditionalChild>fourth value here</fourtAdditionalChild>-->
</parentStruct>
<otherStruct>
    <firstChild>some value here</firstChild>
    <secondChild>some other value</secondChild>
    <thirdChild>third value here</thirdChild>
    <fourtChild>fourth value here</fourtChild>
</otherStruct>
</testStructure>

should be transformed to

<testStructure>
<parentStruct>
    <firstpPreChild>some value here</firstpPreChild>
    <secondPreChild>some other value</secondPreChild>
    <thirdPreChild>third value here</thirdPreChild>
    <fourtPreChild>fourth value here</fourtPreChild>
    <struct id="1">
        <num>1</num>
    </struct>
    <struct id="2">
        <num>2</num>
    </struct>
    <struct id="3">
        <num>3</num>
    </struct>
    <struct id="4">
        <num>4</num>
    </struct>
    <struct id="5">
        <num>5</num>
    </struct>
     <firstAdditionalChild>some value here</firstAdditionalChild>
    <secondAdditionalChild>some other value</secondAdditionalChild>
    <thirdAdditionalChild>third value here</thirdAdditionalChild>
    <fourtAdditionalChild>fourth value here</fourtAdditionalChild>-->
</parentStruct>
<otherStruct>
    <firstChild>some value here</firstChild>
    <secondChild>some other value</secondChild>
    <thirdChild>third value here</thirdChild>
    <fourtChild>fourth value here</fourtChild>
</otherStruct>
</testStructure>

Either num or @id can be used as the sort criterion. I've tried some variations like this:

<xsl:template match="node()|@*">
<xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>

which works, but shifts the sorted structure from its original position. Unfortunately I need the same structure as output.

Thanks in advance for the help!

  • What do you expect if the <struct> elements are not adjacent, but have some other child in between? Commented Feb 14, 2011 at 13:05

1 Answer

Grouping adjacent struct elements and then sorting each group, this XSLT 1.0 stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
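    <!-- Index only the struct elements that directly follow another struct,
         keyed by the first struct of their adjacent run; matching every
         struct would also file the first struct of a later run under the
         preceding run -->
    <xsl:key name="kStructByFirstPreceding"
             match="struct[preceding-sibling::*[1]/self::struct]"
             use="generate-id(
                     preceding-sibling::struct[
                        not(preceding-sibling::*[1]/self::struct)
                     ][1]
                  )"/>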
    <xsl:template match="node()|@*" name="identity">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="struct[not(preceding-sibling::*[1]/self::struct)]">
        <xsl:apply-templates select=".|key('kStructByFirstPreceding',
                                           generate-id())"
                             mode="copy">
            <!-- Numeric sort, so that id="10" sorts after id="2" -->
            <xsl:sort select="@id" data-type="number"/>
        </xsl:apply-templates>
    </xsl:template>
    <xsl:template match="struct"/>
    <xsl:template match="node()" mode="copy">
        <xsl:call-template name="identity"/>
    </xsl:template>
</xsl:stylesheet>

Output:

<testStructure>
    <parentStruct>
        <firstpPreChild>some value here</firstpPreChild>
        <secondPreChild>some other value</secondPreChild>
        <thirdPreChild>third value here</thirdPreChild>
        <fourtPreChild>fourth value here</fourtPreChild>
        <struct id="1">
            <num>1</num>
        </struct>
        <struct id="2">
            <num>2</num>
        </struct>
        <struct id="3">
            <num>3</num>
        </struct>
        <struct id="4">
            <num>4</num>
        </struct>
        <struct id="5">
            <num>5</num>
        </struct>
        <firstAdditionalChild>some value here</firstAdditionalChild>
        <secondAdditionalChild>some other value</secondAdditionalChild>
        <thirdAdditionalChild>third value here</thirdAdditionalChild>
        <fourtAdditionalChild>fourth value here</fourtAdditionalChild>--&gt; 
    </parentStruct>
    <otherStruct>
        <firstChild>some value here</firstChild>
        <secondChild>some other value</secondChild>
        <thirdChild>third value here</thirdChild>
        <fourtChild>fourth value here</fourtChild>
    </otherStruct>
</testStructure>
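
The question mentions that num could serve as the sort key instead of @id. A minimal sketch of that variant follows, assuming num always holds a number (hence data-type="number", which is my addition). In this sketch the key matches only struct elements that directly follow another struct, so that the first struct of a later run under the same parent is not pulled into an earlier run.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Non-leading structs, keyed by the struct that starts their adjacent run -->
    <xsl:key name="kStructByFirstPreceding"
             match="struct[preceding-sibling::*[1]/self::struct]"
             use="generate-id(
                     preceding-sibling::struct[
                        not(preceding-sibling::*[1]/self::struct)
                     ][1]
                  )"/>
    <!-- Identity rule: copy everything else unchanged -->
    <xsl:template match="node()|@*" name="identity">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <!-- First struct of a run: emit the whole run, sorted by the num child -->
    <xsl:template match="struct[not(preceding-sibling::*[1]/self::struct)]">
        <xsl:apply-templates select=".|key('kStructByFirstPreceding',
                                           generate-id())"
                             mode="copy">
            <!-- Assumption: num is numeric; drop data-type for a text sort -->
            <xsl:sort select="num" data-type="number"/>
        </xsl:apply-templates>
    </xsl:template>
    <!-- Other structs were already emitted by the run's first struct -->
    <xsl:template match="struct"/>
    <xsl:template match="node()" mode="copy">
        <xsl:call-template name="identity"/>
    </xsl:template>
</xsl:stylesheet>
```

For the sample input, where each num equals the corresponding @id, this should give the same order as sorting by @id.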

Simpler XSLT 2.0 solution:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="*[struct]">
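        <xsl:copy>
            <!-- xsl:copy alone does not copy attributes, so carry them over -->
            <xsl:apply-templates select="@*"/>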
            <xsl:for-each-group select="*"
                                group-adjacent="boolean(self::struct)">
                <xsl:apply-templates select="current-group()">
                    <xsl:sort select="@id[current-grouping-key()]"/>
                </xsl:apply-templates>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
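
Because the grouping above selects only element children (select="*"), any text or comment children of the parent are not copied. A hedged sketch of a variant that groups all child nodes instead: once text nodes take part in the grouping, whitespace-only text between struct elements would split the run, so this version strips it with xsl:strip-space. The xsl:output indent setting is optional, and processors differ in how they apply it.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Drop whitespace-only text nodes so they cannot break up a run of struct elements -->
    <xsl:strip-space elements="*"/>
    <!-- Ask for re-indented output; the exact layout is processor-dependent -->
    <xsl:output indent="yes"/>
    <!-- Identity rule: copy everything else unchanged -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="*[struct]">
        <xsl:copy>
            <!-- Attributes must be copied before any child nodes -->
            <xsl:apply-templates select="@*"/>
            <!-- Group all children, not only elements, so text and comments survive -->
            <xsl:for-each-group select="node()"
                                group-adjacent="boolean(self::struct)">
                <xsl:apply-templates select="current-group()">
                    <!-- The sort key is empty for non-struct groups, which keeps their order -->
                    <xsl:sort select="@id[current-grouping-key()]"/>
                </xsl:apply-templates>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

With the sample input, the stray text after the last additional child stays in the final non-struct group and is copied through rather than dropped.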

5 Comments

+1 for showing way to solve the problem. Alejandro, I have one question: I think the XSLT 2.0 solution with group-adjacent="boolean(self::struct)" will only work with an XSLT processor that automatically strips white space text nodes (like AltovaXML Tools does) or if you add <xsl:strip-space elements="*"/> to the code. In both cases I think you also need <xsl:output indent="yes"/> to get readable and indented output, as you posted. Have you deliberately ignored that to keep the samples short or are you working with some tool/editor which does the indenting automatically?
@Martin Honnen: You are right about stripping whitespace-only text nodes. I was using Altova; editing now. About indentation: I never pay attention to xsl:output/@indent because every processor does what it wants and because in production I would never use indentation.
@Alejandro. +1 for no indentation in production.
Hi, thanks for the great answer. I have another problem here. What happens if the struct structure exists elsewhere in the tree?
@user612834: The XSLT 1.0 answer will group at any hierarchy level. Fixing the XSLT 2.0 solution accordingly.
