I have this input XML, which needs to be transformed with XSLT:

<root>
    <node id="a">
        <section id="a_1" method="run">
            <item id="0">
                <attribute>
                    <color>Red</color>
                </attribute>
            </item>
        </section>
        <section id="a_2">
            <item id="0">
                <attribute>
                    <color>Red</color>
                </attribute>
            </item>
        </section>
        <section id="a_1" method="run">
            <item id="0">
                <attribute>
                    <color>Red</color>
                </attribute>
            </item>
        </section>
    </node>
    <node id="b">
        <section id="b_1" method="create">
            <user id="b_1a">
                <attribute>
                    <name>John</name>
                </attribute>
            </user>
            <user id="b_1b">
                <attribute>a</attribute>
            </user>
        </section>
        <section id="b_1" method="create">
            <user id="b_1c">
                <attribute>a</attribute>
            </user>
        </section>
        <section id="b_2">
            <user id="b_1a">
                <attribute>
                    <name>John</name>
                </attribute>
            </user>
        </section>
    </node>
</root>

Expected output:

<root>
    <node id="a">
        <section id="a_1">
            <item id="0">
                <attribute>
                    <color>Red</color>
                </attribute>
            </item>
        </section>
        <section id="a_2">
            <item id="0">
                <attribute>
                    <color>Red</color>
                </attribute>
            </item>
        </section>
    </node>
    <node id="b">
        <section id="b_1" method="create">
            <user id="b_1a">
                <attribute>
                    <name>John</name>
                </attribute>
            </user>
            <user id="b_1b">
                <attribute>a</attribute>
            </user>
        </section>

        <section id="b_2">
            <user id="b_1a">
                <attribute>
                    <name>John</name>
                </attribute>
            </user>
        </section>
    </node>
</root>

It does not matter which of the duplicates is eliminated: whenever two elements have the same element name, id and method, one of them should be removed. Any idea what the XSL would look like?

Note: the element name can be anything; it doesn't have to be <section>, and there is more than one element name in the whole file. As long as elements share the same element name, id and attribute (e.g. method="create"), one of them should be eliminated.

Thanks very much. cheers, John

2 Answers

I. Here is a short and efficient (using keys) XSLT 1.0 transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kElemWithAttribs" match="*[@id and @method]"
  use="concat(name(), '+', @id, '+', @method)"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match=
  "*[@id and @method
    and
     not(generate-id()
        =
         generate-id(key('kElemWithAttribs',
                         concat(name(), '+', @id, '+', @method)
                         )[1]
                    )
         )
     ]"/>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<root>
    <node id="a">
        <section id="a_1" method="run">
            <item id="0">
                <attribute>
                    <color>Red</color>
                </attribute>
            </item>
        </section>
        <section id="a_2">
            <item id="0">
                <attribute>
                    <color>Red</color>
                </attribute>
            </item>
        </section>
        <section id="a_1" method="run">
            <item id="0">
                <attribute>
                    <color>Red</color>
                </attribute>
            </item>
        </section>
    </node>
    <node id="b">
        <section id="b_1" method="create">
            <user id="b_1a">
                <attribute>
                    <name>John</name>
                </attribute>
            </user>
            <user id="b_1b">
                <attribute>a</attribute>
            </user>
        </section>
        <section id="b_1" method="create">
            <user id="b_1c">
                <attribute>a</attribute>
            </user>
        </section>
        <section id="b_2">
            <user id="b_1a">
                <attribute>
                    <name>John</name>
                </attribute>
            </user>
        </section>
    </node>
</root>

the wanted, correct result is produced:

<root>
   <node id="a">
      <section id="a_1" method="run">
         <item id="0">
            <attribute>
               <color>Red</color>
            </attribute>
         </item>
      </section>
      <section id="a_2">
         <item id="0">
            <attribute>
               <color>Red</color>
            </attribute>
         </item>
      </section>
   </node>
   <node id="b">
      <section id="b_1" method="create">
         <user id="b_1a">
            <attribute>
               <name>John</name>
            </attribute>
         </user>
         <user id="b_1b">
            <attribute>a</attribute>
         </user>
      </section>
      <section id="b_2">
         <user id="b_1a">
            <attribute>
               <name>John</name>
            </attribute>
         </user>
      </section>
   </node>
</root>

Explanation:

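This uses the Muenchian grouping method with a composite key built from the element name, @id and @method. The empty template matches, and thereby deletes, every element that isn't the first in its group. Elements that lack either attribute never enter the key and are copied unchanged by the identity template.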
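
If duplicates should be removed only among children of the same parent, one possible variation is to add the parent's generated id to the composite key. This is a sketch and not part of the original answer; the key name kElemByParent is made up for the illustration.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Hypothetical key: same composite value as before,
      prefixed with the id of the parent node -->
 <xsl:key name="kElemByParent" match="*[@id and @method]"
  use="concat(generate-id(..), '+', name(), '+', @id, '+', @method)"/>

 <!-- Identity rule: copies every node that is not deleted below -->
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <!-- Delete an element only if it is not the first one
      with this name, @id and @method under the same parent -->
 <xsl:template match=
  "*[@id and @method
    and
     not(generate-id()
        =
         generate-id(key('kElemByParent',
                         concat(generate-id(..), '+', name(),
                                '+', @id, '+', @method)
                         )[1]
                    )
         )
     ]"/>
</xsl:stylesheet>
```

With this key, two elements that share name, @id and @method but sit under different parents fall into different groups, so both are kept.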


II. XSLT 2.0 solution, even shorter and no less efficient:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="node">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>

    <xsl:for-each-group select="*"
         group-by="concat(name(), '+', @id, '+', @method)">
      <xsl:apply-templates select="."/>
    </xsl:for-each-group>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

Explanation:

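This uses xsl:for-each-group with a group-by attribute: the child elements of each node element are grouped by the same composite key, and only the first element of each group is processed. Note that, as written, the grouping applies only to children of node elements.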
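
A possible generalization, offered as a sketch under assumptions: the variant below applies the same grouping under any parent element rather than only under node. It also gives every child that lacks @id or @method its own one-member group via generate-id(), so such children are always kept. That last choice is an assumption about the desired behavior, not something stated in the original answer.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Identity rule: copies everything not handled below -->
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <!-- Any element with at least one child carrying both attributes -->
 <xsl:template match="*[*[@id and @method]]">
   <xsl:copy>
     <xsl:apply-templates select="@*"/>

     <!-- Children with both attributes share a composite key;
          every other child node gets a unique key and is kept -->
     <xsl:for-each-group select="node()"
          group-by="if (self::*[@id and @method])
                    then concat(name(), '+', @id, '+', @method)
                    else generate-id()">
       <!-- The context item is the first node of the current group -->
       <xsl:apply-templates select="."/>
     </xsl:for-each-group>
   </xsl:copy>
 </xsl:template>
</xsl:stylesheet>
```

Groups are processed in order of first appearance, so the surviving children keep their original document order.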

4 Comments

Just one more thing: how can this code be changed if duplicates should only be removed when they are under the same parent (section id)? Thanks.
@John: Could you, please, ask a new question -- with a source XML document and a wanted result, and explain the new requirement? Then notify me and I would be glad to answer it.
I will post in a few minutes and notify you. Thanks for your time.
Here is the question: stackoverflow.com/questions/10274305/… I apologize if this is very complicated; I really have no idea. Thanks once again.

The XSL file:

<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
    <xsl:apply-templates/>
</xsl:template>

<xsl:template match="root">
    <root>
        <xsl:apply-templates/>
    </root>
</xsl:template>

<xsl:template match="node">
    <node>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
    </node>
</xsl:template>

<xsl:template match="section[@id = 'b_1'][1]">  
    <section>
        <xsl:copy-of select="node()|@*"/>
    </section>  
</xsl:template>

<xsl:template match="section[@id != 'b_1']">
    <section>       
        <xsl:copy-of select="node()|@*"/>
    </section>
</xsl:template>

<xsl:template match="section[@id = 'b_1'][position() &gt; 1]"/> 

</xsl:stylesheet>

The result of the transformation:

<root>
        <node id="a">
            <section id="a_1">
                <item id="0">
                    <attribute>
                        <color>Red</color>
                    </attribute>
                </item>
            </section>
            <section id="a_2">
                <item id="0">
                    <attribute>
                        <color>Red</color>
                    </attribute>
                </item>
            </section>
        </node>
        <node id="b">
            <section id="b_1" method="create">
                <user id="b_1a">
                    <attribute>
                        <name>John</name>
                    </attribute>
                </user>
                <user id="b_1b">
                    <attribute>a</attribute>
                </user>
            </section>
            <section id="b_2">
                <user id="b_1a">
                    <attribute>
                        <name>John</name>
                    </attribute>
                </user>
            </section>
        </node>
    </root>

Hope this helps.

[EDIT] Try this XSL on the same input file:

<?xml version='1.0'?>
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:template match="/">
        <xsl:apply-templates/>
    </xsl:template>

    <xsl:template match="*[not(@id eq preceding::*[local-name() eq local-name(.)]/@id)]">
        <xsl:element name="{name()}">
            <xsl:copy-of select="node()|@*"/>
        </xsl:element>
    </xsl:template>

</xsl:stylesheet>

and the result:

<root>
  <node id="a">
    <section id="a_1">
      <item id="0">
        <attribute>
          <color>Red</color>
        </attribute>
      </item>
    </section>
    <section id="a_2">
      <item id="0">
        <attribute>
          <color>Red</color>
        </attribute>
      </item>
    </section>
  </node>
  <node id="b">
    <section id="b_1" method="create">
      <user id="b_1a">
        <attribute>
          <name>John</name>
        </attribute>
      </user>
      <user id="b_1b">
        <attribute>a</attribute>
      </user>
    </section>
    <section id="b_1" method="create">
      <user id="b_1c">
        <attribute>a</attribute>
      </user>
    </section>
    <section id="b_2">
      <user id="b_1a">
        <attribute>
          <name>John</name>
        </attribute>
      </user>
    </section>
  </node>
</root>

[/EDIT]

5 Comments

Thanks for the response, but as I mentioned in the question, we need to remove only one of the duplicated nodes, not both. So in that case we still need to keep <section id="b_1" method="create"> and the corresponding children. Do you mind updating the solution? Thanks a lot.
Thanks @Cylian! Really appreciate that. Just one more thing: is it possible to generalize it? I mean it can be any element, it does not have to be <section>; as long as the element name, id and attributes (method=create) are the same, one of them will be eliminated.
Which XSLT version are you using?
See this post and this
It's different: that one did not have attributes.
