How to output duplicate elements using XSLT?

Question

I have xml which looks something like this -

<Root>
  <Fields>
    <Field name="abc" displayName="aaa" />
    <Field name="pqr" displayName="ppp" />
    <Field name="abc" displayName="aaa" />
    <Field name="xyz" displayName="zzz" />
  </Fields>
</Root>

I want the output to contain only those elements which have a repeating name-displayName combination, if there are any -

<Root>
      <Fields>
        <Field name="abc" displayName="aaa" />
        <Field name="abc" displayName="aaa" />
      </Fields>
</Root>

How can I do this using XSLT?

Good question, +1. See my answer for a short, easy and efficient XSLT 1.0 solution.
– Dimitre Novatchev, commented May 9, 2011 at 13:19

Community · Accepted Answer · 2017-05-23 11:45:33Z

I. XSLT 1.0 solution:

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kFieldByName" match="Field"
  use="concat(@name, '+', @displayName)"/>

 <xsl:template match=
  "Field[generate-id()
        =
         generate-id(key('kFieldByName',
                     concat(@name, '+', @displayName)
                     )[2])
        ]
  ">
     <xsl:copy-of select=
     "key('kFieldByName',concat(@name, '+', @displayName))"/>
 </xsl:template>
</xsl:stylesheet>

when applied to the provided XML document:

<Root>
    <Fields>
        <Field name="abc" displayName="aaa" />
        <Field name="pqr" displayName="ppp" />
        <Field name="abc" displayName="aaa" />
        <Field name="xyz" displayName="zzz" />
    </Fields>
</Root>

produces the wanted result:

<Field name="abc" displayName="aaa"/>
<Field name="abc" displayName="aaa"/>

Explanation:

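The solution uses Muenchian grouping with a composite key built from the name and displayName attributes.
The only template in the code matches any Field element that is the second in its corresponding group. A group with a single member has no second element, so unique Field elements never match. Inside the body of the template, the whole group is output, exactly once per group. Root, Fields and the remaining Field elements are handled by the built-in templates, which emit no markup.
Muenchian grouping is the efficient way to do grouping in XSLT 1.0, because a key lookup avoids rescanning the document for every Field element.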
See also my answer to this question.
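
The transformation above emits only the Field elements, while the question shows them still wrapped in Root and Fields. A minimal sketch of one way to keep the wrappers, assuming it is acceptable to combine the same key with an identity rule (this variant is not part of the original answer):

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Same composite key as in the solution above -->
 <xsl:key name="kFieldByName" match="Field"
  use="concat(@name, '+', @displayName)"/>

 <!-- Identity rule: copies every node and attribute unchanged -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Override: a Field whose group has no second member is unique,
      so this empty template drops it from the output -->
 <xsl:template match=
  "Field[not(key('kFieldByName',
                 concat(@name, '+', @displayName))[2])]"/>
</xsl:stylesheet>
```

Unlike the original template, this sketch leaves the duplicates in document order instead of emitting them group by group; for the sample document the two orders coincide. Both versions also assume that '+' does not occur in the attribute values, since name="a+" with displayName="b" and name="a" with displayName="+b" would produce the same key.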

II. XSLT 2.0 solution:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
     <xsl:for-each-group select="/*/*/Field"
          group-by="concat(@name, '+', @displayName)">
       <xsl:sequence select="current-group()[current-group()[2]]"/>
   </xsl:for-each-group>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document (shown above), the same wanted result is produced:

<Field name="abc" displayName="aaa"/>
<Field name="abc" displayName="aaa"/>

Explanation:

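Use of <xsl:for-each-group> with a group-by key built from the name-displayName combination.
Use of the current-group() function: the predicate [current-group()[2]] is true only when the group has a second member, so a group is output only if it contains at least two Field elements.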
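
The same test can be spelled out with count(), which some readers may find easier to follow than the nested predicate. The sketch below is an illustrative variant, not the answer's own code, and it assumes the wrapper elements can simply be written as literal Root and Fields elements:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
   <!-- Literal wrappers: an assumption based on the output requested in the question -->
   <Root>
     <Fields>
       <xsl:for-each-group select="/*/*/Field"
            group-by="concat(@name, '+', @displayName)">
         <!-- Emit the group only when it has at least two members -->
         <xsl:if test="count(current-group()) gt 1">
           <xsl:sequence select="current-group()"/>
         </xsl:if>
       </xsl:for-each-group>
     </Fields>
   </Root>
 </xsl:template>
</xsl:stylesheet>
```

Using the gt operator instead of > avoids any question about escaping inside the attribute value.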

Jeff Yates · Accepted Answer · 2011-05-09 13:00:18Z

To find duplicates, iterate over the Field elements and, for each one, look for the set of Field elements in the whole document that have matching name and displayName attribute values. If the set has more than one element, copy the current element to the output.

Here is an example of a template that achieves this:

<xsl:template match="Field">
    <xsl:variable name="fieldName" select="@name" />
    <xsl:variable name="fieldDisplayName" select="@displayName" />
    <xsl:if test="count(//Field[@name=$fieldName and @displayName=$fieldDisplayName]) > 1">
        <xsl:copy-of select="."/>
    </xsl:if>
</xsl:template>

Executing this template (wrapped in an appropriate XSLT file) on your sample data gives the following output:

<?xml version="1.0" encoding="utf-8"?>
<Root>
  <Fields>
    <Field name="abc" displayName="aaa" />
    <Field name="abc" displayName="aaa" />
  </Fields>
</Root>
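
The "appropriate XSLT file" is not shown in the answer. A minimal sketch of what it might look like, assuming an identity rule is used to copy Root and Fields (the output settings are also assumptions chosen to resemble the output above):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="utf-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Identity rule (assumed wrapper): copies Root, Fields and attributes -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- The template from the answer: it takes precedence over the identity
         rule for Field elements and copies only those with a duplicate -->
    <xsl:template match="Field">
        <xsl:variable name="fieldName" select="@name" />
        <xsl:variable name="fieldDisplayName" select="@displayName" />
        <xsl:if test="count(//Field[@name=$fieldName and @displayName=$fieldDisplayName]) &gt; 1">
            <xsl:copy-of select="."/>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>
```

The more specific match="Field" pattern has a higher default priority than the node() pattern of the identity rule, so no explicit priority attribute is needed.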

6 Comments

Dimitre Novatchev Over a year ago

@Jeff Yates: This is one possible solution, however its efficiency is O(N^2) and it is too slow to be used on XML documents with a large number of Field elements. See my answer for an efficient solution.

Jeff Yates Over a year ago

@Dimitre: Seems silly to do more effort than necessary. There is no reason to believe the real XML would be huge and there is no profiling information. I'd go for quick to write over quick to run any day until the profiling is in.

Dimitre Novatchev Over a year ago

@Jeff Yates: One can and should use the most efficient known solutions. Because people think otherwise, we encounter problems every day with a transformation that runs for 40 minutes and, once refactored with Muenchian grouping, takes only 2 seconds. We should not propagate bad and naive algorithms.

Jeff Yates Over a year ago

@Dimitre: You are right, although one should also consider the cost of implementation and maintenance when optimizing up front.

Michael Kay Over a year ago

While the efficiency might be O(N^2) on many XSLT processors, it might be much better on an optimizing processor - try it on Saxon-EE. However, I agree it's best not to place too heavy a reliance on the optimizer - use xsl:for-each-group.
