I want to remove duplicate nodes using XSLT when all of their child values match exactly.

In this XML, the third trip node should be removed because it is an exact copy of the first.

<root> 
    <trips> 
      <trip> 
        <got_car>0</got_car> 
        <from>Stockholm, Sweden</from> 
        <to>Gothenburg, Sweden</to> 
        <when_iso>2010-12-06 00:00</when_iso> 
      </trip>
      <trip> 
        <got_car>0</got_car> 
        <from>Stockholm, Sweden</from> 
        <to>New york, USA</to> 
        <when_iso>2010-12-06 00:00</when_iso> 
      </trip>
      <trip> 
        <got_car>0</got_car> 
        <from>Stockholm, Sweden</from> 
        <to>Gothenburg, Sweden</to> 
        <when_iso>2010-12-06 00:00</when_iso> 
      </trip>
      <trip> 
        <got_car>1</got_car> 
        <from>Test, Duncan, NM 85534, USA</from> 
        <to>Test, Duncan, NM 85534, USA</to> 
        <when_iso>2010-12-06 00:00</when_iso> 
      </trip> 
    </trips> 
</root>

3 Answers

With a better design, this stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kTripByContent" match="trip"
             use="concat(got_car,'+',from,'+',to,'+',when_iso)"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="trip[generate-id() !=
                              generate-id(key('kTripByContent',
                                              concat(got_car,'+',
                                                     from,'+',
                                                     to,'+',
                                                     when_iso))[1])]"/>
</xsl:stylesheet>

Output:

<root>
    <trips>
        <trip>
            <got_car>0</got_car>
            <from>Stockholm, Sweden</from>
            <to>Gothenburg, Sweden</to>
            <when_iso>2010-12-06 00:00</when_iso>
        </trip>
        <trip>
            <got_car>0</got_car>
            <from>Stockholm, Sweden</from>
            <to>New york, USA</to>
            <when_iso>2010-12-06 00:00</when_iso>
        </trip>
        <trip>
            <got_car>1</got_car>
            <from>Test, Duncan, NM 85534, USA</from>
            <to>Test, Duncan, NM 85534, USA</to>
            <when_iso>2010-12-06 00:00</when_iso>
        </trip>
    </trips>
</root>

This code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>

<xsl:key name="trip-tth" match="/root/trips/trip" use="concat(got_car, '+', from, '+', to, '+', when_iso)"/>

<xsl:template match="root/trips">   
    <xsl:copy>
        <xsl:apply-templates select="trip[generate-id(.) = generate-id( key ('trip-tth', concat(got_car, '+', from, '+', to, '+', when_iso) ) )]"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="trip">
    <xsl:copy-of select="."/>
</xsl:template>

</xsl:stylesheet>

Will do the trick.

It relies on the fact that generate-id(), applied to the node-set returned by key(), yields the id of the first node (in document order) that matches the given key value. In our case, the key value is the concatenation of each trip's child element values.
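
The first-node behaviour can also be spelled out with an explicit [1], and the same predicate can be used in xsl:for-each instead of xsl:apply-templates. The following is only a sketch under the assumption that the input looks like the XML in the question; the key name kTrip is made up for illustration, and an identity template is added so that the root element is kept.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <!-- kTrip is a hypothetical key name; '+' separates the concatenated values -->
    <xsl:key name="kTrip" match="trip"
             use="concat(got_car, '+', from, '+', to, '+', when_iso)"/>

    <!-- Identity template: copies everything that is not handled below -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="trips">
        <xsl:copy>
            <!-- Keep a trip only if it is the first node in its key group -->
            <xsl:for-each select="trip[generate-id() =
                                       generate-id(key('kTrip',
                                                       concat(got_car, '+', from, '+',
                                                              to, '+', when_iso))[1])]">
                <xsl:copy-of select="."/>
            </xsl:for-each>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Here the [1] is redundant, since generate-id() already uses only the first node of a node-set, but it makes the intent visible to the reader.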

Comments

@Flack: In general, your answer isn't entirely correct. Whenever a concatenation is used as a key, it is necessary to include separator strings in it, so that concat('A','BC') and concat('AB','C') are not treated as the "same" value.
Actually, could you explain how I put the new data in a for-each-array?
@Dimitre Novatchev, thanks for the useful remark. I've edited my response for future readers.
@Kristoffer Nolgren, if I got your point right, you want to use for-each instead of apply-templates? Well, the XPath expression will be the same.
@Flack: This answer has value because of the key concatenation (once it follows @Dimitre's remarks...), but the rest of the stylesheet has many faults: an absolute path for the key/@match pattern, losing the root element, push-style template application followed only by an explicit xsl:copy...

If you are using XSLT 1.0, this answer may help: "How to remove duplicate XML nodes using XSLT". It is easier with XSLT 2.0, but that is not universally deployed.
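
For comparison, a minimal XSLT 2.0 sketch using xsl:for-each-group might look like the following. It assumes an XSLT 2.0 processor is available and that every trip has the four child elements shown in the question; it is an illustration, not a tested drop-in replacement.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <!-- Identity template: copies everything that is not handled below -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="trips">
        <xsl:copy>
            <!-- Group trips by their joined child values; '+' is the separator -->
            <xsl:for-each-group select="trip"
                                group-by="string-join((got_car, from, to, when_iso), '+')">
                <!-- Inside the group, the context item is its first trip -->
                <xsl:copy-of select="."/>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

No key declaration or generate-id() comparison is needed here, because grouping on the joined value keeps one trip per distinct combination.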
