I'm trying to perform XSLT 1.0 transformations on a set of XML files, exported and transformed elsewhere, that contain duplicate nodes. I'm able to remove identical duplicate nodes, but not those with different values or attributes. What I'm trying to achieve is to retain only the second set of error nodes. Any help in understanding where I'm going wrong is appreciated!

A set of XML files have data like this:

<row xmlns="http://www.example.com/abc/xyz" xmlns:dg="http://www.example.com/abc/def" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <data>
    <status>Y</status>
    <product>48530</product>
    <id>12312343</id>
    <error xmlns="">true</error>
    <errorReason xmlns="">Detailed error message</errorReason>
    <error xmlns="">true</error>
    <errorReason xmlns="">Detailed error message</errorReason>
  </data>
</row>

When using the following XSL, the duplicates are removed:

<xsl:stylesheet version="1.0" exclude-result-prefixes="xsi d dg" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:d="http://www.example.com/abc/xyz" 
xmlns:dg="http://www.example.com/abc/def" >
<xsl:output omit-xml-declaration="yes" method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="comment()"/>

  <!-- Drill down into the export XML and extract only the main table row data -->
  <xsl:template match="d:row">
    <xsl:apply-templates select="d:data"/>
  </xsl:template>

  <xsl:template match="error[preceding::error]"/>
  <xsl:template match="errorReason[preceding::errorReason]"/>

</xsl:stylesheet>

However, when I try the same XSL for a set of XML files with data like this:

<row xmlns="http://www.example.com/abc/xyz" xmlns:dg="http://www.example.com/abc/def" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <data>
    <status>Y</status>
    <product>130160072014</product>
    <dob>11/11/1911</dob>
    <id>12312312</id>
    <error>false</error>
    <errorReason />
    <error xmlns="">true</error>
    <errorReason xmlns="">Detailed error message</errorReason>
  </data>
</row>

nothing's happening.

I suspect the empty xmlns may be the cause, but I'm not sure.

  • Please post minimal but complete snippets of XML and XSLT allowing us to reproduce the problem. Are there any namespace declarations on any ancestor or parent elements you have not shown? What does the rest of the XSLT do? Commented Nov 15, 2015 at 18:42
  • In the test utilities-online.info/xsltransformation/… the duplicate error and errorReason elements are removed by the XSLT code using your two templates plus the identity transformation template. Commented Nov 15, 2015 at 19:17
  • What is your desired result? Which error node do you want to keep? The first in order? Commented Nov 15, 2015 at 19:39
  • We can only tell you where you are going wrong if you tell us where you are going: post some code. Commented Nov 15, 2015 at 21:59
  • Apologies @MartinHonnen et al., I posted the question in a late-night/early-morning, sleep- and coffee-deprived state. I've updated the code samples and clarified the issue/resolution. Hope this is better. Commented Nov 16, 2015 at 0:57

1 Answer

This is because of namespaces. An xmlns attribute is a namespace declaration. In your first XML, the error and errorReason elements all have xmlns="" declared, which means they are all in no namespace.

However, in your second XML you do this:

<error>false</error>
<errorReason />
<error xmlns="">true</error>
<errorReason xmlns="">Detailed error message</errorReason>

The first error and errorReason don't have an explicit xmlns on them, which means they are in the default namespace that was declared on the row element:

 <row xmlns="http://www.example.com/abc/xyz" 

The declaration applies not just to the row element, but to its descendants as well, unless overridden.

This means the first error and errorReason are in a different namespace to the other two (which aren't actually in a namespace), and so they are effectively different. They are not matched by your XSLT template, as the template is only matching the elements in no namespace.
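
To illustrate the point, here is a minimal sketch of a stylesheet that names the namespaced pair explicitly. It assumes the prefix d is bound to http://www.example.com/abc/xyz, as in the question's stylesheet, and that the namespaced pair is the one to drop. It leaves out the question's d:row template for brevity.

```xml
<xsl:stylesheet version="1.0" exclude-result-prefixes="d"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:d="http://www.example.com/abc/xyz">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity transform: copy everything not matched by a more specific template -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop the pair that inherits the default namespace from row
       (assumption: this is the pair that should go) -->
  <xsl:template match="d:error | d:errorReason"/>

  <!-- The question's templates: these only ever match elements in no namespace -->
  <xsl:template match="error[preceding::error]"/>
  <xsl:template match="errorReason[preceding::errorReason]"/>
</xsl:stylesheet>
```

Applied to the second XML, this should leave only the error and errorReason elements that carry xmlns="", because an unprefixed name in an XSLT 1.0 match pattern always refers to no namespace, while d:error refers to the namespace bound to d.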

You haven't said which pair of elements you wish to retain: the ones in a namespace, or the ones without. However, if you really do want to remove "duplicates" regardless of namespace, you could use these two templates, which ignore namespaces altogether (and so will retain the first elements, which are the ones in the namespace in your case).

<xsl:template match="*[local-name() = 'error'][preceding::*[local-name() = 'error']]"/>
<xsl:template match="*[local-name() = 'errorReason'][preceding::*[local-name() = 'errorReason']]"/>
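
If the goal is instead to keep the last pair, as the question states, one possible variation is to test for a later sibling rather than an earlier element. This is only a sketch: it assumes the duplicates are always siblings under the same parent, and it is meant to replace (not sit alongside) the two preceding:: templates in the question's stylesheet, since having both sets would remove every copy.

```xml
<!-- Drop any error element, in any namespace, that has a later error sibling -->
<xsl:template match="*[local-name() = 'error'][following-sibling::*[local-name() = 'error']]"/>

<!-- Same for errorReason: only the last one under each parent survives -->
<xsl:template match="*[local-name() = 'errorReason'][following-sibling::*[local-name() = 'errorReason']]"/>
```

Using following-sibling:: rather than following:: limits the comparison to each parent element, so a file containing several data elements would keep one pair per data element.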

1 Comment

Thank you Tim. Your answer is very helpful. Once I added in the namespace reference, the template worked.
