I am trying to understand how XSLT can iterate over multiple elements and output data from those elements and from their older and younger siblings. See the example below:

<a>
    <b name="first">
        <c value="C Value">C Text</c>
        <d>D1 Text</d>
        <d>D2 Text</d>
        <e>E Text</e>
    </b>
    <b name="second">
        <c value="C Value">C Text</c>
        <d>D1 Text</d>
        <d>D2 Text</d>
        <d>D3 Text</d>
        <e>E Text</e>
    </b>
    <b name="third">
        <c value="C Value">C Text</c>
        <e>E Text</e>
    </b>
</a>

I would like the output to be like the following (assuming for simplicity that each element's text will not contain commas).

first,C Value,C Text,D1 Text,E Text
first,C Value,C Text,D2 Text,E Text
second,C Value,C Text,D1 Text,E Text
second,C Value,C Text,D2 Text,E Text
second,C Value,C Text,D3 Text,E Text
third,C Value,C Text,,E Text

So there could be an arbitrary number of <d> elements (or none at all). Each CSV line must contain the information from <c> (the older sibling of <d>), the information from one instance of <d>, and the information from <e>, the younger sibling of <d>. (I may be making up this notion of younger and older, but it seems to make sense.)

I haven't worked with XSLT before, so I really don't know where to start with it. The examples I've found for iterating over elements do not make it clear how I can pull values from later in the document (e.g. <e> in the example) and return to another instance of the element (e.g. <d>). I've written an implementation of this in Python with lxml, but would like to see if XSLT is better suited for this transformation.

EDIT: I just saw that an answer was posted which I will be studying in detail to try to understand it. But just to share what I've been working on, here is the XSL that I have been developing. It fails to output anything when there is no <d> value.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <xsl:output method="text" encoding="iso-8859-1" />
    <xsl:strip-space elements="*" />

    <xsl:template match="a">       
        <xsl:for-each select="b">
            <xsl:call-template name="match-b"/>
        </xsl:for-each>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>

    <xsl:template name="match-b" match="b">
        <xsl:for-each select="d">
            <xsl:value-of select="../@name"/>
            <xsl:text>,</xsl:text>
            <xsl:value-of select="../c/@value"/>
            <xsl:text>,</xsl:text>
            <xsl:value-of select="../c"/>
            <xsl:text>,</xsl:text>
            <xsl:value-of select="."/>
            <xsl:text>,</xsl:text>
            <xsl:value-of select="../e"/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>

Output:

first,C Value,C Text,D1 Text,E Text
first,C Value,C Text,D2 Text,E Text
second,C Value,C Text,D1 Text,E Text
second,C Value,C Text,D2 Text,E Text
second,C Value,C Text,D3 Text,E Text
  • The expression language used within XSLT is XPath, so start with a tutorial on XPath; then you will know the terminology and how to navigate from one node to another. For an example of XML to CSV, see stackoverflow.com/questions/50369518/… for instance. Your problem, however, has the difficulty that you seem to want to create data from non-existent d elements; that would require an intermediary step that creates a d within those b[not(d)] elements. Commented May 20, 2018 at 16:10

1 Answer

You can navigate to a preceding or following sibling, or to the parent node, using XPath (the preceding-sibling and following-sibling axes, and .. for the parent). To output a line with XSLT 2 or later, you can use xsl:value-of select="expression-to-compute-column-values" separator=",", as in:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

  <xsl:output method="text"/>

  <xsl:mode name="add-dummy-d" on-no-match="shallow-copy"/>

  <xsl:variable name="normalized-input">
      <xsl:apply-templates mode="add-dummy-d"/>
  </xsl:variable>

  <xsl:template match="b[not(d)]/c" mode="add-dummy-d">
      <xsl:next-match/>
      <d/>
  </xsl:template>

  <xsl:template match="/">
      <xsl:apply-templates select="$normalized-input/a/b/d"/>
  </xsl:template>

  <xsl:template match="d">
      <xsl:value-of select="../@name, preceding-sibling::c/(@value, .), ., following-sibling::e" separator=","/>
      <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>

As said in a comment, an intermediary step that adds an empty d to each b that doesn't have one is needed, or is at least one way to ensure you get the last row of your sample output.

Online sample at https://xsltfiddle.liberty-development.net/eiZQaF7.
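
If you are limited to XSLT 1.0 (lxml is built on libxslt, which implements XSLT 1.0, so the separator attribute and xsl:mode shown above are not available there), a rough sketch without the intermediary step could use one template for d and a second one for any b that has no d. The named template row and its parameters b and d-text are names made up for this illustration; they do not come from the question.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <!-- One CSV line per d element; the parent b supplies the other columns. -->
  <xsl:template match="d">
    <xsl:call-template name="row">
      <xsl:with-param name="b" select=".."/>
      <xsl:with-param name="d-text" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- A b without any d still produces one line, with an empty d column. -->
  <xsl:template match="b[not(d)]">
    <xsl:call-template name="row">
      <xsl:with-param name="b" select="."/>
      <xsl:with-param name="d-text" select="''"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Suppress the built-in copying of text nodes (c, e, whitespace). -->
  <xsl:template match="text()"/>

  <!-- Hypothetical helper: writes one CSV line for a given b and d text. -->
  <xsl:template name="row">
    <xsl:param name="b"/>
    <xsl:param name="d-text"/>
    <xsl:value-of select="concat($b/@name, ',', $b/c/@value, ',', $b/c, ',', $d-text, ',', $b/e)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The built-in template rules walk from the document root down through a and any b that does have d children, so only the d and b[not(d)] templates produce output. This assumes, as in the sample, that each b has exactly one c and one e; in XPath 1.0, concat() takes the string value of only the first node in each node-set.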
