I would like to use XSLT to transform some XML into JSON.
The XML looks like the following:

<DATA_DS>
    <G_1>
        <ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME>
        <ORGANIZATIONID>901</ORGANIZATIONID>
        <ITEMNUMBER>20001</ITEMNUMBER>
        <ITEMDESCRIPTION>Item Description 1</ITEMDESCRIPTION>
    </G_1>
    <G_1>
        <ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME>
        <ORGANIZATIONID>901</ORGANIZATIONID>
        <ITEMNUMBER>20002</ITEMNUMBER>
        <ITEMDESCRIPTION>Item Description 2</ITEMDESCRIPTION>
    </G_1>
    <G_1>
        <ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME>
        <ORGANIZATIONID>901</ORGANIZATIONID>
        <ITEMNUMBER>20003</ITEMNUMBER>
        <ITEMDESCRIPTION>Item Description 3</ITEMDESCRIPTION>
    </G_1>
</DATA_DS>

I expect the JSON to look like the following:

    [
        {
            "Item_Number":"20001",
            "Item_Description":"Item Description 1"
        },
        {
            "Item_Number":"20002",
            "Item_Description":"Item Description 2"
        },
        {
            "Item_Number":"20003",
            "Item_Description":"Item Description 3"
        }
    ]

What is the recommended way to do this?

I am considering two approaches:

  1. Try using the fn:xml-to-json function, as defined at https://www.w3.org/TR/xpath-functions-31/#func-xml-to-json. But as I understand it, the input XML must follow the specific format defined at https://www.w3.org/TR/xpath-functions-31/schema-for-json.xsd, and I need the field names in the output JSON to be exactly "Item_Number" and "Item_Description".

  2. Manually code the bracket and brace characters, "[", "]", "{", and "}", along with the field names "Item_Number" and "Item_Description". Then use a standard function to list the values and ensure that any special characters are handled properly. For example, the "&" character should appear normally in the JSON output.

What is the recommended way to do this, or is there a better way that I have not considered?

  • The question is tagged xslt-2.0, but the built-in JSON support (xml-to-json, the json output method) requires XSLT 3.0. Commented Sep 25, 2019 at 19:09
  • Thanks for pointing this out. I tried the new code in my actual environment, and I confirmed that xml-to-json was able to run properly. Commented Sep 25, 2019 at 20:40

3 Answers

4

I would take the first approach, but start with transforming the given input to the XML format expected by the xml-to-json() function. This could be something like:

XSLT 3.0

<xsl:stylesheet version="3.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/2005/xpath-functions">
<xsl:output method="text" encoding="UTF-8"/>

<xsl:template match="/G_1">
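    <!-- CONVERT INPUT TO XML FOR JSON -->
    <!-- Assumes the flat input quoted in the comments under this answer: a single G_1 root whose children repeat, each record starting at ORGANIZATION_NAME -->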
    <xsl:variable name="xml">
        <array>
            <xsl:for-each-group select="*" group-starting-with="ORGANIZATION_NAME">
                <map>
                    <string key="Item_Number">
                        <xsl:value-of select="current-group()[self::ITEMNUMBER]"/>
                    </string>
                    <string key="Item_Description">
                        <xsl:value-of select="current-group()[self::ITEMDESCRIPTION]"/>
                    </string>
                </map>
            </xsl:for-each-group>
        </array>
    </xsl:variable>
    <!-- OUTPUT -->
    <xsl:value-of select="xml-to-json($xml)"/>
</xsl:template>

</xsl:stylesheet>

Demo: https://xsltfiddle.liberty-development.net/bFWR5DQ

2 Comments

Thank you, michael.hor257k. Actually, I realized later that my original XML had only one set of G_1 tags, and it was missing the DATA_DS tags on the outside. But I was able to change your solution to match the revised XML input.
Here is the original XML before I revised it:

<G_1>
    <ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME>
    <ORGANIZATIONID>901</ORGANIZATIONID>
    <ITEMNUMBER>20001</ITEMNUMBER>
    <ITEMDESCRIPTION>Item Description 1</ITEMDESCRIPTION>
    <ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME>
    <ORGANIZATIONID>901</ORGANIZATIONID>
    <ITEMNUMBER>20002</ITEMNUMBER>
    <ITEMDESCRIPTION>Item Description 2</ITEMDESCRIPTION>
    <ORGANIZATION_NAME>My Company 1</ORGANIZATION_NAME>
    <ORGANIZATIONID>901</ORGANIZATIONID>
    <ITEMNUMBER>20003</ITEMNUMBER>
    <ITEMDESCRIPTION>Item Description 3</ITEMDESCRIPTION>
</G_1>
1

For simple mappings like that, you can also directly construct XPath 3.1 arrays and maps, i.e. in this case an array of maps:

  <xsl:template match="DATA_DS">
      <xsl:sequence select="array { G_1 ! map { 'Item_Number' : string(ITEMNUMBER), 'Item_Description' : string(ITEMDESCRIPTION) } }"/>
  </xsl:template>

Then serialize as JSON with <xsl:output method="json" indent="yes"/>: https://xsltfiddle.liberty-development.net/ejivdGS
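Put together, the template and the output declaration above might look like the following complete stylesheet. This is a minimal sketch, assuming a processor that supports XPath 3.1 arrays and maps together with the json output method (for example a recent Saxon release); the match pattern assumes the DATA_DS root element from the question.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Serialize the result (a single array of maps) as JSON -->
    <xsl:output method="json" indent="yes"/>

    <xsl:template match="/DATA_DS">
        <!-- One map per G_1 element, collected into an array -->
        <xsl:sequence select="array { G_1 ! map { 'Item_Number' : string(ITEMNUMBER), 'Item_Description' : string(ITEMDESCRIPTION) } }"/>
    </xsl:template>

</xsl:stylesheet>
```

The string() calls turn the element nodes into atomic values, so each map entry is serialized as a JSON string.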

The main disadvantage is that maps have no order, so you can't control the order of the entries in a map. For instance, with that example and the Saxon version used, Item_Description is output before Item_Number.

But in general, transforming to the format for xml-to-json provides more flexibility, and it also allows you to control the order, as the function preserves the order in the XML representation of JSON.

3 Comments

Note that Saxon has an extension <xsl:output saxon:property-order="Item_Number Item_Description"/> which can be used to control the order of properties during JSON serialization.
@MichaelKay The order of the properties has no meaning to any system that would use that JSON object, just as the order of map entries makes little sense and is not guaranteed. When we want to preserve a given ordering, the way to do this is to use arrays, whether in a JSON object or in the value of a map entry.
The order makes no difference to software that's reading the data, but it makes a huge difference to any human readers. I added this feature because I have a vocabulary where objects typically have 9 simple-valued properties and 1 tree-valued property (rather like attributes and children in XML), and finding your way around the data is vastly easier if the tree-valued property is output last. We recognise the need for indentation, after all, for human readability, and I found that consistent property order is just as important for readability as indentation.
0

This is the result of taking the solution posted by michael.hor257k and applying it to my revised input XML:

<xsl:stylesheet version="3.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/2005/xpath-functions">
    <xsl:output method="text" encoding="UTF-8"/>

    <xsl:template match="/DATA_DS">
        <!-- CONVERT INPUT TO XML FOR JSON -->
        <xsl:variable name="xml">
            <array>
                <xsl:for-each select="G_1">
                    <map>
                        <string key="Item_Number">
                            <xsl:value-of select="ITEMNUMBER"/>
                        </string>
                        <string key="Item_Description">
                            <xsl:value-of select="ITEMDESCRIPTION"/>
                        </string>
                    </map>
                </xsl:for-each>
            </array>
        </xsl:variable>
        <!-- OUTPUT -->
        <xsl:value-of select="xml-to-json($xml)"/>
    </xsl:template>

</xsl:stylesheet>
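If the output should be indented like the expected JSON in the question, xml-to-json also accepts an options map as a second argument. The variation below is a sketch under that assumption: it is the same stylesheet with the indent option switched on. The exact whitespace produced is left to the processor, so the layout may differ from the sample in the question.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/2005/xpath-functions">
    <xsl:output method="text" encoding="UTF-8"/>

    <xsl:template match="/DATA_DS">
        <!-- Build the XML representation of JSON: an array of maps -->
        <xsl:variable name="xml">
            <array>
                <xsl:for-each select="G_1">
                    <map>
                        <string key="Item_Number">
                            <xsl:value-of select="ITEMNUMBER"/>
                        </string>
                        <string key="Item_Description">
                            <xsl:value-of select="ITEMDESCRIPTION"/>
                        </string>
                    </map>
                </xsl:for-each>
            </array>
        </xsl:variable>
        <!-- Second argument: options map asking for indented output -->
        <xsl:value-of select="xml-to-json($xml, map { 'indent' : true() })"/>
    </xsl:template>

</xsl:stylesheet>
```

Because the output method is text, a value such as "A &amp; B" in the input should appear as "A & B" in the JSON, while characters that JSON itself requires to be escaped (quotes, backslashes) are escaped by xml-to-json.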
