1

Take the following simplified XML:

(An excerpt from a very expensive (and large) XML product feed from an 3rd party industry wholesaler data provider - i.e. I have no control over its schema / format / content)

<Company>
    <Code>7786</Code>
    <Brand>
        <!-- /../ -->
        <Groups>
            <Group>
                <!-- /../ -->
                <Group>
                    <!-- /../ -->
                    <Group>
                        <!-- /../ -->
                        <Product>
                            <Pip_code>3623949</Pip_code>
                            <!-- /../ -->
                            <Other_Codes>
                                <Other_Code>
                                    <Code_Description>EAN</Code_Description>
                                    <Code_Value>5013158781351</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>shipper EAN</Code_Description>
                                    <Code_Value>503158781443</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>AMPP</Code_Description>
                                    <Code_Value>19192411000001107</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>AMPP Manf</Code_Description>
                                    <Code_Value>2061801000001104</Code_Value>
                                </Other_Code>
                            </Other_Codes>
                        </Product>
                    </Group>
                </Group>
            </Group>
        </Groups>
    </Brand>
    <Brand>
        <!-- /../ -->
        <Groups>
            <Group>
                <!-- /../ -->
                <Product>
                    <Pip_code>3265725</Pip_code>
                    <!-- /../ -->
                    <Other_Codes>
                        <Other_Code>
                            <Code_Description>Outer EAN</Code_Description>
                            <Code_Value>5013158776531</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>AMPP</Code_Description>
                            <Code_Value>11521811000001106</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>AMPP Manf</Code_Description>
                            <Code_Value>2061801000001104</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>EAN</Code_Description>
                            <Code_Value>5013158776500</Code_Value>
                        </Other_Code>
                    </Other_Codes>
                </Product>
            </Group>
        </Groups>
    </Brand>
</Company>

I have been tasked with producing the following output:

Target output

This is proving to be extremely difficult for two reasons:

  1. You will notice that <product> 3623949 exists within Groups/Group/Group/Group/ (3 levels of Group) whereas <product> 3265725 exists within Groups/Group/ (1 level of Group) - I need to select all <product> nodes regardless of their nesting level (ideally whenever parent node = <Group>)

  2. Note the <Other_Codes> node within <product> - I need to select the repeating child nodes (Code_Description & Code_Value) as two seperate columns, but note also that there could be any number of <Other_Code> nodes within <Other_Codes> and I need to select them all (rendering [i] notation useless)

I've tried my best to get this data using TSQL OPENXML, but the only way I can think of doing it requires me to duplicate code for the various Group/Group/ levels and I also have to hard code a set number of occurrences of Other_Code using [0] notation - which given their variable nature; is wrong.

SELECT  *
FROM    OPENXML (@idoc, 'Company/Brand/Groups/Group/Product',2)
                        WITH (
                          PIP_code CHAR (20) 'Pip_code',
                          OtherCodeType CHAR (20) 'Other_Codes/Other_Code[1]/Code_Description',
                          OtherCodeValue CHAR (30) 'Other_Codes/Other_Code[1]/Code_Value',
                          OtherCodeType2 CHAR (20) 'Other_Codes/Other_Code[2]/Code_Description',
                          OtherCodeValue2 CHAR (30) 'Other_Codes/Other_Code[2]/Code_Value',
                          OtherCodeType3 CHAR (20) 'Other_Codes/Other_Code[3]/Code_Description',
                          OtherCodeValue3 CHAR (30) 'Other_Codes/Other_Code[3]/Code_Value'
                  )

We would prefer to do this in TSQL (as this XML file is already being processed there for other nodes & paths not relevant here), we have SQL Server 2008, 2008 R2 & SQL Server 2014 at our disposal, but non-SQL solutions would also be helpful - we really have hit a brick wall on this one.

1 Answer 1

2

So what you really want to do is have your selection statement be //Other_Codes - that double forward slash says you want every Other_Codes node, regardless of its location in the hierarchy, giving you one row for each Other_Codes node. Then, you can specify that PipCode is ../../Pip_code, assuming of course this hierarchy doesn't change all the time for this portion as well (but how would that even semantically make sense?).

This:

DECLARE @doc xml = '<Company>
    <Code>7786</Code>
    <Brand>
        <!-- /../ -->
        <Groups>
            <Group>
                <!-- /../ -->
                <Group>
                    <!-- /../ -->
                    <Group>
                        <!-- /../ -->
                        <Product>
                            <Pip_code>3623949</Pip_code>
                            <!-- /../ -->
                            <Other_Codes>
                                <Other_Code>
                                    <Code_Description>EAN</Code_Description>
                                    <Code_Value>5013158781351</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>shipper EAN</Code_Description>
                                    <Code_Value>503158781443</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>AMPP</Code_Description>
                                    <Code_Value>19192411000001107</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>AMPP Manf</Code_Description>
                                    <Code_Value>2061801000001104</Code_Value>
                                </Other_Code>
                            </Other_Codes>
                        </Product>
                    </Group>
                </Group>
            </Group>
        </Groups>
    </Brand>
    <Brand>
        <!-- /../ -->
        <Groups>
            <Group>
                <!-- /../ -->
                <Product>
                    <Pip_code>3265725</Pip_code>
                    <!-- /../ -->
                    <Other_Codes>
                        <Other_Code>
                            <Code_Description>Outer EAN</Code_Description>
                            <Code_Value>5013158776531</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>AMPP</Code_Description>
                            <Code_Value>11521811000001106</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>AMPP Manf</Code_Description>
                            <Code_Value>2061801000001104</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>EAN</Code_Description>
                            <Code_Value>5013158776500</Code_Value>
                        </Other_Code>
                    </Other_Codes>
                </Product>
            </Group>
        </Groups>
    </Brand>
</Company>';

DECLARE @idoc int;

exec sp_xml_preparedocument @idoc OUTPUT, @doc;
SELECT  *
FROM    OPENXML(@idoc, '//Other_Code',1)
WITH (
    PipCode CHAR (20) '../../Pip_code',
    Code_Description CHAR(20) 'Code_Description',
    Code_Value CHAR (30) 'Code_Value'
)

Produces this:

PipCode                 Code_Description        Code_Value
-----------------------------------------------------------------
3623949                 EAN                     5013158781351                 
3623949                 shipper EAN             503158781443                  
3623949                 AMPP                    19192411000001107             
3623949                 AMPP Manf               2061801000001104              
3265725                 Outer EAN               5013158776531                 
3265725                 AMPP                    11521811000001106             
3265725                 AMPP Manf               2061801000001104              
3265725                 EAN                     5013158776500                 
Sign up to request clarification or add additional context in comments.

1 Comment

Or someone who's more accustomed to XSLT than SQL ;) glad to help

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.