
I have working code that can select a few values from an XML file. The problem is that I have multiple nodes with the same name.

Here is a snippet of the XML:

<wd:Report_Data xmlns:wd="urn:com.workday.report/Countries_and_Their_Address_Components_Summary">
  <wd:Report_Entry>
    <wd:Country wd:Descriptor="Afghanistan">
      <wd:ID wd:type="WID">db69b722446c11de98360015c5e6daf6</wd:ID>
      <wd:ID wd:type="ISO_3166-1_Alpha-2_Code">AF</wd:ID>
      <wd:ID wd:type="ISO_3166-1_Alpha-3_Code">AFG</wd:ID>
      <wd:ID wd:type="ISO_3166-1_Numeric-3_Code">4</wd:ID>
    </wd:Country>
    <wd:Address_Format_Type wd:Descriptor="Basic">
      <wd:ID wd:type="WID">4516bf435611423ea4ee72fa842572a0</wd:ID>
    </wd:Address_Format_Type>
    <wd:Local>1</wd:Local>
    <wd:Address_Components>
      <wd:Address_Component wd:Descriptor="Address Line 1 - Local">
        <wd:ID wd:type="WID">12d859b8df024175a111da2e088250fb</wd:ID>
        <wd:ID wd:type="Address_Component_Type_ID">ADDRESS_LINE_1_LOCAL</wd:ID>
      </wd:Address_Component>
      <wd:Order>a</wd:Order>
      <wd:Required>0</wd:Required>
    </wd:Address_Components>
    <wd:Address_Components>
      <wd:Address_Component wd:Descriptor="Address Line 2 - Local">
        <wd:ID wd:type="WID">85a6ab9412c44dd9a71a7e4760bf17fb</wd:ID>
        <wd:ID wd:type="Address_Component_Type_ID">ADDRESS_LINE_2_LOCAL</wd:ID>
      </wd:Address_Component>
      <wd:Order>b</wd:Order>
      <wd:Required>0</wd:Required>
    </wd:Address_Components>

My SQL is the following:

declare @inputxml table (x xml)

insert @inputxml
select x
from OPENROWSET(BULK 'C:\ParallelTool\addcomp.xml', SINGLE_BLOB) As T(x)

;WITH XMLNAMESPACES(DEFAULT 'urn:com.workday.report/Countries_and_Their_Address_Components_Summary')
    select 
        xmldata.[ISO], xmldata.[Component 1], xmldata.[Component 2], xmldata.[Required]
    into dbo.WD
    from @inputxml
    cross apply (
        select 
            [ISO] = xmldata.value('(Country/ID)[3]', 'VARCHAR(MAX)'),
            [Component 1] = xmldata.value('(Address_Components/Address_Component/ID)[2]', 'VARCHAR(MAX)'),
            [Component 2] = xmldata.value('(Address_Components/Address_Component/ID)[2]', 'VARCHAR(MAX)'),
            [Required] = xmldata.value('(Address_Components/Required)[1]', 'INT')
        from x.nodes('/Report_Data/Report_Entry') Z1(xmldata)
    ) xmldata

The part I can't get is [Component 2]. I want to select every "Address_Component_Type_ID" value in the file, but those ID elements all share the same name and sit under parent nodes that also share the same name. How can I write the query so that it returns all of the component types? Thank you for looking!

1 Answer


It depends on what you want to do. If you know there are exactly two "Address_Components" elements that you want to grab, you can modify your query like so:

;WITH XMLNAMESPACES(DEFAULT 'urn:com.workday.report/Countries_and_Their_Address_Components_Summary')
    select 
        xmldata.[ISO], xmldata.[Component 1], xmldata.[Component 2], xmldata.[Required]
    from @inputxml
    cross apply (
        select 
            [ISO] = xmldata.value('(Country/ID)[3]', 'VARCHAR(MAX)'),
            [Component 1] = xmldata.value('(Address_Components/Address_Component/ID)[2]', 'VARCHAR(MAX)'),
            [Component 2] = xmldata.value('(Address_Components[2]/Address_Component/ID)[2]', 'VARCHAR(MAX)'),
            [Required] = xmldata.value('(Address_Components/Required)[1]', 'INT')
        from x.nodes('/Report_Data/Report_Entry') Z1(xmldata)
    ) xmldata

And the results look like this:

ISO   Component 1               Component 2               Required
----- ------------------------- ------------------------- -----------
AFG   ADDRESS_LINE_1_LOCAL      ADDRESS_LINE_2_LOCAL      0

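
To see why the index has to go on the Address_Components step, here is a minimal sketch on a made-up document (the element names r, g and id are hypothetical, not from the report). An index after a parenthesized path counts across the whole flattened sequence of matches, while an index on a single step picks that sibling first:

```sql
-- Hypothetical document: two groups, each holding two id elements.
declare @doc xml = N'<r><g><id>a1</id><id>a2</id></g><g><id>b1</id><id>b2</id></g></r>';

select
    -- (path)[2]: second node of the whole sequence a1, a2, b1, b2
    [whole_sequence_2] = @doc.value('(/r/g/id)[2]', 'VARCHAR(10)'),    -- a2
    -- g[2] narrows to the second group first, then [2] picks its second id
    [second_group_2]   = @doc.value('(/r/g[2]/id)[2]', 'VARCHAR(10)'); -- b2
```

The same reasoning applies to the query above: without `[2]` on Address_Components, the second ID in the sequence is always the one from the first component.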
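However, if there can be any number of "Address_Components" elements and you want each one as a separate row, you can rewrite your query like this. The second cross apply calls nodes() on the current Report_Entry, so each entry is matched only with its own components: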

;WITH XMLNAMESPACES(DEFAULT 'urn:com.workday.report/Countries_and_Their_Address_Components_Summary')
    select 
        [ISO] = Report_Entry.x.value('(Country/ID)[3]', 'VARCHAR(MAX)')
        , [Component] = Address_Components.x.value('(Address_Component/ID)[2]', 'VARCHAR(MAX)')
        , [Required] = Address_Components.x.value('(Required)[1]', 'INT')
    from @inputxml
    cross apply x.nodes('/Report_Data/Report_Entry') Report_Entry(x)
    cross apply Report_Entry.x.nodes('./Address_Components') Address_Components (x)

And the results look like this:

ISO   Component                 Required
----- ------------------------- -----------
AFG   ADDRESS_LINE_1_LOCAL      0
AFG   ADDRESS_LINE_2_LOCAL      0
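
If the order of the ID elements is not guaranteed, one possible variation is to select them by their wd:type attribute instead of by position. The sketch below is self-contained and makes several assumptions: the second Report_Entry (Albania and its component value) is invented for illustration, the XML is trimmed to the elements the query reads, and the namespace is bound to the prefix wd rather than declared as DEFAULT, because the wd:type attribute is namespace-qualified and a default namespace does not apply to attributes.

```sql
-- Hypothetical two-country sample; only the first entry comes from the question.
declare @x xml = N'
<wd:Report_Data xmlns:wd="urn:com.workday.report/Countries_and_Their_Address_Components_Summary">
  <wd:Report_Entry>
    <wd:Country wd:Descriptor="Afghanistan">
      <wd:ID wd:type="ISO_3166-1_Alpha-3_Code">AFG</wd:ID>
    </wd:Country>
    <wd:Address_Components>
      <wd:Address_Component>
        <wd:ID wd:type="Address_Component_Type_ID">ADDRESS_LINE_1_LOCAL</wd:ID>
      </wd:Address_Component>
      <wd:Required>0</wd:Required>
    </wd:Address_Components>
  </wd:Report_Entry>
  <wd:Report_Entry>
    <wd:Country wd:Descriptor="Albania">
      <wd:ID wd:type="ISO_3166-1_Alpha-3_Code">ALB</wd:ID>
    </wd:Country>
    <wd:Address_Components>
      <wd:Address_Component>
        <wd:ID wd:type="Address_Component_Type_ID">ADDRESS_LINE_1</wd:ID>
      </wd:Address_Component>
      <wd:Required>1</wd:Required>
    </wd:Address_Components>
  </wd:Report_Entry>
</wd:Report_Data>';

WITH XMLNAMESPACES('urn:com.workday.report/Countries_and_Their_Address_Components_Summary' AS wd)
select
    -- Filter on the attribute value, so the position of the ID no longer matters
    [ISO] = Report_Entry.x.value('(wd:Country/wd:ID[@wd:type="ISO_3166-1_Alpha-3_Code"])[1]', 'VARCHAR(3)')
    , [Component] = Address_Components.x.value('(wd:Address_Component/wd:ID[@wd:type="Address_Component_Type_ID"])[1]', 'VARCHAR(100)')
    , [Required] = Address_Components.x.value('(wd:Required)[1]', 'INT')
from @x.nodes('/wd:Report_Data/wd:Report_Entry') Report_Entry(x)
-- Relative path: only the components inside the current Report_Entry
cross apply Report_Entry.x.nodes('wd:Address_Components') Address_Components(x);
```

With this sample it should return two rows, one per Address_Components element (AFG with ADDRESS_LINE_1_LOCAL and ALB with ADDRESS_LINE_1), rather than every component paired with every country.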

3 Comments

This is great. It is definitely a huge step toward what I need. You can't tell from the snippet I posted, but I actually have multiple ISOs. Because of that, when I use cross apply, everything gets multiplied out and appears duplicated. Instead of only 1,300 rows, I am getting 299,000 rows (230 ISOs * 1,300 address components). It is essentially cross applying to every node rather than just the ones under the current node. I've been playing with it for a while with no luck. Is there any way to get it to cross apply only within the current ISO?
To further clarify, they are not exactly duplicates. I worded that incorrectly; they only appear to be duplicates. What it is doing is grabbing every single address component in the file and applying it to every single ISO, whereas I only want the address components from WITHIN the respective ISO.
@bheltzel You are right. The way I wrote the cross apply to the Address_Components, it was pulling all address components from the root, instead of just the subelements of the current ISO element. I corrected it now. If it works for you, please mark the answer as correct.
