0

I've got an XML file like this and I'm working with SQL 2014 SP2

<?xml version='1.0' encoding='UTF-8'?>
<gwl>
<version>123456789</version>
<entities>
<entity id="1" version="123456789">
    <name>xxxxx</name>
    <listId>0</listId>
    <listCode>Oxxx</listCode>
    <entityType>08</entityType>
    <createdDate>03/03/1993</createdDate>
    <lastUpdateDate>05/06/2011</lastUpdateDate>
    <source>src</source>
    <OriginalSource>o_src</OriginalSource>
    <aliases>
       <alias category="STRONG" type="Alias">USCJSC</alias>
        <alias category="WEAK" type="Alias">'OSKOAO'</alias>
    </aliases>
    <programs>
        <program type="21">prog</program>
    </programs>
    <sdfs>
        <sdf name="OriginalID">9876</sdf>
    </sdfs>
    <addresses>
        <address>
            <address1>1141, SYA-KAYA STR.</address1>
            <country>RU</country>
            <postalCode>1234</postalCode>
        </address>
        <address>
            <address1>90, MARATA UL.</address1>
            <country>RU</country>
            <postalCode>1919</postalCode>
        </address>
    </addresses>
    <otherIds>
        <childId>737606</childId>
        <childId>737607</childId>
    </otherIds>
</entity>
</entities>
</gwl>

I made a script to insert data from the XML to a SQL table. How can I insert child node into a table? I think I should replicate the row for each new child node but i don't know the best way to proceed.

Here is my SQL code

   DECLARE @InputXML XML


   SELECT @InputXML = CAST(x AS XML)
   FROM OPENROWSET(BULK 'C:\MyFiles\sample.XML', SINGLE_BLOB) AS T(x)

    SELECT 
    product.value('(@id)[1]', 'NVARCHAR(10)') id, 
    product.value('(@version)[1]', 'NVARCHAR(14)')  ID
    product.value('(name[1])', 'NVARCHAR(255)') name,
    product.value('(listId[1])', 'NVARCHAR(9)')listId,
    product.value('(listCode[1])', 'NVARCHAR(10)')listCode,
    product.value('(entityType[1])', 'NVARCHAR(2)')entityType,
    product.value('(createdDate[1])', 'NVARCHAR(10)')createdDate,
    product.value('(lastUpdateDate[1])', 'NVARCHAR(10)')lastUpdateDate,
    product.value('(source[1])', 'NVARCHAR(15)')source,
    product.value('(OriginalSource[1])', 'NVARCHAR(50)')OriginalSource,
    product.value('(aliases[1])', 'NVARCHAR(50)')aliases,
    product.value('(programs[1])', 'NVARCHAR(50)')programs,
    product.value('(sdfs[1])', 'NVARCHAR(500)')sdfs,
    product.value('(addresses[1])', 'NVARCHAR(50)')addresses,
    product.value('(otherIDs[1])', 'NVARCHAR(50)')otherIDs

    FROM @InputXML.nodes('gwl/entities/entity') AS X(product)
1
  • Thx for the good question with sample data and own approach, +1 from my side Commented Mar 11, 2020 at 14:01

1 Answer 1

2

You have a lot of different children here...

Just to show the principles:

DECLARE @xml XML=
N'<gwl>
  <version>123456789</version>
  <entities>
    <entity id="1" version="123456789">
      <name>xxxxx</name>
      <listId>0</listId>
      <listCode>Oxxx</listCode>
      <entityType>08</entityType>
      <createdDate>03/03/1993</createdDate>
      <lastUpdateDate>05/06/2011</lastUpdateDate>
      <source>src</source>
      <OriginalSource>o_src</OriginalSource>
      <aliases>
        <alias category="STRONG" type="Alias">USCJSC</alias>
        <alias category="WEAK" type="Alias">''OSKOAO''</alias>
      </aliases>
      <programs>
        <program type="21">prog</program>
      </programs>
      <sdfs>
        <sdf name="OriginalID">9876</sdf>
      </sdfs>
      <addresses>
        <address>
          <address1>1141, SYA-KAYA STR.</address1>
          <country>RU</country>
          <postalCode>1234</postalCode>
        </address>
        <address>
          <address1>90, MARATA UL.</address1>
          <country>RU</country>
          <postalCode>1919</postalCode>
        </address>
      </addresses>
      <otherIds>
        <childId>737606</childId>
        <childId>737607</childId>
      </otherIds>
    </entity>
  </entities>
</gwl>'; 

-The query will fetch some values from several places.
--It should be easy to get the rest yourself...

SELECT @xml.value('(/gwl/version/text())[1]','bigint') AS [version]
      ,A.ent.value('(name/text())[1]','nvarchar(max)') AS [Entity_Name]
      ,A.ent.value('(listId/text())[1]','int') AS Entity_ListId
      --more columns taken from A.ent
      ,B.als.value('@category','nvarchar(max)') AS Alias_Category
      ,B.als.value('text()[1]','nvarchar(max)') AS Alias_Content
      --similar for programs and sdfs
      ,E.addr.value('(address1/text())[1]','nvarchar(max)') AS Address_Address1
      ,E.addr.value('(country/text())[1]','nvarchar(max)') AS Address_Country
      --and so on
FROM @xml.nodes('/gwl/entities/entity') A(ent)
OUTER APPLY A.ent.nodes('aliases/alias') B(als)
OUTER APPLY A.ent.nodes('programs/program') C(prg)
OUTER APPLY A.ent.nodes('sdfs/sdf') D(sdfs)
OUTER APPLY A.ent.nodes('addresses/address') E(addr)
OUTER APPLY A.ent.nodes('otherIds/childId') F(ids);

The idea in short:

  • We read non-repeating values (e.g. version) from the xml variable directly
  • We use .nodes() to return repeating elements as derived sets.
  • We can use a cascade of .nodes() to dive deeper into repeating child elements by using a relativ Xpath (no / at the beginning).

You have two approaches:

  1. Read the XML like above into a staging table (simply by adding INTO #tmpTable before FROM) and proceed from there (will need one SELECT ... GROUP BY for each type of child).
  2. Create one SELECT per type of child, using only one of the APPLY lines and shift the data into specific child tables.

I would tend to the first one.
This allows to do some cleaning, generate IDs, check for business rules, before you shift this into the target tables.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.