0

I have this xml file that all the rest of the nodes depends on the first one. I created a tree diagram to help illustrate it.

XML Data Schema

The problem I am having is that each node depends on the first one but they do not have any columns that link each other.

Here is the Link to the XML Document

Here is my sql so far

SELECT
        B.RD.query('racedata/todays_cls').value('.','varchar(max)') AS todays_cls,
        B.RD.query('racedata/horsedata/horse_name').value('.', 'varchar(max)') AS horse_name,
        B.RD.query('racedata/horsedata/jockey/jock_disp').value('.', 'varchar(max)') AS jockeyname
        FROM @xmlData.nodes('data') AS B(RD)

Updated SQL Query

0

1 Answer 1

1

Do the second CROSS APPLY on the result of the first one, so that you get <horsedata> elements that corresponds to current <racedata>. See a working demo example below.

sample data :

declare @xml XML = '
<data xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.trackmaster.com/xmlSchema/ppXMLData.xsd">  
  <racedata> 
    <race>1</race>
    <todays_cls>59</todays_cls>
    <distance>800.0</distance>
    <country>USA</country>  
    <horsedata> 
      <horse_name>BROADWAY KATE</horse_name>
      <owner_name>C. Steve Larue</owner_name>
      <program>1</program>
      <pp>1</pp>
      <weight>121</weight>
    </horsedata> 
    <horsedata> 
      <horse_name>UNFAITHFUL</horse_name>
      <owner_name>Melson, James L., Thomson, Ramona and Connell</owner_name>
      <program>2</program>
      <pp>2</pp>
      <weight>121</weight>
    </horsedata> 
  </racedata> 
  <racedata> 
      <race>2</race>
      <todays_cls>87</todays_cls>
      <distance>800.0</distance>
      <country>USA</country>  
      <horsedata> 
        <horse_name>MAGNETIC START</horse_name>
        <owner_name>Vernon D. Coyle</owner_name>
        <program>1</program>
        <pp>1</pp>
        <weight>121</weight>
      </horsedata>  
      <horsedata> 
        <horse_name>SKI POLE</horse_name>
        <owner_name>Downunder Cable, LLC</owner_name>
        <program>2</program>
        <pp>2</pp>
        <weight>121</weight>
      </horsedata> 
    </racedata>
</data>
'

query :

SELECT 
    rd.value('race[1]', 'int') AS race,
    rd.value('todays_cls[1]','int') AS todays_cls,
    hd.value('pp[1]','int') AS pp,
    hd.value('weight[1]','int') AS weight
FROM @xml.nodes('data/racedata') AS B(RD)
CROSS APPLY RD.nodes('horsedata') AS C(HD)

output :

enter image description here


UPDATE

In response to the updated query, you should've shredded on horsedata element level as suggested in the above query, something like this :

SELECT
    B.RD.query('todays_cls').value('.','varchar(max)') AS todays_cls,
    C.HD.query('horse_name').value('.', 'varchar(max)') AS horse_name,
    C.HD.query('jockey/jock_disp').value('.', 'varchar(max)') AS jockeyname
FROM @xml.nodes('data/racedata') AS B(RD)
    CROSS APPLY RD.nodes('horsedata') AS C(HD)
Sign up to request clarification or add additional context in comments.

5 Comments

your code works however when you add data from Race 2 it does not show as Race 2. I edited my question to show you.
@DanNick added the second race to sample data, query remains untouched. Still showing the expected output..
ok I see what I am doing wrong. Add this to the equation: CROSS APPLY RD.nodes('horsedata/jockey') AS D(JD) with a select statement of JD.value('jock_disp[1]','varchar(100)') AS jockey That is when the data shows the wrong information
@DanNick see UPDATE section
Thank you so much for your assistance and patience. I have been working for the past two days on getting this script right.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.