0

Fairly new to XML parsing in SQL Server. Here's what I have and what I'm trying to do.

I have a table with many rows similar to this:

+-------------------+------------------------------------+
| EDI_Assessment_ID |              XML_TEXT              |
+-------------------+------------------------------------+
|             12345 | text column containing XML         |
|             12346 | text column containing XML         |
+-------------------+------------------------------------+

The XML_Text column has a large XML text similar to this structure (i've simplified and only pasted the relevant portions of it:

<Assessment>
<ADLs>
  <ADL_Group>
     <ADL>bathing</ADL>
     <Mapped_ADL Source="Calypso">Bathing</Mapped_ADL>
     <ADL_Level>Requires only equipment to complete ADL</ADL_Level>
     <Mapped_ADL_Level Source="Calypso">Independent</Mapped_ADL_Level>
     <ADL_Equipment>HH shower</ADL_Equipment>
     <ADL_Assisted_By_Info>
        <ADL_Assisted_By>No one</ADL_Assisted_By>
     </ADL_Assisted_By_Info>
  </ADL_Group>
  <ADL_Group>
     <ADL>Continence-Bowel</ADL>
     <Mapped_ADL Source="Calypso">Continence</Mapped_ADL>
     <ADL_Level>Independent</ADL_Level>
     <Mapped_ADL_Level Source="B/A">Independent</Mapped_ADL_Level>
     <ADL_Equipment />
     <ADL_Assisted_By_Info>
        <ADL_Assisted_By>No one</ADL_Assisted_By>
     </ADL_Assisted_By_Info>
  </ADL_Group>
</Assessment>

How can i parse through the XML for each row in the table to return:

  1. The ADL (bathing, Continence-Bowel) and
  2. the ADL_Assisted_By_Info

I'm looking for the result set to return similar to this:

+-------------------+-------------+----------------------+------------------+----------------------+
| EDI_Assessment_ID |   Bathing   | ADL_Assisted_By_Info | Continence-Bowel | ADL_Assisted_By_Info |
+-------------------+-------------+----------------------+------------------+----------------------+
|             12345 | Independent | No one               | Independent      | No one               |
+-------------------+-------------+----------------------+------------------+----------------------+
7
  • Out of interest, why are you using varchar(MAX) to store xml and not the xml datatype? You can't use XQUERY on a varchar, so that's automatically making this task harder that it needs to be. if the reason is because the XML isn't valid xml (so, when trying to store it as an xml you get an error), then this is far far harder. Ideally you should be fixing the XML first so that it's valid. Commented Nov 13, 2018 at 15:01
  • whoops, that's a mistake in my question. it's actually a "text" data type....just FYI, I'm a data analyst so any table design questions I won't be able to answer. I'll edit my question with the correct data type! Commented Nov 13, 2018 at 15:07
  • That's even worse, text has been deprecated since at least SQL Server 2005. That doesn't answer by question though. Why are you using varchar(MAX)/text? There's an xml datatype and XML should be stored as xml. Commented Nov 13, 2018 at 15:08
  • No idea...I'm just using what I've been given. Sure sounds like whoever designed this table could have done it better. Commented Nov 13, 2018 at 15:10
  • Is changing the data type an option? Otherwise you'll have to CONVERT the value of XML_TEXT to xml every time you want to use XQUERY against it. That's going to slow the query down, as the data engine will have to validate every row you're addressing to ensure it is valid XML. Commented Nov 13, 2018 at 15:12

1 Answer 1

1

These solutions rely on something which, from the OP's sample data is not true; that the column of the datatype text has valid XML. The sample data is not, so this solution will not work against the sample data they have provided.

In fact, if all of the OP's sample data is poorly formed XML then they SQL Server is completely the wrong choice here. They should, ideally, be fixing their data first, and then changing the datatype to xml so that more bad XML can't in inserted into the database.

If, for whatever reason, they can't do that then they will need to find a different solution. SQL Server, however, isn't the solution. You're going to need something that is very good as string manipulation and work out the values that way. if you're doing this at a (large) dataset value then the process is probably going to slow down to a crawl.

Anyway, onto the point. Note the comment. There are 2 solution, the first, other than the validity, assumes that the bathing node is always the first ADL_GROUP element, and that Continence-Bowel is always the second:

WITH VTE AS(
    SELECT 12345 AS ID,
           CONVERT(text,
'<Assessment>
<ADLs>
  <ADL_Group>
     <ADL>bathing</ADL>
     <Mapped_ADL Source="Calypso">Bathing</Mapped_ADL>
     <ADL_Level>Requires only equipment to complete ADL</ADL_Level>
     <Mapped_ADL_Level Source="Calypso">Independent</Mapped_ADL_Level>
     <ADL_Equipment>HH shower</ADL_Equipment>
     <ADL_Assisted_By_Info>
        <ADL_Assisted_By>No one</ADL_Assisted_By>
     </ADL_Assisted_By_Info>
  </ADL_Group>
  <ADL_Group>
     <ADL>Continence-Bowel</ADL>
     <Mapped_ADL Source="Calypso">Continence</Mapped_ADL>
     <ADL_Level>Independent</ADL_Level>
     <Mapped_ADL_Level Source="B/A">Independent</Mapped_ADL_Level>
     <ADL_Equipment />
     <ADL_Assisted_By_Info>
        <ADL_Assisted_By>No one</ADL_Assisted_By>
     </ADL_Assisted_By_Info>
  </ADL_Group>
</ADLs>' + --I have added this line to make the XML valid. The sample you have will NOT work, as it is not valid XML
'</Assessment>') AS XML_Text
)
SELECT V.ID,
       X.XML_Type,
       T.AA.value('(ADL_Group/Mapped_ADL_Level/text())[1]','varchar(30)') AS Bathing,
       T.AA.value('(ADL_Group/ADL_Assisted_By_Info/ADL_Assisted_By/text())[1]','varchar(30)') AS ADL_Assisted_By_Info,
       T.AA.value('(ADL_Group/Mapped_ADL_Level/text())[2]','varchar(30)') AS ContinenceBowel,
       T.AA.value('(ADL_Group[2]/ADL_Assisted_By_Info/ADL_Assisted_By/text())[1]','varchar(30)') AS ADL_Assisted_By_Info
FROM VTE V
     CROSS APPLY (VALUES(TRY_CONVERT(xml, V.XML_Text))) X(XML_Type)
     CROSS APPLY X.XML_Type.nodes('/Assessment/ADLs') T(AA);

If, however, that isn't true and there could be other nodes in play, with different values, then you could do the following for the SELECT (CTE not included):

SELECT V.ID,
       X.XML_Type,
       B.AG.value('(Mapped_ADL_Level/text())[1]','varchar(30)') AS Bathing,
       B.AG.value('(ADL_Assisted_By_Info/ADL_Assisted_By/text())[1]','varchar(30)') AS ADL_Assisted_By_Info,
       CB.AG.value('(Mapped_ADL_Level/text())[1]','varchar(30)') AS ContinenceBowel,
       CB.AG.value('(ADL_Assisted_By_Info/ADL_Assisted_By/text())[1]','varchar(30)') AS ADL_Assisted_By_Info
FROM VTE V
     CROSS APPLY (VALUES(TRY_CONVERT(xml, V.XML_Text))) X(XML_Type)
     CROSS APPLY X.XML_Type.nodes('/Assessment/ADLs/ADL_Group') B(AG)
     CROSS APPLY X.XML_Type.nodes('/Assessment/ADLs/ADL_Group') CB(AG)
WHERE B.AG.value('(ADL/text())[1]','varchar(30)') = 'bathing'
  AND CB.AG.value('(ADL/text())[1]','varchar(30)') = 'Continence-Bowel';
Sign up to request clarification or add additional context in comments.

1 Comment

Larnu - thank you for taking the time to explain. This is plenty enough to get me up and running (and I'll take your advice to see if the correct persons can change the datatype!)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.