1

Case: A table has a field with some XML code.

-- Some XML
'<DTS:ConnectionManager DTS:refId="Package.ConnectionManagers[MTS]" DTS:CreationName="FLATFILE" DTS:DTSID="{296732CC-7D91-4E49-ACD4-384E03BC032E}" DTS:ObjectName="MTS">
    <DTS:PropertyExpression DTS:Name="ConnectionString">@Something</DTS:PropertyExpression>
    <DTS:ObjectData>
        <DTS:ConnectionManager DTS:Format="Delimited" DTS:LocaleID="1033" DTS:HeaderRowDelimiter="_x000D__x000A_" DTS:ColumnNamesInFirstDataRow="True" DTS:RowDelimiter="" DTS:TextQualifier="_x0022_" DTS:CodePage="1252" DTS:ConnectionString="C:\Folder\\File.csv">
            <DTS:FlatFileColumns>
                <DTS:FlatFileColumn DTS:ColumnType="Delimited" DTS:ColumnDelimiter="_x002C_" DTS:MaximumWidth="50" DTS:DataType="129" DTS:TextQualified="True" DTS:ObjectName="MC" DTS:DTSID="{E87E7707-B7F7-4EC6-A2CB-98AD637A3985}" DTS:CreationName="" />
                <DTS:FlatFileColumn DTS:ColumnType="Delimited" DTS:ColumnDelimiter="_x002C_" DTS:DataType="6" DTS:TextQualified="True" DTS:ObjectName="PP" DTS:DTSID="{C7B97962-3B43-40C5-82B1-F6136906CD84}" DTS:CreationName="" />
            </DTS:FlatFileColumns>
        </DTS:ConnectionManager>
    </DTS:ObjectData>
</DTS:ConnectionManager>'
-- Some more XML

Would like to pull out some information and store it as a tabular format.

Desired output

CreationName    ObjectName  ConnectionString        MaximumWidth    DataType    FieldName
FLATFILE        MTS         C:\Folder\\File.csv     50              129         MC
FLATFILE        MTS         C:\Folder\\File.csv     NULL            6           PP

Explanation of connecting input with the output

CreationName - DTS:CreationName from DTS:ConnectionManager. i.e. FLATFILE
ObjectName - DTS:ObjectName from DTS:ConnectionManager. i.e. MTS
ConnectionString - DTS:ConnectionString from DTS:ObjectData\DTS:ConnectionManager. i.e. "C:\Folder\\File.csv"
MaximumWidth - DTS:MaximumWidth from DTS:FlatFileColumns i.e. 50 -- NOTE: MaximumWidth might not always exist
DataType - DTS:DataType from DTS:FlatFileColumns i.e. 129
FieldName - DTS:ObjectName from DTS:FlatFileColumns i.e. MC

Don't really have much experience with XML in SQL Server. (I'll be doing some of my own playing around and post it here if I get somewhere meaningful. :) )

UPDATED XML Example

    <DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts" DTS:refId="P" DTS:CreationDate="10/01/2015 12:00:00">
  <DTS:ConnectionManagers>
    <DTS:ConnectionManager DTS:refId="Package.ConnectionManagers[FF]" DTS:CreationName="FLATFILE" DTS:DTSID="{123}" DTS:ObjectName="FF">
      <DTS:ObjectData>
        <DTS:ConnectionManager DTS:Format="Delimited" DTS:LocaleID="1033" DTS:HeaderRowDelimiter="_x000D__x000A_" DTS:ColumnNamesInFirstDataRow="True" DTS:RowDelimiter="" DTS:TextQualifier="_x0022_" DTS:CodePage="1252" DTS:ConnectionString="Test.csv">
          <DTS:FlatFileColumns>
            <DTS:FlatFileColumn DTS:ColumnType="Delimited" DTS:ColumnDelimiter="_x002C_" DTS:DataType="11" DTS:TextQualified="True" DTS:ObjectName="TestCN" DTS:DTSID="{012}" DTS:CreationName="" />
          </DTS:FlatFileColumns>
        </DTS:ConnectionManager>
      </DTS:ObjectData>
    </DTS:ConnectionManager>
    <DTS:ConnectionManager DTS:refId="Package.ConnectionManagers[FF2]" DTS:CreationName="FLATFILE" DTS:DTSID="{123}" DTS:ObjectName="FF2">
      <DTS:ObjectData>
        <DTS:ConnectionManager DTS:Format="Delimited" DTS:LocaleID="1033" DTS:HeaderRowDelimiter="_x000D__x000A_" DTS:ColumnNamesInFirstDataRow="True" DTS:RowDelimiter="" DTS:TextQualifier="_x0022_" DTS:CodePage="1252" DTS:ConnectionString="Test2.csv">
          <DTS:FlatFileColumns>
            <DTS:FlatFileColumn DTS:ColumnType="Delimited" DTS:ColumnDelimiter="_x002C_" DTS:DataType="11" DTS:TextQualified="True" DTS:ObjectName="TestCN2" DTS:DTSID="{012}" DTS:CreationName="" />
          </DTS:FlatFileColumns>
        </DTS:ConnectionManager>
      </DTS:ObjectData>
    </DTS:ConnectionManager>
  </DTS:ConnectionManagers>
</DTS:Executable>

3 Answers 3

2

You are not declaring your namespace in your root element so I substituted that. This should be self extracting and run in anything I am guessing 2008 and higher, though I wrote it in 2014. Just pop it into SQL Server Management Studio:

UPDATED 1:45 PM PST:

Thanks to Shnugo for the simplification of the 'With XMLNamespaces'.

DECLARE @XML XML = '
<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts" DTS:refId="P" DTS:CreationDate="10/01/2015 12:00:00">
  <DTS:ConnectionManagers>
    <DTS:ConnectionManager DTS:refId="Package.ConnectionManagers[FF]" DTS:CreationName="FLATFILE" DTS:DTSID="{123}" DTS:ObjectName="FF">
      <DTS:ObjectData>
        <DTS:ConnectionManager DTS:Format="Delimited" DTS:LocaleID="1033" DTS:HeaderRowDelimiter="_x000D__x000A_" DTS:ColumnNamesInFirstDataRow="True" DTS:RowDelimiter="" DTS:TextQualifier="_x0022_" DTS:CodePage="1252" DTS:ConnectionString="Test.csv">
          <DTS:FlatFileColumns>
            <DTS:FlatFileColumn DTS:ColumnType="Delimited" DTS:ColumnDelimiter="_x002C_" DTS:DataType="11" DTS:TextQualified="True" DTS:ObjectName="TestCN" DTS:DTSID="{012}" DTS:CreationName="" />
          </DTS:FlatFileColumns>
        </DTS:ConnectionManager>
      </DTS:ObjectData>
    </DTS:ConnectionManager>
  </DTS:ConnectionManagers>
</DTS:Executable>'
;

WITH XMLNAMESPACES (N'www.microsoft.com/SqlServer/Dts' as DTS )
SELECT 
    y.vals.query('.') AS NodesAsExtracted
,   x.vals.value('@DTS:CreationName', 'Varchar(255)') AS CreationName
,   x.vals.value('@DTS:ObjectName', 'Varchar(255)') AS ObjectName
,   y.vals.value('@DTS:ConnectionString', 'Varchar(255)') AS ConnectionString
,   x.vals.value('@DTS:ColumnType', 'Varchar(255)') AS ColumnType
,   x.vals.value('@DTS:MaximumWidth', 'Varchar(255)') AS MaximumWidth
FROM @XML.nodes('/DTS:Executable/DTS:ConnectionManagers/DTS:ConnectionManager/DTS:ObjectData/DTS:ConnectionManager') AS y(vals)
    CROSS APPLY @XML.nodes('/DTS:Executable/DTS:ConnectionManagers/DTS:ConnectionManager/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn') AS x(vals)


/*
The key piece is you are extracting data with a namespace, which makes things harder when querying.
You need to repeat certain 'nodes' so there is a syntax for that called originally enough 'nodes' that breaks up a 3d object like xml into multiple bits
I do one for the high level and one for the lower and then cross apply them which really is a whole world into itself I won't mention here
It should be represented as a parent 'x' and the values found 'vals'
I showed an example as is first when I query '('.')' which is everything in essence.
My namespace declaration must match on the xml that exists and the declaration.

more on nodes https://msdn.microsoft.com/en-us/library/ms188282.aspx
more on query https://msdn.microsoft.com/en-us/library/ms191474.aspx
more on value https://msdn.microsoft.com/en-us/library/ms178030.aspx
*/
Sign up to request clarification or add additional context in comments.

11 Comments

thanks buddy. This sln works for the string in the original post. But when I add another "node" (the root level node I guess is the proper term?!), it doesn't give me any output. <DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts" DTS:refId=...." Does it have anything to with xmlns namespace?
Try to set the root element's name in the last line CROSS APPLY... after the semicolon. The nodes are read along a path (called "XPATH") and this path starts with your root...
If I change the last line to "CROSS APPLY @XML.nodes('declare namespace DTS="DTS"; /DTS:Executable/DTS:ConnectionManager/DTS:ObjectData/DTS:ConnectionManager') AS y(vals)", it still doesn't give any output (I also added it the prev line and also no output) :(
Are there two semicolons? Should be 'declare namespace DTS="http://DTS"; /DTS:Executable/...
@007, you can use the XPath-Filter with the square brackets: using "/root[1]/row[3]/item[@attr='test']" would find the item with an attribut called "attr" with the value "test" in the third row. There's much more you can do: certain functions, logical expressions, usage of outer variables [look for sql: variable() or sql:column()] ... Happy coding!
|
2

This is an enhancment to the answer of djangojazz. Don't accept this, it's just a copy (but you may vote up, if you like it ;-) ...

By using WITH XMLNAMESPACES you can avoid the multiple declaration of the namespace:

WITH XMLNAMESPACES (N'http://DTS' as DTS )
SELECT 
    x.vals.query('.') AS NodesAsExtracted
,   x.vals.value('@DTS:CreationName', 'Varchar(255)') AS CreationName
,   x.vals.value('@DTS:ObjectName', 'Varchar(255)') AS ObjectName
,   y.vals.value('@DTS:ConnectionString', 'Varchar(255)') AS ConnectionString
,   x.vals.value('@DTS:ColumnType', 'Varchar(255)') AS ColumnType
,   x.vals.value('@DTS:MaximumWidth', 'Varchar(255)') AS MaximumWidth
from @XML.nodes('/DTS:ConnectionManager/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn') AS x(vals)
    CROSS APPLY @XML.nodes('/DTS:ConnectionManager/DTS:ObjectData/DTS:ConnectionManager') AS y(vals) 

5 Comments

Thanks for that, I did not know you could mix the two. You learn something every day.
Thanks for the enhancement. I tried modifying the last line to CROSS APPLY @XML.nodes('/DTS:Executable/DTS:ConnectionManager/DTS:ObjectData/DTS:ConnectionManager') AS y(vals) but still doesn't give any output. :(
Please poste your new XML
Updated the original post. Thx
Ahh, got it! I had to use the (newly listed) namespace. thanks guys!
0

Of course, you can also use a general-purpose language to convert XML content to tabular format for database import or delimited file export.

Great as it is, SQL is a special-purpose language and not as fluid or dynamic as Java, C#, Python, PHP, Perl, VB, and still others which also carry libraries to run XPath, XSLT, and other XML-specific tasks. Additionally, these languages can connect to any database to retrieve BLOB data.

For future readers, below are open-source examples using the OP's data needs. You will notice use of the positional brackets with [] in xpaths, allowing more DTS:ConnectionManager elements:

Python (using lxml library)

import os
import lxml.etree as ET

cd = os.path.dirname(os.path.abspath(__file__))

xmlfile = 'DTSfile.xml'
dom = ET.parse(os.path.join(cd, xmlfile))
root = dom.getroot()

nodexpath = dom.xpath("//DTS:ConnectionManager", namespaces=root.nsmap)
dataline = []

def checkPath(xpathstr):    
    if dom.xpath(xpathstr, namespaces=root.nsmap) == []:
        return ''
    else:
        return dom.xpath(xpathstr, namespaces=root.nsmap)[0]

for i in range(1,len(nodexpath)+1):
    if i % 2 == 0: continue

    dataline = []    
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/@DTS:CreationName'.format(i)))
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/@DTS:ObjectName'.format(i)))
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/DTS:ObjectData/DTS:ConnectionManager/@DTS:ConnectionString'.format(i)))    
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:MaximumWidth'.format(i)))
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:DataType'.format(i)))
    dataline.append(checkPath('//DTS:ConnectionManager[{0}]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:ObjectName'.format(i)))

    print(dataline)

['FLATFILE', 'FF', 'Test.csv', '', '11', 'TestCN']
['FLATFILE', 'FF2', 'Test2.csv', '', '11', 'TestCN2']

PHP (using simple_xml object)

$cd = dirname(__FILE__);

$xml = simplexml_load_file($cd.'/DTSfile.xml');
$xml->registerXPathNamespace('DTS', 'www.microsoft.com/SqlServer/Dts');

$values = [];
$node = $xml->xpath('//DTS:ConnectionManager');

function checkPath($xml, $xpathstr) {         
     $path = $xml->xpath($xpathstr);
     foreach ($path as $value) {
      if (count($path) > 0) {
           foreach($path as $value) {           
           return $value;         
           }
      }   
      else {
           return '';
      }       
     }         
}
$i = 1;

foreach ($node as $n){         
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/@DTS:CreationName');
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/@DTS:ObjectName');
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/DTS:ObjectData/DTS:ConnectionManager/@DTS:ConnectionString');    
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:MaximumWidth');
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:DataType');
     $values[] = checkPath($xml, '//DTS:ConnectionManager['.$i.']/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:ObjectName');     

     if ($values[1] != "") {
      echo implode(", ", $values)."\n";
     }
     $i++;
     $values = [];         
}

FLATFILE, FF, Test.csv, , 11, TestCN
FLATFILE, FF2, Test2.csv, , 11, TestCN2

R (using xml library)

library(XML) 

setwd("C:\\Path\\To\\Working\\Directory")

doc<-xmlParse("DTSfile.xml")

nodes <- as.list(xpathSApply(doc, '//DTS:ConnectionManager'))

checkPath <- function (xpathstr) {
  if (length(as.list(xpathSApply(doc, xpathstr))) > 0) {
    return(as.list(xpathSApply(doc, xpathstr)))
  } else {
    return("")
  }
}

for (i in (1:length(nodes))) {  
  if (i %% 2) {
    data <- checkPath(sprintf('//DTS:ConnectionManager[%d]/@DTS:CreationName', i))
    data <- append(data, checkPath(sprintf('//DTS:ConnectionManager[%d]/@DTS:ObjectName', i)))
    data <- append(data, checkPath(sprintf('//DTS:ConnectionManager[%d]/DTS:ObjectData/DTS:ConnectionManager/@DTS:ConnectionString', i)))
    data <- append(data, checkPath(sprintf('//DTS:ConnectionManager[%d]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:MaximumWidth', i)))
    data <- append(data, checkPath(sprintf('//DTS:ConnectionManager[%d]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:DataType', i)))
    data <- append(data, checkPath(sprintf('//DTS:ConnectionManager[%d]/DTS:ObjectData/DTS:ConnectionManager/DTS:FlatFileColumns/DTS:FlatFileColumn/@DTS:ObjectName', i)))

    print(paste(data, collapse = ', '))
  }

}

[1] "FLATFILE, FF, Test.csv, , 11, TestCN"
[1] "FLATFILE, FF2, Test2.csv, , 11, TestCN2"

1 Comment

Nice overview! One reason why I prefer pure SQL in many cases is: You can use the same functionality for (naked) reporting and for your application.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.