Delete nodes from XML file in SQL Server

Question

I have an XML file, exported from Excel, that is stored in an XML column in a table in SQL Server. The Excel file had many worksheets, and I only want to store a few of them. Is there a way to delete nodes that don't have the name I want to keep, similar to NOT IN?

The XML file has this header:

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">

This is the basic structure of the XML:

<Workbook>
  <DocumentProperties>
  </DocumentProperties>
  <ExcelWorkbook>
  </ExcelWorkbook>
  <Styles>
    <Style>
    </Style>
  </Styles>
  <Worksheet ss:Name="Worksheet1">
    <Table>
      <Column.../>
      <Column.../>
      <Column.../>
      <Row>
        <Cell.../>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
        <Cell.../>
      </Row>
      ...
    </Table>
  </Worksheet>

and the <Worksheet> node is repeated several times, once per worksheet. Let's say I only want to keep "Worksheet3", "Worksheet4", and "Worksheet8". How could I do that? I think I would want to use UPDATE with modify() in this case, similar to this question. The only difference is that I want to keep certain values and delete the rest.

Gottfried Lesigang · Accepted Answer · 2015-10-23 16:05:40Z

Try it like this:

DECLARE @xml XML=
'<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
  <DocumentProperties>
  </DocumentProperties>
  <ExcelWorkbook>
  </ExcelWorkbook>
  <Styles>
    <Style>
    </Style>
  </Styles>
  <Worksheet ss:Name="Worksheet1">
    <Table>
      <Column/>
      <Column/>
      <Column/>
      <Row>
        <Cell/>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
  <Worksheet ss:Name="Worksheet2">
    <Table>
      <Column/>
      <Column/>
      <Column/>
      <Row>
        <Cell/>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
  <Worksheet ss:Name="Worksheet3">
    <Table>
      <Column/>
      <Column/>
      <Column/>
      <Row>
        <Cell/>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
        <Cell><Data>...</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>';

SET @xml.modify('declare namespace dflt="urn:schemas-microsoft-com:office:spreadsheet";
                 declare namespace ss="urn:schemas-microsoft-com:office:spreadsheet"; 
                 delete /dflt:Workbook/dflt:Worksheet[@ss:Name="Worksheet2"]');

SELECT @xml;

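To delete several worksheets in one statement, combine the name tests inside the predicate with "or", for example [@ss:Name="Worksheet1" or @ss:Name="Worksheet2"]. To keep a named set and delete everything else, invert the tests and join them with "and", for example [@ss:Name!="Worksheet3" and @ss:Name!="Worksheet4"].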
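
For the table case described in the question, the same modify() call can sit inside an UPDATE. The following is a minimal sketch, not a drop-in script: the table variable @ExcelImports and its column Doc are hypothetical stand-ins for the real table and XML column, and the workbook is cut down to empty worksheets. It keeps "Worksheet3", "Worksheet4" and "Worksheet8" and deletes the rest.

-- Hypothetical stand-in for the real table; replace with your own table and XML column.
DECLARE @ExcelImports TABLE (Id INT IDENTITY PRIMARY KEY, Doc XML NOT NULL);

-- Cut-down sample workbook with five empty worksheets.
INSERT INTO @ExcelImports (Doc)
VALUES (N'<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Worksheet1"/>
  <Worksheet ss:Name="Worksheet3"/>
  <Worksheet ss:Name="Worksheet4"/>
  <Worksheet ss:Name="Worksheet5"/>
  <Worksheet ss:Name="Worksheet8"/>
</Workbook>');

-- Delete every worksheet whose name differs from all three names on the keep list.
UPDATE @ExcelImports
SET Doc.modify('declare namespace dflt="urn:schemas-microsoft-com:office:spreadsheet";
                declare namespace ss="urn:schemas-microsoft-com:office:spreadsheet";
                delete /dflt:Workbook/dflt:Worksheet[@ss:Name!="Worksheet3"
                                                 and @ss:Name!="Worksheet4"
                                                 and @ss:Name!="Worksheet8"]');

-- Inspect the result; only the kept worksheets should remain.
SELECT Doc FROM @ExcelImports;

Against a real table, a WHERE clause on the UPDATE limits which rows are rewritten.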

edited Oct 23, 2015 at 16:05

answered Oct 23, 2015 at 15:33


5 Comments

Sam CD Over a year ago

The issue here is that I am actually trying to reduce storage space by reducing the size of the XML file, so I actually want to delete those nodes

Sam CD Over a year ago

Yes, something like that, but I am wondering if XQuery has any sort of not in clause, so I can delete /dflt:Workbook/dflt:Worksheet[@ss:Name not in "Worksheet2","Worksheet3","Worksheet4"]

Gottfried Lesigang Over a year ago

@Samcd, deleting /dflt:Workbook/dflt:Worksheet[@ss:Name!="Worksheet1" and @ss:Name!="Worksheet3"] would keep worksheets 1 and 3. There's no "in" clause... Please vote up and mark as accepted if this solves your problem, thanks!

Gottfried Lesigang Over a year ago

@Samcd, I'm curious... Could this solve your problem?

Sam CD Over a year ago

Yes this got it to work! Thanks. I edited your comments into the answer.
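
On the NOT IN question raised in the comments: XQuery's general comparison "=" is true when any item on its left equals any item on its right, so wrapping it in not() can act as a NOT IN test. The sketch below uses a cut-down workbook and assumes that the SQL Server version in use accepts a literal string sequence on the right-hand side of the comparison, so it is worth verifying on your instance before relying on it.

-- Cut-down sample workbook; the real document would carry the full Excel content.
DECLARE @xml XML = N'<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Worksheet1"/>
  <Worksheet ss:Name="Worksheet3"/>
  <Worksheet ss:Name="Worksheet4"/>
  <Worksheet ss:Name="Worksheet8"/>
</Workbook>';

-- not(@ss:Name = (...)) selects every worksheet whose name is not in the list.
SET @xml.modify('declare namespace dflt="urn:schemas-microsoft-com:office:spreadsheet";
                 declare namespace ss="urn:schemas-microsoft-com:office:spreadsheet";
                 delete /dflt:Workbook/dflt:Worksheet[not(@ss:Name = ("Worksheet3","Worksheet4","Worksheet8"))]');

SELECT @xml;

One difference from the chained "!=" form: a Worksheet element with no ss:Name attribute at all is deleted by the not() form, but kept by the "!=" form, because a comparison against a missing attribute is false.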
