7

I have this query:

SELECT [content_id]
      ,[content_html]
      ,[date_created]
      ,folder_id
  FROM content ct 

where folder_id=126
order by content_title
  FOR XML PATH('PressRelease'), ROOT ('PressReleases')

When I run this query this is the XML file that is generated:

<PressReleases>
  <PressRelease>
    <content_id>6442452927</content_id>
    <content_html>&lt;root&gt;&lt;Date&gt;2015-12-02&lt;/Date&gt;
        &lt;Description&gt;&lt;p class="customHeader"&gt;jobs to Philadelphia.&lt;/p&gt;
        &lt;p&gt;mtext in here.&lt;/p&gt;
        &lt;p&gt;mtext in here.&lt;/p&gt;
        &lt;/Description&gt;
        &lt;SEO&gt;&lt;h1&gt;Pennsylvania Location&lt;/h1&gt;
        &lt;div class="bulletRightBar"&gt; The move was made possible in part by the Philadelphia Jobs 
        Credit&lt;/div&gt;
&lt;/SEO&gt;
&lt;/root&gt;</content_html>
    <date_created>2015-12-02T09:47:12</date_created>
    <folder_id>126</folder_id>
  </PressRelease>
  <PressReleases>

What I need is this XML file:

<PressReleases>
  <PressRelease>
    <content_id>6442452927</content_id>
    <content_html><root><Date>2015-12-02</Date>
        <Description>&lt;p class="customHeader"&gt;jobs to Philadelphia.&lt;/p&gt;
        &lt;p&gt;mtext in here.&lt;/p&gt;
        &lt;p&gt;mtext in here.&lt;/p&gt;
        </Description>
        <SEO>&lt;h1&gt;Pennsylvania Location&lt;/h1&gt;
        &lt;div class="bulletRightBar"&gt; The move was made possible in part by the Philadelphia Jobs 
        Credit&lt;/div&gt;
</SEO>
</root></content_html>
    <date_created>2015-12-02T09:47:12</date_created>
    <folder_id>126</folder_id>
  </PressRelease>
  <PressReleases>

Inside the <content_html> I want to make <root> <date> and <Description> as XML elements but leave the rest as encoded html.

Here's a screenshot for the sql resultenter image description here

7
  • Please state some more details: DataType of "content_html", how is the data within this column before you put it in the XML? Could you please set up a SQL Fiddle or past some realistic sample data? Commented Feb 22, 2016 at 16:37
  • @Shnugo I added a SC for the sql result Commented Feb 22, 2016 at 16:42
  • Looks like you're asking for trouble :) Commented Feb 22, 2016 at 16:53
  • @JoePhilllips Why is that? Commented Feb 22, 2016 at 16:54
  • Did you want to leave </Description> and <SEO> as encoded HTML or convert them as well Commented Feb 22, 2016 at 18:39

4 Answers 4

2

It's not pretty, but you could cast the XML field as a string, use the REPLACE function and cast it back to XML like below. You might want to create a function that does this as the line would get with a lot of replacing:

SELECT [content_id]
      ,cast(REPLACE(cast([content_html] as varchar(max)),'&lt;root&gt;','<root>') as xml)
      ,[date_created]
      ,folder_id
  FROM content ct 

where folder_id=126
order by content_title
  FOR XML PATH('PressRelease'), ROOT ('PressReleases')

or here is a way to call it with a function

CREATE FUNCTION [dbo].[XML_Replace]
        (@XML_Field XML)
RETURNS XML 

BEGIN
 DECLARE @xml varchar(max) 
 DECLARE @xml_Mid varchar(max)
 DECLARE @strtBigInt bigint
 , @endBigInt bigint

 SET @xml = cast(@XML_Field as varchar(max))
 SET @strtBigInt = CHARINDEX('&lt;Description&gt;',@xml)
 SET @endBigInt = CHARINDEX('&lt;/SEO&gt;',@xml)

 SET @xml_Mid = SUBSTRING(@xml, @strtBigInt+19,@endBigInt-@strtBigInt-19);

 RETURN(cast(REPLACE(REPLACE(REPLACE(REPLACE(substring(@xml,0,@strtBigInt+19),'&lt;','<'),'&gt;','>') +    @xml_Mid +  REPLACE(REPLACE(substring(@xml,@endBigInt,Len(@xml)),'&lt;','<'),'&gt;','>'),'&lt;/Description&gt;','</Description>'),'&lt;SEO&gt;','<SEO>') as xml));
END

Then use the funtion in your code:

SELECT [content_id]
      ,dbo.XML_Replace([content_html]) as content_html
      ,[date_created]
      ,folder_id
  FROM content ct 

where folder_id=126
order by content_title
  FOR XML PATH('PressRelease'), ROOT ('PressReleases')
Sign up to request clarification or add additional context in comments.

7 Comments

The first method, you would have to replace each individual tag with the corrected XML tags. The function i just added looks for the beginning and ending Description tags and updates all < and > before and after them respectively
If you don't want your SEO updated to XML tags, you would just need to edit the function a little bit for @endBigInt to look for the SEO closing tag and then do a separate Replace on the @xml_Mid string to replace &lt;/Description&gt; with </Description>
The function didn't work for you? What is wrong with the output from what you expect?
it converts decode everything for me to nodes , which is not what i want ... there is some html tags that i dont want to encode
'XML_Replace' is not a recognized built-in function name.
|
2
+25

It looks like [content_html] is stored as string (perhaps nvarchar?). It's contents looks like a well-formed XML. If that is the case it can be CAST to XML to get properly incorporated into result set:

-- Test Data Set
WITH content AS (
    SELECT
        6442452927 AS [content_id],
        N'<root><Date>2015-12-02</Date>
        <Description><p class="customHeader">jobs to Philadelphia.</p>
        <p>mtext in here.</p>
        <p>mtext in here.</p>
        </Description>
        <SEO><h1>Pennsylvania Location</h1>
        <div class="bulletRightBar"> The move was made possible in part by the Philadelphia Jobs 
        Credit</div>
</SEO>
</root>' AS [content_html],
        CAST('2015-12-02T09:47:12' AS DATETIME2) AS [date_created],
        126 AS [folder_id],
        1 AS [content_title]
)
-- Query
SELECT [content_id]
        ,CAST((Select 
          CAST([content_html] AS XML).query('/root/Date/node()') AS [Date]
          ,CAST(CAST([content_html] AS XML).query('/root/Description/node()') AS nvarchar(max)) As [Description]
          ,CAST(CAST([content_html] AS XML).query('/root/SEO/node()') AS nvarchar(max)) AS [SEO]
          FOR XML PATH('root'), ROOT ('content_html')) As XML)
      ,[date_created]
      ,[folder_id]
  FROM content ct
WHERE folder_id=126
ORDER BY content_title
FOR XML PATH('PressRelease'), ROOT ('PressReleases')

2 Comments

That will decode everything inside my query to HTML tags (<>) which not what I want
@Alundrathedreamwalker That can be fixed with some more CASTs.
2

The tricky thing about this is that your content_html field contains neither XML or HTML. It's a very strange way to encode the data. If possible you might want to convert the data in the field so that it is XML, which would make this query very easy. For example, you would convert the field in the data you provided to:

<root>
  <Date>2015-12-02</Date>
  <Description>&lt;p class="customHeader"&gt;jobs to Philadelphia.&lt;/p&gt;
        &lt;p&gt;mtext in here.&lt;/p&gt;
        &lt;p&gt;mtext in here.&lt;/p&gt;
        </Description>
  <SEO>&lt;h1&gt;Pennsylvania Location&lt;/h1&gt;
        &lt;div class="bulletRightBar"&gt; The move was made possible in part by the Philadelphia Jobs 
        Credit&lt;/div&gt;
</SEO>
</root>

If you can't change the way the data is stored, one way to get at it is to do some converting back and forth using the code below. The sub-query creates a field called [content_xml] that converts your data in the content_html field to an XML data type. This assumes that all of your data can be converted to valid XML. Then for each node under root in the XML, it rebuilds the node and wraps the data with CDATA so that it preserves the formatting you desire.

--set up test data
create table #content
    ([content_id] bigint
    ,[content_html] varchar(max)
    ,[date_created] datetime
    ,folder_id int
    , content_title varchar(50)
    )

insert into #content
values (
6442452927, 
'&lt;root&gt;&lt;Date&gt;2015-12-02&lt;/Date&gt;
        &lt;Description&gt;&lt;p class="customHeader"&gt;jobs to Philadelphia.&lt;/p&gt;
        &lt;p&gt;mtext in here.&lt;/p&gt;
        &lt;p&gt;mtext in here.&lt;/p&gt;
        &lt;/Description&gt;
        &lt;SEO&gt;&lt;h1&gt;Pennsylvania Location&lt;/h1&gt;
        &lt;div class="bulletRightBar"&gt; The move was made possible in part by the Philadelphia Jobs 
        Credit&lt;/div&gt;
&lt;/SEO&gt;
&lt;/root&gt;', 
'2015-12-02T09:47:12', 
126, 
'Content Title')

--Output
select [content_id],
    (
    SELECT
        CONVERT(xml,
        '<' + convert(varchar(max),T.c.query('local-name(.)')) + '>' 
        + (select convert(xml, '<![CDATA[' + convert(varchar(max),T.c.query('node()')) + ']]>') for xml path(''))
        + '</' + convert(varchar(max),T.c.query('local-name(.)')) + '>'
        )
    FROM content_xml.nodes('/root/*') T(c)
    for xml path(''), type
    ) as [content_html/root]
    ,[date_created]
    ,folder_id
from
    (
    SELECT *, 
        convert(xml,convert(xml,content_html).value('.','varchar(max)')) [content_xml]
    FROM #content ct 
    ) AllData
order by content_title
FOR XML PATH('PressRelease'), ROOT ('PressReleases')

1 Comment

Did this work for you? It does what you said you needed in your test case.
1

Plz try this one

SELECT [content_id]
      ,CONVERT(XML,REPLACE(REPLACE([content_html],'&lt;','<'),'&gt;','>')) AS [content_html]
      ,[date_created]
      ,folder_id
FROM content ct 
WHERE folder_id=126
ORDER BY content_title
FOR XML PATH('PressRelease'), ROOT ('PressReleases')

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.