I have the following text field in a SQL Server table:

1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0
  1. I would like to retrieve only the part before the exclamation mark (!). So for 1!1 I only want 1, for 3!0 I only want 3, and for 23!0 I only want 23.

  2. I would also like to retrieve only the part after the exclamation mark (!). So for 1!1 I only want 1, for 3!0 I only want 0, and for 23!0 I only want 0.

Both point 1 and point 2 should be inserted into separate columns of a SQL Server table.

4
  • 3
    You shouldn't be storing delimited values in a single column in the first place. Commented Jan 10, 2013 at 15:14
  • 2
    Is that entire string a single record, or is 1!1 a record, 3!0 another record, and so on? Commented Jan 10, 2013 at 15:15
  • 1
    Please, spend the time to normalize this. Commented Jan 10, 2013 at 15:15
  • I have a question: Do people also use the wrong end of the hammer to hit the nails and wonder why it is inefficient? Or is it just the DB topic that brings out this phenomenon? Commented Jan 10, 2013 at 15:17

3 Answers

1

I LOVE SQL Server's XML capabilities. It is a great way to parse data. Try this one out:

--Load the original string
DECLARE @string nvarchar(max) = '1!2,3!4,5!6,7!8,9!10';

--Turn it into XML
SET @string = REPLACE(@string,',','</SecondNumber></Pair><Pair><FirstNumber>') + '</SecondNumber></Pair>';
SET @string = '<Pair><FirstNumber>' + REPLACE(@string,'!','</FirstNumber><SecondNumber>');

--Show the new version of the string
SELECT @string AS XmlIfiedString;

--Load it into an XML variable
DECLARE @xml XML = @string;

--Now, First and Second Number from each pair...
SELECT
  Pairs.Pair.value('FirstNumber[1]','nvarchar(1024)') AS FirstNumber,
  Pairs.Pair.value('SecondNumber[1]','nvarchar(1024)') AS SecondNumber
FROM @xml.nodes('//*:Pair') Pairs(Pair);

The above query turned the string into XML like this:

<Pair><FirstNumber>1</FirstNumber><SecondNumber>2</SecondNumber></Pair> ...

Then parsed it to return a result like:

FirstNumber | SecondNumber
----------- | ------------
          1 |            2
          3 |            4
          5 |            6
          7 |            8
          9 |           10
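
The question also asks for the two parts to land in separate columns of a table. A minimal sketch of that step, assuming a hypothetical target (the table variable @Pairs and its int columns are not from the original post) and assuming the values contain no XML special characters such as & or <:

```sql
-- Hypothetical target table; the name and column types are assumptions.
DECLARE @Pairs TABLE (FirstNumber int NOT NULL, SecondNumber int NOT NULL);

DECLARE @string nvarchar(max) = N'1!2,3!4,5!6,7!8,9!10';

-- Same XML conversion as above, folded into a single expression.
DECLARE @xml xml =
    N'<Pair><FirstNumber>'
    + REPLACE(REPLACE(@string, N'!', N'</FirstNumber><SecondNumber>'),
              N',', N'</SecondNumber></Pair><Pair><FirstNumber>')
    + N'</SecondNumber></Pair>';

-- Feed the parsed pairs straight into the target columns.
INSERT INTO @Pairs (FirstNumber, SecondNumber)
SELECT Pairs.Pair.value('(FirstNumber/text())[1]', 'int'),
       Pairs.Pair.value('(SecondNumber/text())[1]', 'int')
FROM @xml.nodes('/Pair') AS Pairs(Pair);

SELECT FirstNumber, SecondNumber FROM @Pairs;
```

Swapping the table variable for a real table only changes the INSERT target. If the values are not guaranteed to be numeric, keeping the nvarchar type from the original SELECT avoids conversion errors.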

0

I completely agree with the people complaining about this sort of data. The fact, however, is that we often have no control over the format of our sources.

Here's my approach...

First you need a tokeniser. This one is very efficient (probably the fastest non-CLR). Found at http://www.sqlservercentral.com/articles/Tally+Table/72993/

CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
        (@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE!  IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
     -- enough to cover VARCHAR(8000)
  WITH E1(N) AS (
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
                ),                          --10^1 or 10 rows
       E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10^2 or 100 rows
       E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10^4 or 10,000 rows max
 cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                     -- for both a performance gain and prevention of accidental "overruns"
                 SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
                ),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
                 SELECT 1 UNION ALL
                 SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
                ),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
                 SELECT s.N1,
                        ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
                   FROM cteStart s
                )
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
 SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
        Item       = SUBSTRING(@pString, l.N1, l.L1)
   FROM cteLen l
;
GO

Then you consume it like so...

DECLARE @Wtf VARCHAR(1000) = '1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0';

-- Split on commas first, then cut each "a!b" item at the exclamation mark
SELECT   LEFT(Item, CHARINDEX('!', Item)-1)           AS BeforeMark
        ,RIGHT(Item, CHARINDEX('!', REVERSE(Item))-1) AS AfterMark
FROM [dbo].[DelimitedSplit8K](@Wtf, ',');

The function posted and the parsing logic can, of course, be integrated into a single function.
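
One way that integration might look is a thin inline wrapper around the splitter. This is only a sketch: the name dbo.SplitPairs8K, its parameters, and its output column names are assumptions, and it relies on dbo.DelimitedSplit8K from above already existing.

```sql
-- Hypothetical wrapper; the function name, parameters and column names are assumptions.
CREATE FUNCTION dbo.SplitPairs8K
        (@pString VARCHAR(8000), @pPairDelimiter CHAR(1), @pValueDelimiter CHAR(1))
RETURNS TABLE AS
 RETURN
 SELECT s.ItemNumber,
        LeftPart  = LEFT(s.Item, p.Pos - 1),          -- text before the value delimiter
        RightPart = SUBSTRING(s.Item, p.Pos + 1, 8000) -- text after the first value delimiter
   FROM dbo.DelimitedSplit8K(@pString, @pPairDelimiter) s
  CROSS APPLY (--==== NULLIF turns "not found" (0) into NULL so LEFT never sees a negative length
               SELECT Pos = NULLIF(CHARINDEX(@pValueDelimiter, s.Item), 0)
              ) p
  WHERE p.Pos IS NOT NULL -- skip items that contain no value delimiter
;
GO

SELECT LeftPart, RightPart
FROM dbo.SplitPairs8K('1!1,3!0,23!0', ',', '!');
```

Unlike the RIGHT/REVERSE expression above, this sketch cuts at the first delimiter rather than the last. The two behave the same when each item contains exactly one exclamation mark.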

0

I agree that normalizing the data is the best way. However, here is an XML solution to parse the data:

DECLARE @str VARCHAR(1000) = '1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0'
    ,@xml XML

SET @xml  = CAST('<row><col>' + REPLACE(REPLACE(@str,'!','</col><col>'),',','</col></row><row><col>') + '</col></row>' AS XML)

SELECT  
     line.col.value('col[1]', 'varchar(1000)') AS col1
    ,line.col.value('col[2]', 'varchar(1000)') AS col2
FROM    @xml.nodes('/row') AS line(col)
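
Since the question describes the string as a field in a table rather than a variable, the same conversion can be applied per row with CROSS APPLY. A rough sketch, assuming a hypothetical source table (the table variable @Source, its columns, and the sample rows are made up for illustration) and values free of XML special characters:

```sql
-- Hypothetical source table; the name, columns and sample rows are assumptions.
DECLARE @Source TABLE (Id int PRIMARY KEY, Packed varchar(1000));
INSERT INTO @Source (Id, Packed) VALUES (1, '1!1,3!0,23!0'), (2, '2!1,7!2');

SELECT  s.Id
       ,line.col.value('col[1]', 'varchar(1000)') AS col1
       ,line.col.value('col[2]', 'varchar(1000)') AS col2
FROM    @Source AS s
-- Build the XML once per row, using the same REPLACE trick as above
CROSS APPLY (SELECT CAST('<row><col>' + REPLACE(REPLACE(s.Packed,'!','</col><col>'),',','</col></row><row><col>') + '</col></row>' AS XML) AS x) AS conv
-- Shred each row's XML into one output row per pair
CROSS APPLY conv.x.nodes('/row') AS line(col);
```

Prefixing the SELECT with an INSERT INTO a target table would store col1 and col2 in separate columns, as the question asks.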
