2

From a prior question, this SQL function extracts strings following a set string.

so,

extractAfter("hello this is a Test:12356.jpg reset of line","Test:")

returns: 12356.jpg

It works, but I need to tweak it to consider two other things:

  • in addition to whitespaces the string should end when "<" is encounter
  • that if no match, I'd like to return null

so,

extractAfter("hello this is a Test:12356.jpg<br> reset of line","Test:")

would also return:12356.jpg

create function dbo.extractAfter(@full nvarchar(max), @part nvarchar(max)) 
returns nvarchar(max) with returns null on null input as 
begin 
    return ltrim(stuff(left(@full,charindex(' ', @full + ' ', charindex(@part,@full)+1)),     1, charindex(@part,@full)+datalength(@part)/2 -1, '')) 
end 
go 

3 Answers 3

2
CREATE FUNCTION dbo.extractAfter (@full nvarchar(max), @part nvarchar(max))
RETURNS nvarchar(max)
WITH RETURNS NULL ON NULL INPUT
AS BEGIN
  RETURN (
    SELECT LEFT(p, PATINDEX('%[ <]%', p + ' ') - 1)
    FROM (SELECT
      p = STUFF(@full, 1, NULLIF(CHARINDEX(@part, @full), 0) + LEN(@part) - 1, '')
    ) s
  )
END
Sign up to request clarification or add additional context in comments.

1 Comment

This is a very useful answer
2

Rather than try to keep tweaking a string parsing function when T-SQL really does not specialize in text-parsing, I would recommend using Regular Expressions if you are on SQL Server 2005 or newer. You can find many examples on the internet of the exact SQLCLR code for these, or you can download the Free version of the SQL# library (which I wrote) and start using them right away. Here are some examples using your specific situation:

SELECT SQL#.RegEx_CaptureGroup('hello this is a Test:12356.jpg<br> reset of line',  
           'Test:([^ ]+)<br>', 1, NULL, 1, -1, '')
-- 12356.jpg

SELECT SQL#.RegEx_CaptureGroup('hello this is a Test:<br> reset of line',  
           'Test:([^ ]+)<br>', 1, NULL, 1, -1, '')
-- NULL

SELECT SQL#.RegEx_CaptureGroup('hello this is a T:12356.jpg<br> reset of line',  
           'Test:([^ ]+)<br>', 1, NULL, 1, -1, '')
-- NULL

In each case, the pattern match starts with the word "Test:" then grabs any non-space characters until it reaches a "<br>" (or a space since it is looking for non-space). If not found due to nothing being between the "Test:" and the "<br>" or if the "Test:" isn't there to begin with, it returns NULL.

1 Comment

Thank you for your solution. I agree I should use CLR. It's going to be more powerful for sure. I'm saving this to go back to it. right now our DBA is on vacation I believe I would need added rights on the server to set this up.
0
Declare @TestString varchar(max)
Declare @TestSearch varchar(max)

Set @TestString = 'hello this is a Test:12356.jpg<br> reset of line'
Set @TestSearch = 'Test:'

Select Case
        When CharIndex( @TestSearch, @TestSTring ) = 0 Then ''
        When Left(Reverse( Z.Value ), Len(Z.Value)) = Reverse( Z.Value ) Then Z.Value
        Else Substring( Value, 1, PatIndex( '%[<> ]%', Value ) - 1 )
        End
From ( Select Substring( @TestString
                , CharIndex( @TestSearch, @TestSTring ) + Len(@TestSearch)
                , Len(@TestString )
    ) As Value ) As Z

I have slightly modified my solution to account for the scenario where the search string happens to be at the end of the input string.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.