I have a user-defined function in Microsoft SQL Server, and I have a problem getting it to work correctly.

The function takes one NVARCHAR(MAX) parameter, strips all the HTML tags from that text, and then removes the extra whitespace using a cursor:

CREATE FUNCTION UDF_STRIP_HTML 
     (@HTMLText NVARCHAR(MAX))
RETURNS NVARCHAR(MAX) AS
BEGIN
    DECLARE @Start INT
    DECLARE @End INT
    DECLARE @Length INT

    SET @Start = CHARINDEX('<',@HTMLText)
    SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
    SET @Length = (@End - @Start) + 1

    WHILE @Start > 0 AND @End > 0 AND @Length > 0
    BEGIN
        SET @HTMLText = STUFF(@HTMLText, @Start, @Length, '')
        SET @Start = CHARINDEX('<', @HTMLText)
        SET @End = CHARINDEX('>', @HTMLText, CHARINDEX('<', @HTMLText))
        SET @Length = (@End - @Start) + 1
    END

    DECLARE @WORD VARCHAR(MAX)
    DECLARE @STRING VARCHAR(MAX)
    DECLARE AUDIT_TRAIL_CURSOR CURSOR 
    FOR 
        --Split string into individual words and ignore blanks or extra spaces
        SELECT VALUE 
        FROM STRING_SPLIT(@HTMLText, ' ') 
        WHERE LTRIM(RTRIM(VALUE)) <> ''

    OPEN AUDIT_TRAIL_CURSOR

    FETCH NEXT FROM AUDIT_TRAIL_CURSOR INTO @WORD

    WHILE @@FETCH_STATUS = 0
    BEGIN 
        --Strip extra spaces from each word and add only one space after
        SET @STRING = CONCAT(@STRING, RTRIM(LTRIM(@WORD)), ' ')

        FETCH NEXT FROM AUDIT_TRAIL_CURSOR INTO @WORD
    END

    CLOSE AUDIT_TRAIL_CURSOR
    DEALLOCATE AUDIT_TRAIL_CURSOR

    RETURN @STRING
END
GO

The HTML stripper works fine either way, but the whitespace stripper only works if I pass in a hard-coded string directly.

This works:

SELECT dbo.UDF_STRIP_HTML('Some <b/>      string   with        Words.   ') 

HTML stripper works, but whitespace stripper does not:

SELECT dbo.UDF_STRIP_HTML(some_column) 
FROM some_table 

NOTE: I understand this may not be the best function, but it will only be used for a one-time data export via query.

  • What is your SQL Server version? Commented Feb 5, 2021 at 19:05
  • It was stripping out the HTML, but it was not getting rid of the whitespace when referencing a column. SELECT dbo.UDF_STRIP_HTML('Hard coded <b/> string works') returns 'Hard coded string works', but SELECT dbo.UDF_STRIP_HTML(column) FROM table, where column is ' The Hard coded <b/> string works', returns 'Hard coded string works' with the whitespace still there. It should also return 'Hard coded string works'. @Yitzhak Khabinsky gave me a solution that works correctly. Commented Feb 5, 2021 at 19:43

1 Answer

XML Schema (XSD) defines many useful data types. One of them, xs:token, is very handy for your scenario.

Here is what it does:

  1. All invisible TAB, Carriage Return, and Line Feed characters will be replaced with spaces.
  2. Then leading and trailing spaces are removed from the value.
  3. Further, contiguous occurrences of more than one space will be replaced with a single space.
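The three steps above can be tried on a small sample. This is only a sketch: it assumes the column data contains tabs or line breaks rather than plain spaces alone (the question does not confirm this), and the variable names @Raw and @Clean are made up for illustration.

```sql
-- Hypothetical sample: words separated by a tab, a CR/LF pair, and runs of spaces.
DECLARE @Raw NVARCHAR(MAX) =
    N'  Some' + NCHAR(9) + N'string' + NCHAR(13) + NCHAR(10) + N'with    Words.  ';

-- Wrap the text in a CDATA section so that markup characters are not parsed as XML,
-- then cast the text node to xs:token to normalize the whitespace.
DECLARE @Clean NVARCHAR(MAX) =
    TRY_CAST(N'<r><![CDATA[' + @Raw + N']]></r>' AS XML).value(
        '(/r/text())[1] cast as xs:token?', 'NVARCHAR(MAX)');

SELECT @Clean AS CleanText;
```

On this sample the result should be 'Some string with Words.'. Splitting on a single space character, as STRING_SPLIT(@HTMLText, ' ') does, would leave the tab and the line break inside the words, which may be why the cursor version behaves differently on column data than on a hard-coded string.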

You can replace this entire chunk:

    DECLARE @WORD VARCHAR(MAX)
    DECLARE @STRING VARCHAR(MAX)
    DECLARE AUDIT_TRAIL_CURSOR CURSOR 
    FOR 
        --Split string into individual words and ignore blanks or extra spaces
        SELECT VALUE FROM STRING_SPLIT(@HTMLText, ' ') WHERE LTRIM(RTRIM(VALUE)) <> ''
    OPEN AUDIT_TRAIL_CURSOR
    FETCH NEXT FROM AUDIT_TRAIL_CURSOR INTO @WORD
    WHILE @@FETCH_STATUS = 0
    BEGIN 
        --Strip extra spaces from each word and add only one space after
        SET @STRING = CONCAT(@STRING, RTRIM(LTRIM(@WORD)), ' ')
        FETCH NEXT FROM AUDIT_TRAIL_CURSOR INTO @WORD
    END
    CLOSE AUDIT_TRAIL_CURSOR
    DEALLOCATE AUDIT_TRAIL_CURSOR
    RETURN @STRING

With the following:

RETURN TRY_CAST('<r><![CDATA[' + @HTMLText + ']]></r>' AS XML).value('(/r/text())[1] cast as xs:token?','NVARCHAR(MAX)');
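For reference, here is a sketch of how the whole function might look after the replacement. The tag-stripping loop is copied from the question unchanged. CREATE OR ALTER is my assumption for convenience and needs SQL Server 2016 SP1 or later; on an older build, use ALTER FUNCTION (or drop and re-create the function) instead.

```sql
CREATE OR ALTER FUNCTION dbo.UDF_STRIP_HTML
     (@HTMLText NVARCHAR(MAX))
RETURNS NVARCHAR(MAX) AS
BEGIN
    DECLARE @Start INT
    DECLARE @End INT
    DECLARE @Length INT

    -- Locate the first <...> tag, if any.
    SET @Start = CHARINDEX('<', @HTMLText)
    SET @End = CHARINDEX('>', @HTMLText, CHARINDEX('<', @HTMLText))
    SET @Length = (@End - @Start) + 1

    -- Remove one tag per iteration until no complete tag remains.
    WHILE @Start > 0 AND @End > 0 AND @Length > 0
    BEGIN
        SET @HTMLText = STUFF(@HTMLText, @Start, @Length, '')
        SET @Start = CHARINDEX('<', @HTMLText)
        SET @End = CHARINDEX('>', @HTMLText, CHARINDEX('<', @HTMLText))
        SET @Length = (@End - @Start) + 1
    END

    -- Normalize whitespace by casting the text to xs:token.
    -- TRY_CAST returns NULL when the wrapped text is not well-formed XML.
    RETURN TRY_CAST('<r><![CDATA[' + @HTMLText + ']]></r>' AS XML).value('(/r/text())[1] cast as xs:token?', 'NVARCHAR(MAX)');
END
GO
```

It is called the same way as before, for example SELECT dbo.UDF_STRIP_HTML(some_column) FROM some_table. One caveat: because the text is wrapped in a CDATA section, a value that itself contains the sequence ]]> will not parse as XML, so the function returns NULL for that row.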

1 Comment

@Walt, glad to hear that the proposed solution is working for you. Please up-vote my suggestion: feedback.azure.com/forums/908035-sql-server/suggestions/…
