21

I'm looking for a bug-free, tested SQL script that I could use in a UDF to URL-encode a string. The function would take a URL and return the URL-encoded version. I have seen a few, but all the ones I have come across seem to have flaws.

1

5 Answers

15

How about this one by Peter DeBetta:

CREATE FUNCTION dbo.UrlEncode(@url NVARCHAR(1024))
RETURNS NVARCHAR(3072)
AS
BEGIN
    DECLARE @count INT, @c NCHAR(1), @i INT, @urlReturn NVARCHAR(3072)
    SET @count = LEN(@url)
    SET @i = 1
    SET @urlReturn = ''    
    WHILE (@i <= @count)
     BEGIN
        SET @c = SUBSTRING(@url, @i, 1)
        IF @c LIKE N'[A-Za-z0-9()''*\-._!~]' COLLATE Latin1_General_BIN ESCAPE N'\' COLLATE Latin1_General_BIN
         BEGIN
            SET @urlReturn = @urlReturn + @c
         END
        ELSE
         BEGIN
            SET @urlReturn = 
                   @urlReturn + '%'
                   + SUBSTRING(sys.fn_varbintohexstr(CAST(@c AS VARBINARY(MAX))),3,2)
                   + ISNULL(NULLIF(SUBSTRING(sys.fn_varbintohexstr(CAST(@c AS VARBINARY(MAX))),5,2), '00'), '')
         END
        SET @i = @i +1
     END
    RETURN @urlReturn
END

3 Comments

That was the first one I came to. If you look at the comments below it, there were a bunch of different changes, and it seemed as though they were just piecing it together as the thread continued. Have you personally had success with this method?
fn_varbintohexstr() is an undocumented function in SQL Server, which means it is not guaranteed to remain available in future versions.
Does not work with some Unicode scripts, e.g. Devanagari.
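
On the undocumented-function concern raised above, here is a minimal sketch, assuming SQL Server 2008 or later, where CONVERT with binary style 2 renders a varbinary as hex digits without the 0x prefix. It only swaps the hex step for a documented call: it handles a single-byte character and does nothing about the Unicode limitation. The variable name @c is just for illustration.

```sql
-- Sketch: percent-encode one single-byte character using documented functions only.
-- Assumes SQL Server 2008+ (inline DECLARE initializer, binary style 2 in CONVERT).
DECLARE @c CHAR(1) = ' ';

SELECT '%' + CONVERT(VARCHAR(2), CAST(@c AS VARBINARY(1)), 2) AS encoded;
-- returns %20
```

Inside a loop like the one in the answer, this expression could stand in for the SUBSTRING(sys.fn_varbintohexstr(...), 3, 2) call for characters that fit in one byte.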
8

The question specifically asks for a function; however, here is a way to URL-encode if you're not able to create any functions:

select replace27
from TableName
cross apply (select replace1 = replace(T.TagText, '%', '%25')) r1
cross apply (select replace2 = replace(replace1, '&', '%26')) r2
cross apply (select replace3 = replace(replace2, '$', '%24')) r3
cross apply (select replace4 = replace(replace3, '+', '%2B')) r4
cross apply (select replace5 = replace(replace4, ',', '%2C')) r5
cross apply (select replace6 = replace(replace5, ':', '%3A')) r6
cross apply (select replace7 = replace(replace6, ';', '%3B')) r7
cross apply (select replace8 = replace(replace7, '=', '%3D')) r8
cross apply (select replace9 = replace(replace8, '?', '%3F')) r9
cross apply (select replace10 = replace(replace9, '@', '%40')) r10
cross apply (select replace11 = replace(replace10, '#', '%23')) r11
cross apply (select replace12 = replace(replace11, '<', '%3C')) r12
cross apply (select replace13 = replace(replace12, '>', '%3E')) r13
cross apply (select replace14 = replace(replace13, '[', '%5B')) r14
cross apply (select replace15 = replace(replace14, ']', '%5D')) r15
cross apply (select replace16 = replace(replace15, '{', '%7B')) r16
cross apply (select replace17 = replace(replace16, '}', '%7D')) r17
cross apply (select replace18 = replace(replace17, '|', '%7C')) r18
cross apply (select replace19 = replace(replace18, '^', '%5E')) r19
cross apply (select replace20 = replace(replace19, ' ', '%20')) r20
cross apply (select replace21 = replace(replace20, '~', '%7E')) r21
cross apply (select replace22 = replace(replace21, '`', '%60')) r22
cross apply (select replace23 = replace(replace22, '*', '%2A')) r23
cross apply (select replace24 = replace(replace23, '(', '%28')) r24
cross apply (select replace25 = replace(replace24, ')', '%29')) r25
cross apply (select replace26 = replace(replace25, '/', '%2F')) r26
cross apply (select replace27 = replace(replace26, '\', '%5C')) r27

The limitation of this solution is that it does not replace ASCII control characters or non-ASCII characters.

Note that the first replacement must be the one for %, so that the % signs introduced by the later replacements are not themselves escaped.
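
To see why the order matters, here is a small sketch with a made-up sample string. It only uses nested REPLACE calls, so it should run anywhere the query above runs.

```sql
-- Sketch: the same two replacements applied in both orders.
SELECT
    REPLACE(REPLACE('50% off', ' ', '%20'), '%', '%25') AS percent_last,   -- 50%25%2520off
    REPLACE(REPLACE('50% off', '%', '%25'), ' ', '%20') AS percent_first;  -- 50%25%20off
```

With % handled last, the %20 produced for the space is escaped a second time and becomes %2520.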

1 Comment

I believe that ASCII control characters can be replaced if you use char(n) as the second parameter: replace(value, char(13), '%---')
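
A minimal sketch of that suggestion, assuming the control characters of interest are carriage return (13) and line feed (10). The variable @value and its contents are made up for illustration.

```sql
-- Sketch: encode CR and LF by passing CHAR(n) as the search pattern of REPLACE.
DECLARE @value VARCHAR(100) = 'line1' + CHAR(13) + CHAR(10) + 'line2';

SELECT REPLACE(REPLACE(@value, CHAR(13), '%0D'), CHAR(10), '%0A') AS encoded;
-- returns line1%0D%0Aline2
```

The same pattern would extend to other control characters by adding one REPLACE per code.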
5

Daniel Hutmacher from SQL Sunday has provided a nice function.
https://sqlsunday.com/2013/04/07/url-encoding-function/

Note: As indicated by the author at the end of his article, the hex conversion function only works with non-Unicode (8-bit) character values.

CREATE FUNCTION dbo.fn_char2hex(@char char(1))
RETURNS char(2)
AS BEGIN

    DECLARE @hex char(2), @dec int;
    SET @dec=ASCII(@char);
    SET @hex= --- First hex digit:
             SUBSTRING('0123456789ABCDEF', 1+(@dec-@dec%16)/16, 1)+
              --- Second hex digit:
             SUBSTRING('0123456789ABCDEF', 1+(     @dec%16)   , 1);
    RETURN(@hex);
END
GO

CREATE FUNCTION dbo.fn_UrlEncode(@string varchar(max))
RETURNS varchar(max)
AS BEGIN
    DECLARE @offset int, @char char(1);
    SET @string = REPLACE(@string, '%', '%' + dbo.fn_Char2Hex('%'));
    SET @offset=PATINDEX('%[^A-Z0-9.\-\%]%', @string);
    WHILE (@offset!=0) BEGIN;
        SET @char = SUBSTRING(@string, @offset, 1);
        SET @string = REPLACE(@string, @char, '%' + dbo.fn_Char2Hex(@char));
        SET @offset = PATINDEX('%[^A-Z0-9.\-\%]%', @string);
    END
    RETURN @string;
END;

6 Comments

There's a spelling error in the function name reference in the second CREATE statement that's preventing this solution from working. dbo.fn_Char2hHx should be dbo.fn_Char2Hex.
@Garywoo Thanks for reporting the typo!
This risks breaking with certain collations. For example, the string aa will end up in an infinite loop with Danish/Norwegian collation.
It encodes '-' into '%2D', but '-' is a valid url char.
The allowed URL characters are listed here: developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/… I ran into problems with 'a' and 'ą' in the Polish alphabet. In the end I made these changes: 1. Changed the input parameter to nvarchar: @string Nvarchar(max). 2. Changed the search pattern: SET @offset=PATINDEX('%[^A-Za-z0-9-_.!~*''()%]%', @string COLLATE Latin1_General_BIN2);
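
The collation problems mentioned in these comments come from the character ranges being evaluated under the database's default collation. A hedged sketch of pinning the comparison to a binary collation follows. The safe set used here (letters, digits, - . _ ~ plus % so existing escapes are skipped) is my assumption based on the RFC 3986 unreserved characters, not the pattern from the answer. The hyphen is placed first in the set with the intent that it is read as a literal rather than as a range operator, and the sample string is made up.

```sql
-- Sketch: locate the first character that would need encoding, using a binary
-- collation so that A-Z / a-z mean code-point ranges and 'aa' is not treated as a digraph.
DECLARE @s NVARCHAR(100) = N'aa-b c';

SELECT PATINDEX(N'%[^-A-Za-z0-9._~%]%', @s COLLATE Latin1_General_BIN2) AS first_unsafe;
```

Note that PATINDEX has no ESCAPE clause, so a backslash inside the brackets is matched as a literal backslash rather than acting as an escape character.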
4

Personally I would do this in the application rather than the DB - but if you have to for some reason and you can enable CLR Integration then this would be a perfect candidate for a CLR UDF. It'd be simpler than trying to do it in SQL and probably more reliable and performant too.

There are INET_URIEncode and INET_URIDecode functions in the free version of the SQLsharp T-SQL CLR extension library. It handles Unicode too, though you need the paid version to handle non-standard %uXXYY encoding.

-11

To use this script, you'll need a Numbers table.

CREATE FUNCTION [dbo].[URLEncode] 
    (@decodedString VARCHAR(4000))
RETURNS VARCHAR(4000)
AS
BEGIN
/******
*       select dbo.URLEncode('K8%/fwO3L mEQ*.}')
**/

DECLARE @encodedString VARCHAR(4000)

IF @decodedString LIKE '%[^a-zA-Z0-9*-.!_]%' ESCAPE '!'
BEGIN
    SELECT @encodedString = REPLACE(
                                    COALESCE(@encodedString, @decodedString),
                                    SUBSTRING(@decodedString,num,1),
                                    '%' + SUBSTRING(master.dbo.fn_varbintohexstr(CONVERT(VARBINARY(1),ASCII(SUBSTRING(@decodedString,num,1)))),3,3))
    FROM dbo.numbers 
    WHERE num BETWEEN 1 AND LEN(@decodedString) AND SUBSTRING(@decodedString,num,1) like '[^a-zA-Z0-9*-.!_]' ESCAPE '!'
END
ELSE
BEGIN
    SELECT @encodedString = @decodedString 
END

RETURN @encodedString

END
GO

The full script is available on SQL Server Central (registration required).

2 Comments

You have to register on SQL Server Central to view that.
This function didn't work for me with strings that contain several characters in a row that need encoding; I got double-encoded strings. Using MSSQL Server 2005.
