0

I have created a T-SQL function that is able to extract dates from strings.

The function works when the date has no ..

Please help get the missing piece to make this function work with dates that have periods.

Dates with periods come in 2 different flavors:

MM.DD.YYYY 
M.DD.YY

My function code:

IF OBJECT_ID(N'tempdb..#fileNameTable', N'U') IS NOT NULL 
    DROP TABLE #fileNameTable

CREATE TABLE #fileNameTable
(
    [fName] [VARCHAR](250) NULL
)

INSERT INTO #fileNameTable (fName) 
    SELECT '9999999991_Agent Name_08.02.2018.WAV'

INSERT INTO #fileNameTable (fName) 
    SELECT '9999999999 - Internal ID 1446683 (Pedro) 6.26.17 WB.mp3'

INSERT INTO #fileNameTable (fName) 
    SELECT '9999999998 - Internal ID 1464807 (John) 7.11.17.mp3'

INSERT INTO #fileNameTable (fName) 
    SELECT '9999999997 - Internal ID 1447503 (Marta) 6.27.17.mp3'

INSERT INTO #fileNameTable (fName) 
    SELECT '9999999996 - Internal ID 1437403 (Ruby) 6.20.17.mp3'

INSERT INTO #fileNameTable (fName) 
    SELECT 'rc_20200817_1612_9999999995_NJ.wav'

INSERT INTO #fileNameTable (fName) 
    SELECT 'rc_20200817_1543_9999999994_PA.wav'

INSERT INTO #fileNameTable (fName) 
    SELECT 'rc_20200817_1211_9999999993_MA.wav'

INSERT INTO #fileNameTable (fName) 
    SELECT 'rc_20200817_1211_9999999992_MD.wav'


SELECT *, [dbo].[ExtractDateFromFileName](fName) 
FROM #fileNameTable

CREATE FUNCTION [dbo].[ExtractDateFromFileName]
    (@str VARCHAR(256))
RETURNS VARCHAR(256)
AS
BEGIN
    DECLARE @validchars VARCHAR(MAX)
    SET @validchars = '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'

    DECLARE @idx INT
    SET @idx = PATINDEX('%'+ @validchars +'%',@str)

    IF @idx > 0 AND (@idx = LEN(@str) - 8
                     OR PATINDEX(SUBSTRING(@str, @idx + 9, 1), '[0-9]') = 0)
        SET @str = SUBSTRING(@str, PATINDEX('%' + @validchars +'%', @str), 8)
    ELSE 
        SET @str = ''

    RETURN @str
END
5
  • 1
    What are the expected results for your sample data? Commented Feb 28, 2022 at 21:11
  • 1
    I would strongly recommend writing this as a single-statement table-valued function, it will be orders of magnitude faster, especially when run against a large data set. Commented Feb 28, 2022 at 21:12
  • 1
    The function works when the date has no "." Try inserting . characters into your matching pattern. Your next problem is that your dates aren't formatted consistently - some use the mm-dd-yyyy pattern, others seem to be m-d-yy pattern, still others use the yyyymmdd pattern. SQL Server is really bad/inefficient for data extraction like this, you'd be better served using an external scripting language that's fit-for-purpose like Perl, NodeJS, Python, PowerShell etc., all of which have regex support. Commented Feb 28, 2022 at 22:03
  • The expected result is to identify the date inside the string. Commented Feb 28, 2022 at 22:16
  • @InternetEngineer What they mean is to show the actual results you want (it's actually a pretty common request - see sqlblog.org/ask). Commented Feb 28, 2022 at 22:56

2 Answers 2

3

Another approach:

CREATE FUNCTION dbo.ExtractDateFromGarbage ( @garbage varchar(250))
RETURNS TABLE WITH SCHEMABINDING
AS
  RETURN 
  (
    SELECT theDate = TRY_CONVERT(date, 
        x + REPLACE(LEFT(y, PATINDEX('%[^0-9]%', y+'.')-1),'.','/'))
      FROM (SELECT g = @garbage) AS f 
    CROSS APPLY (VALUES (PATINDEX('%[01 ][0-9].%[0-9].[0-9][0-9]%', g),
       PATINDEX('%2[0-9][0-9][0-9][0-1][0-9][0-3][0-9]%', g))) AS v(s,t) 
    CROSS APPLY (VALUES (SUBSTRING(g, COALESCE(NULLIF(s,0),t), 250))) AS q(q) 
    CROSS APPLY (VALUES(LEFT(q,6), SUBSTRING(q,7,250))) AS z(x,y)
  );

It handles one edge case Stu's doesn't (mm.d.yy):

The moral of the story is still that you should be doing this somewhere else (and storing proper dates in the right data type once the data makes its way to SQL Server).

Sign up to request clarification or add additional context in comments.

2 Comments

Nice enhancement, I like your subliminal messaging.
@Stu haha, that wasn't very subliminal. :-)
2

I would suggest you implement this as a table-valued function, I had a go at rewriting your function to work for both types of dates in your sample data. It may still require some tweaking for edge-cases but should be mostly there.

Testing with 1.3 million rows (of your sample data, duplicated with a cartesian join) on my desktop PC, your current scalar-valued function has the following statistics time data:

SQL Server Execution Times: CPU time = 19886 ms, elapsed time = 19961 ms.

With the function below the same 1.3 million rows has the following times:

SQL Server Execution Times: CPU time = 7122 ms, elapsed time = 7296 ms.

The table-valued function is much faster because SQL Server is able to inline it into the query, rather than running a separate function stand-alone per row, ie, RBAR.

create or alter function [dbo].[ExtractDateFromFileName2](@str varchar(256))
returns table
as return
select coalesce(
    Try_Convert(date,Substring(string, d1.v,8),111), 
    Try_Convert(date,Substring(string, d3.v,10),104),
    Try_Convert(date,Substring(string, d2.v,7),1)
    ) [Value]
from (select @str string)s
outer apply (values(PatIndex('%[1-3][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',string)))d1(v)
outer apply (values(PatIndex('%[0-9].[0-9][0-9].[0-9][0-9]%',string)))d2(v)
outer apply (values(PatIndex('%[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9][0-9]%',string)))d3(v);

See demo Fiddle

3 Comments

Nice query, just this is not work for 9999999991_Agent Name_08.02.2018.WAV because you need to use this pattern [0-9][0-9].[0-9][0-9].[0-9][0-9][0-9][0-9] and also for convert it use Try_Convert(date,Substring(string, d1.v,10),104)
@SaeedEsmaeelinejad Yes I just noticed the 1 or 2 digits. Working on a tweak :)
@SaeedEsmaeelinejad Decided to add a 3rd check and coalesce the results, appears to work fine for the sample data and a few other dates. Hopefully the OP can make use of it and cater for any other edge cases.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.