I'd like some help in optimizing the following query:

SELECT DISTINCT TOP (@NumberOfResultsRequested) dbo.FilterRecentSearchesTitles(OriginalSearchTerm) AS SearchTerms
FROM UserSearches
WHERE WebsiteID = @WebsiteID
AND LEN(OriginalSearchTerm) > 20
--AND dbo.FilterRecentSearchesTitles(OriginalSearchTerm) NOT IN (SELECT KeywordUrl FROM PopularSearchesBaseline WHERE WebsiteID = @WebsiteID)
GROUP BY OriginalSearchTerm, GeoID

It runs fine without the commented-out line. I have indexes on UserSearches.OriginalSearchTerm, WebsiteID, and PopularSearchesBaseline.KeywordUrl, but the query still runs slowly with that line included.

-- UPDATE -- The function used is as follows:

 ALTER FUNCTION [dbo].[FilterRecentSearchesTitles]
(
    @SearchTerm VARCHAR(512)
)

RETURNS VARCHAR(512)

AS
BEGIN
    DECLARE @Ret VARCHAR(512)

    SET @Ret = dbo.RegexReplace('[0-9]', '', REPLACE(@SearchTerm, '__s', ''), 1, 1)
    SET @Ret = dbo.RegexReplace('\.', '', @Ret, 1, 1)
    SET @Ret = dbo.RegexReplace('\s{2,}', ' ', @Ret, 1, 1)
    SET @Ret = dbo.RegexReplace('\sv\s', ' ', @Ret, 1, 1)

    RETURN(@Ret)
END

Using the Regular Expression Workbench code.

However, as I mentioned, it runs fine without the line that is currently commented out.

Any other suggestions?

    Remove the function. Or show us what the function does. If it does data access for every line in the source query, I think we found your problem. Consider re-writing it as a table-valued function, then SQL Server has some chance at optimizing it. Commented Jul 25, 2012 at 19:38

3 Answers

I am going to guess that dbo.FilterRecentSearchesTitles(OriginalSearchTerm) is a function. My suggestion would be to see about rewriting it as a table-valued function, so that it returns a table you can join on.

Otherwise, you are calling that function for each row you are trying to return, which is what is causing your problems.
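As a rough sketch of that rewrite, assuming dbo.RegexReplace keeps the five-argument form used in the question: the function name FilterRecentSearchesTitlesInline, the column alias FilteredTerm, and the sample parameter values are hypothetical. The nested calls apply the same four replacements in the same order as the scalar version. Because the inner RegexReplace calls are still scalar, the gain may be limited, so it is worth comparing execution plans before adopting it.

```sql
-- Hypothetical inline table-valued version of dbo.FilterRecentSearchesTitles.
-- Innermost call runs first: strip '__s' and digits, then dots,
-- then collapse whitespace runs, then drop a standalone ' v '.
CREATE FUNCTION dbo.FilterRecentSearchesTitlesInline (@SearchTerm VARCHAR(512))
RETURNS TABLE
AS
RETURN
(
    SELECT dbo.RegexReplace('\sv\s', ' ',
             dbo.RegexReplace('\s{2,}', ' ',
               dbo.RegexReplace('\.', '',
                 dbo.RegexReplace('[0-9]', '', REPLACE(@SearchTerm, '__s', ''), 1, 1),
               1, 1),
             1, 1),
           1, 1) AS FilteredTerm
);
GO

-- Sample values only; in the real procedure these are parameters (INT is assumed).
DECLARE @NumberOfResultsRequested INT = 10, @WebsiteID INT = 1;

SELECT DISTINCT TOP (@NumberOfResultsRequested) f.FilteredTerm AS SearchTerms
FROM UserSearches AS us
CROSS APPLY dbo.FilterRecentSearchesTitlesInline(us.OriginalSearchTerm) AS f
WHERE us.WebsiteID = @WebsiteID
  AND LEN(us.OriginalSearchTerm) > 20
  AND NOT EXISTS (SELECT 1
                  FROM PopularSearchesBaseline AS psb
                  WHERE psb.WebsiteID = @WebsiteID
                    AND psb.KeywordUrl = f.FilteredTerm);
```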

If you cannot rewrite the function, then why not create a stored proc that evaluates it once up front and then filters the stored results, similar to this:

SELECT DISTINCT TOP (@NumberOfResultsRequested) dbo.FilterRecentSearchesTitles(OriginalSearchTerm) AS SearchTerms
INTO #temp
FROM UserSearches
WHERE WebsiteID = @WebsiteID


SELECT *
FROM #temp
WHERE SearchTerms NOT IN (SELECT KeywordUrl 
                            FROM PopularSearchesBaseline 
                            WHERE WebsiteID = @WebsiteID)

That way the function runs once per row while the temp table is filled, and the NOT IN filter then works on the stored results instead of calling the function again.

1 Comment

Based on your point about the function being called on every row, I modified my row insert to include this function, adding the result to a new column. That way, I avoid having to do this later for every row on the SELECT. +1 for putting me on a better track.

I might try to use a persisted computed column in this case:

ALTER TABLE UserSearches ADD FilteredOriginalSearchTerm AS dbo.FilterRecentSearchesTitles(OriginalSearchTerm) PERSISTED

You will probably have to add WITH SCHEMABINDING to your function (and the RegexReplace function) like so:

ALTER FUNCTION [dbo].[FilterRecentSearchesTitles]
(
    @SearchTerm VARCHAR(512)
)

RETURNS VARCHAR(512)

WITH SCHEMABINDING -- You will need this so the function is considered deterministic

AS
BEGIN
    DECLARE @Ret VARCHAR(512)

    SET @Ret = dbo.RegexReplace('[0-9]', '', REPLACE(@SearchTerm, '__s', ''), 1, 1)
    SET @Ret = dbo.RegexReplace('\.', '', @Ret, 1, 1)
    SET @Ret = dbo.RegexReplace('\s{2,}', ' ', @Ret, 1, 1)
    SET @Ret = dbo.RegexReplace('\sv\s', ' ', @Ret, 1, 1)

    RETURN(@Ret)
END

This makes your query look like this:

SELECT DISTINCT TOP (@NumberOfResultsRequested) FilteredOriginalSearchTerm AS SearchTerms
FROM UserSearches
WHERE WebsiteID = @WebsiteID
AND LEN(OriginalSearchTerm) > 20
AND FilteredOriginalSearchTerm NOT IN (SELECT KeywordUrl FROM PopularSearchesBaseline WHERE WebsiteID = @WebsiteID)
GROUP BY FilteredOriginalSearchTerm, OriginalSearchTerm, GeoID

This could potentially be optimized further for speed (if necessary) by using a join instead of NOT IN, or with different indexing (perhaps an index on the computed column, or some covering indexes). Also, DISTINCT combined with GROUP BY is somewhat of a code smell to me, but it could be legitimate.
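A minimal sketch of both ideas, assuming the persisted column FilteredOriginalSearchTerm exists as defined above and that SQL Server accepts the function as deterministic, which indexing a computed column requires. The index name and the sample parameter values are hypothetical. NOT EXISTS stands in for NOT IN because NOT IN returns no rows at all when the subquery produces a NULL KeywordUrl, and the GROUP BY is dropped because DISTINCT already removes duplicates.

```sql
-- Hypothetical index: seek on WebsiteID, read the filtered term from the index,
-- and include the original term so the LEN() filter needs no extra lookup.
CREATE NONCLUSTERED INDEX IX_UserSearches_WebsiteID_Filtered
    ON UserSearches (WebsiteID, FilteredOriginalSearchTerm)
    INCLUDE (OriginalSearchTerm);
GO

-- Sample values only; in the real procedure these are parameters (INT is assumed).
DECLARE @NumberOfResultsRequested INT = 10, @WebsiteID INT = 1;

SELECT DISTINCT TOP (@NumberOfResultsRequested)
       us.FilteredOriginalSearchTerm AS SearchTerms
FROM UserSearches AS us
WHERE us.WebsiteID = @WebsiteID
  AND LEN(us.OriginalSearchTerm) > 20
  AND NOT EXISTS (SELECT 1
                  FROM PopularSearchesBaseline AS psb
                  WHERE psb.WebsiteID = @WebsiteID
                    AND psb.KeywordUrl = us.FilteredOriginalSearchTerm);
```

Whether the optimizer actually uses the index depends on the data distribution, so checking the execution plan is the only reliable way to confirm the benefit.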

1 Comment

Yeah, you're right about the DISTINCT and GROUP BY... they're just artifacts of different iterations. Thanks for the suggestions.

Instead of using the function in the SELECT, I modified the INSERT query to include this function. That way, I avoid calling the function for every row when I later want to retrieve the data.
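The actual INSERT is not shown in this thread, so the following is only a sketch of the approach. The column name FilteredSearchTerm, the INT types, and the sample values are assumptions, and the real table may have other required columns.

```sql
-- Hypothetical column that stores the filtered term alongside the original.
ALTER TABLE UserSearches ADD FilteredSearchTerm VARCHAR(512) NULL;
GO

-- One-off backfill for rows that were inserted before the column existed.
UPDATE UserSearches
SET FilteredSearchTerm = dbo.FilterRecentSearchesTitles(OriginalSearchTerm)
WHERE FilteredSearchTerm IS NULL;

-- Sample values only; in practice these arrive as parameters.
DECLARE @WebsiteID INT = 1,
        @GeoID INT = 1,
        @OriginalSearchTerm VARCHAR(512) = 'example search term 123';

-- The function runs once here, at write time, instead of on every read.
INSERT INTO UserSearches (WebsiteID, GeoID, OriginalSearchTerm, FilteredSearchTerm)
SELECT @WebsiteID, @GeoID, @OriginalSearchTerm,
       dbo.FilterRecentSearchesTitles(@OriginalSearchTerm);
```

Unlike a persisted computed column, a plain column like this will not update itself if the filtering rules change, so the backfill would have to be rerun after any change to the function.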
