1

I'm trying to optimize or completely rewrite this query. It takes about ~1500ms to run currently. I know the distinct's are fairly inefficient as well as the Union. But I'm struggling to figure out exactly where to go from here.

I am thinking that the first select statement might not be needed to return the output of;

[Key | User_ID,(User_ID)]

Note; Program and Program Scenario are both using Clustered Indexes. I can provide a screenshot of the Execution Plan if needed.

ALTER FUNCTION [dbo].[Fn_Get_Del_User_ID] (@_CompKey INT)

RETURNS VARCHAR(8000)

AS

BEGIN

    DECLARE @UseID AS VARCHAR(8000);

    SET @UseID = '';

    SELECT @UseID = @UseID + ', ' + x.User_ID
    FROM 
        (SELECT DISTINCT (UPPER(p.User_ID)) as User_ID FROM  [dbo].[Program] AS p WITH (NOLOCK) 
        WHERE p.CompKey = @_CompKey
        UNION
        SELECT DISTINCT (UPPER(ps.User_ID)) as User_ID FROM  [dbo].[Program] AS p WITH (NOLOCK) 
        LEFT OUTER JOIN [dbo].[Program_Scenario] AS ps WITH (NOLOCK) ON p.ProgKey = ps.ProgKey
        WHERE p.CompKey = @_CompKey
        AND ps.User_ID IS NOT NULL) x 

    RETURN Substring(@UserIDs, 3, 8000);
END
10
  • If you use UNION there is no need to select DISTINCT as well. The de-duplication of UNION works on the entire unioned set. You should probably also need an INNER JOIN rather than a LEFT JOIN. With AND ps.User_ID IS NOT NULL it already is an inner join. (This won't help performance though) Commented Jan 24, 2018 at 17:00
  • Thanks FrankerZ, also thanks for your input Honey. I'll test it out now! Commented Jan 24, 2018 at 17:00
  • So removing both DISTINCT and changing the join to an INNER JOIN actually increased the run-time by ~100ms. I definitely understand where you're headed with it though. I'm not interested in Null Values. I do find however that if I insert UNION ALL it decreases the runtime, but alters my resulting output. Commented Jan 24, 2018 at 17:11
  • It may be a better idea to just select the User_ID's and return a table. You can do the concatenation somewhere else if you want. What's the purpose of it anyway? It is not generally advisable to handle comma separated values in SQL, atomic values are much easier to handle. Commented Jan 24, 2018 at 17:15
  • There are better ways of generating a comma delimited list than this. You can use STUFF with FOR XML. And then you could turn this scalar function into an inline table valued function. And get rid of those NOLOCK hints. They are not a performance tool that have no baggage. They can cause duplicate and/or missing rows. If you are ok with your data being mostly correct most of the time it is mostly ok. blogs.sqlsentry.com/aaronbertrand/bad-habits-nolock-everywhere Commented Jan 24, 2018 at 17:17

2 Answers 2

3

There are two things happening in this query

1. Locating rows in the [Program] table matching the specified CompKey (@_CompKey)

2. Locating rows in the [Program_Scenario] table that have the same ProgKey as the rows located in (1) above.

Finally, non-null UserIDs from both these sets of rows are concatenated into a scalar.

For step 1 to be efficient, you'd need an index on the CompKey column (clustered or non-clustered) For step 2 to be efficient, you'd need an index on the join key which is ProgKey on the Program_Scenario table (this likely is a non-clustered index as I can't imagine ProgKey to be PK). Likely, SQL would resort to a loop join strategy - i.e., for each row found in [Program] matching the CompKey criteria, it would need to lookup corresponding rows in [Program_Scenario] with same ProgKey. This is a guess though, as there is not sufficient information on the cardinality and distribution of data.

Ensure the above two indexes are present.

Also, as others have noted the second left outer join is a bit confusing as an inner join is the right way to deal with it.

Per my interpretation the inner part of the query can be rewritten this way. Also, this is the query you'd ideally run and optimize before tacking the string concatenation part. The DISTINCT is dropped as it is automatic with a UNION. Try this version of the query along with the indexes above and if it provides the necessary boost, then include the string concatenation or the xml STUFF approaches to return a scalar.

SELECT UPPER(p.User_ID) as User_ID 
FROM 
   [dbo].[Program] AS p WITH (NOLOCK) 
WHERE
   p.CompKey = @_CompKey
UNION
SELECT UPPER(ps.User_ID) as User_ID 
FROM  
   [dbo].[Program] AS p WITH (NOLOCK) 
   INNER JOIN [dbo].[Program_Scenario] AS ps WITH (NOLOCK) ON p.ProgKey = ps.ProgKey
WHERE 
    p.CompKey = @_CompKey
    AND ps.User_ID IS NOT NULL
Sign up to request clarification or add additional context in comments.

1 Comment

Great perspective! Upon checking the Table Indexes, I see that both of these columns are already indexed under their respective tables. Changing up the join and getting rid of the distinct's DID help a slight amount. Thank you very much for rehashing this question, as it is still something I'm curious about.
1

I am taking a shot in the dark here. I am guessing that the last code you posted is still a scalar function. It also did not have all the logic of your original query. Again, this is a shot in the dark since there is no table definitions or sample data posted.

This might be how this would look as an inline table valued function.

ALTER FUNCTION [dbo].[Fn_Get_Del_User_ID] 
(
    @_CompKey INT
) RETURNS TABLE AS RETURN
    select MyResult = STUFF(
    (
        SELECT distinct UPPER(p.User_ID) as User_ID 
        FROM dbo.Program AS p 
        WHERE p.CompKey = @_CompKey
        group by p.User_ID

        UNION

        SELECT distinct UPPER(ps.User_ID) as User_ID 
        FROM dbo.Program AS p 
        LEFT OUTER JOIN dbo.Program_Scenario AS ps ON p.ProgKey = ps.ProgKey
        WHERE p.CompKey = @_CompKey
            AND ps.User_ID IS NOT NULL
        for xml path ('')
    ), 1, 1, '') 
    from dbo.Program

1 Comment

I was just seeing if I could actually write the STUFF portion of it.. this makes a lot of sense though. I'll play around with it and see if I can't post some test data if I can't figure it out. Thanks for all your help Sean.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.