
I need to write a SQL query and want to check whether this approach is well optimized. If not, what would be a better query for this scenario?

All of the joined tables are pretty big (approx. 150 million records).

Below is the sample code.

DECLARE @inputIds VARCHAR(50) = NULL; -- This can have a comma-separated list of ids as well.

DECLARE @inputIdsVar TABLE (ID INT);
INSERT INTO @inputIdsVar SELECT DISTINCT CAST(value AS INT) FROM fnSplit(@inputIds, ','); -- Some custom function to split the Ids list.

SELECT * FROM TABLE1 tbl1
JOIN TABLE2 tbl2 ON tbl1.JoinIndex1 = tbl2.JoinIndex1 OR @inputIds IS NULL
JOIN TABLE3 tbl3 ON tbl2.JoinIndex2 = tbl3.JoinIndex2 OR @inputIds IS NULL
JOIN @inputIdsVar iiv ON tbl3.ID = iiv.ID OR @inputIds IS NULL
WHERE tbl1.[Key] = 'blah' -- Key is a reserved word in T-SQL, so it is bracketed.

Here, my expected output is as follows:

  1. If @inputIds is NULL, return all records from TABLE1.
  2. If @inputIds is NOT NULL, join with the 2 other tables and return only the filtered records. The joins are required because the passed-in variable is a comma-separated list of ids that are present in TABLE3.

Any help would be appreciated.

  • But @inputIdsVar is a TABLE variable. Commented Feb 18, 2020 at 0:39
  • Oops, my bad, OK. Commented Feb 18, 2020 at 0:39

2 Answers

1

The easiest way to do this is to write two queries and UNION ALL them together. Each half of the UNION ALL handles one possible state of the input variable: one query for when it is NULL, and one for when it isn't.

DECLARE @inputIds VARCHAR(50) = NULL; -- This can have a comma-separated list of ids as well.

DECLARE @inputIdsVar TABLE (ID INT);
INSERT INTO @inputIdsVar SELECT DISTINCT CAST(value AS INT) FROM fnSplit(@inputIds, ','); -- Some custom function to split the Ids list.

-- First half: no ids were passed, so return TABLE1 rows without joining.
SELECT
    tbl1.JoinIndex1
    ,JoinIndex2 = NULL
    ,JoinIndex3 = NULL
FROM TABLE1 tbl1
WHERE @inputIds IS NULL
    AND tbl1.[Key] = 'blah'

UNION ALL

-- Second half: ids were passed, so join through to TABLE3 and filter on them.
SELECT
    tbl1.JoinIndex1
    ,tbl2.JoinIndex2
    ,tbl3.JoinIndex3
FROM TABLE1 tbl1
JOIN TABLE2 tbl2 ON tbl1.JoinIndex1 = tbl2.JoinIndex1
JOIN TABLE3 tbl3 ON tbl2.JoinIndex2 = tbl3.JoinIndex2
JOIN @inputIdsVar iiv ON tbl3.ID = iiv.ID
WHERE @inputIds IS NOT NULL
    AND tbl1.[Key] = 'blah'

Only one half of the UNION ALL will ever return records, because the two WHERE conditions on the variable are mutually exclusive. You'll also want to make sure the column list is the same in both halves. If the first half needs a column that is only present in TABLE2 or TABLE3, you can just stick in a NULL with the correct alias. Notice the JoinIndex2 = NULL in the first half.
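
To see the two halves in action, here is a minimal, self-contained sketch that can be run in a scratch session. Everything in it is an assumption made for illustration: the table variables @Table1, @Table2 and @Table3 and their rows are hypothetical stand-ins for the real tables, STRING_SPLIT (SQL Server 2016 or later) stands in for the custom fnSplit, and OPTION (RECOMPILE) is an optional addition that is not part of the answer above.

```sql
-- Hypothetical miniature stand-ins for TABLE1, TABLE2 and TABLE3.
DECLARE @Table1 TABLE (JoinIndex1 INT, [Key] VARCHAR(10));
DECLARE @Table2 TABLE (JoinIndex1 INT, JoinIndex2 INT);
DECLARE @Table3 TABLE (JoinIndex2 INT, ID INT);

INSERT INTO @Table1 VALUES (1, 'blah'), (2, 'blah'), (3, 'other');
INSERT INTO @Table2 VALUES (1, 10), (2, 20);
INSERT INTO @Table3 VALUES (10, 100), (20, 200);

DECLARE @inputIds VARCHAR(50) = '100'; -- set to NULL to exercise the first half

DECLARE @inputIdsVar TABLE (ID INT);
-- Assumption: STRING_SPLIT replaces fnSplit; it returns no rows for a NULL input.
INSERT INTO @inputIdsVar
SELECT DISTINCT CAST(value AS INT) FROM STRING_SPLIT(@inputIds, ',');

-- First half: active only when no id list was supplied.
SELECT t1.JoinIndex1, JoinIndex2 = CAST(NULL AS INT)
FROM @Table1 t1
WHERE @inputIds IS NULL
    AND t1.[Key] = 'blah'

UNION ALL

-- Second half: active only when an id list was supplied.
SELECT t1.JoinIndex1, t2.JoinIndex2
FROM @Table1 t1
JOIN @Table2 t2 ON t1.JoinIndex1 = t2.JoinIndex1
JOIN @Table3 t3 ON t2.JoinIndex2 = t3.JoinIndex2
JOIN @inputIdsVar iiv ON t3.ID = iiv.ID
WHERE @inputIds IS NOT NULL
    AND t1.[Key] = 'blah'
OPTION (RECOMPILE); -- optional: compile using the current value of @inputIds
```

With @inputIds = '100' this returns the single row (1, 10). With @inputIds = NULL it returns the two 'blah' rows, (1, NULL) and (2, NULL). The RECOMPILE hint may let the optimizer drop the inactive half from the plan at the cost of compiling on every execution, so whether it helps on the real tables is something to measure rather than assume.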


0

Digital aaron is on the right track, because OR in the join conditions kills performance. However, you are using SELECT *, so you need to be careful that both halves of the UNION ALL return the same columns.

I think the equivalent query with better performance is:

SELECT *
FROM TABLE1 tbl1 LEFT JOIN
     TABLE2 tbl2
     ON 1 = 0 LEFT JOIN  -- always false, so these columns come back as NULL
     TABLE3 tbl3
     ON 1 = 0 LEFT JOIN
     @inputIdsVar iiv
     ON 1 = 0
WHERE tbl1.[Key] = 'blah' AND
      @inputIds IS NULL  -- this half applies only when no ids were passed
UNION ALL
SELECT *
FROM TABLE1 tbl1 JOIN
     TABLE2 tbl2
     ON tbl1.JoinIndex1 = tbl2.JoinIndex1 AND
        @inputIds IS NOT NULL JOIN
     TABLE3 tbl3
     ON tbl2.JoinIndex2 = tbl3.JoinIndex2 AND
        @inputIds IS NOT NULL JOIN
     @inputIdsVar iiv
     ON tbl3.ID = iiv.ID AND @inputIds IS NOT NULL
WHERE tbl1.[Key] = 'blah';
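
The LEFT JOIN ... ON 1 = 0 trick is what keeps SELECT * usable in a UNION ALL: the join never matches, so every row of the left table is kept and the right-hand columns are filled with NULL, which gives this half the same column list as the joined half. A minimal sketch of just that piece, using hypothetical table variables @A and @B that are not part of the original question:

```sql
-- Hypothetical sample tables, for illustration only.
DECLARE @A TABLE (Id INT);
DECLARE @B TABLE (Id INT, Label VARCHAR(10));

INSERT INTO @A VALUES (1), (2);
INSERT INTO @B VALUES (1, 'one');

-- The join condition is never true, so no row of @B is ever matched.
-- Both rows of @A are returned, with NULL in the two columns that come from @B.
SELECT *
FROM @A a
LEFT JOIN @B b ON 1 = 0;
```

The result has three columns (Id, Id, Label) and two rows, (1, NULL, NULL) and (2, NULL, NULL), even though @B contains a row with a matching Id.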
