I’m working on a query that looks for certain values in related tables. Say I have a table TableA and a bunch of other tables with foreign keys linking back to TableA. I want to scan all of these related tables and check whether at least 1 record with these values exists. I’m running a query like the one below against about 7 tables (a single table can contain more than 1 foreign key relating it back to TableA):

SELECT 
TableA.Field1,
TableA.Field2,
Table1Join.TableBPrimaryKey,
Table2Join.TableCPrimaryKey,
Table3Join.TableDPrimaryKey,
Table4Join.TableEPrimaryKey,
Table5Join.TableFPrimaryKey,
Table6Join.TableGPrimaryKey,
Table7Join.TableHPrimaryKey,
Table8Join.TableIPrimaryKey
/* As more JOINs are added below which result in more fields listed here in 
the SELECT statement, the query slows down by a lot. If I simply add JOIN’s 
without referencing their fields in the SELECT statement here, it runs fast 
*/
FROM
TableA 
/* Scan 1st related table */
INNER JOIN 
 (SELECT
      TableAAlias.PrimaryKey,
      MAX(TableB.PrimaryKey) AS TableBPrimaryKey
  FROM 
      TableA TableAAlias
  LEFT OUTER JOIN 
      TableB ON TableB.ForeignKey = TableAAlias.PrimaryKey
  GROUP BY 
      TableAAlias.PrimaryKey) AS Table1Join ON Table1Join.PrimaryKey = TableA.PrimaryKey
/* Scan 2nd related table */
INNER JOIN 
 (SELECT
     TableAAlias.PrimaryKey,
     MAX(TableC.PrimaryKey) AS TableCPrimaryKey
 FROM 
     TableA TableAAlias
 LEFT OUTER JOIN 
     TableC ON TableC.ForeignKey = TableAAlias.PrimaryKey
 GROUP BY 
     TableAAlias.PrimaryKey) AS Table2Join ON Table2Join.PrimaryKey = TableA.PrimaryKey
/* Scan 3rd related table */
INNER JOIN 
(SELECT
     TableAAlias.PrimaryKey,
     MAX(TableD.PrimaryKey) AS TableDPrimaryKey
 FROM 
     TableA TableAAlias
 LEFT OUTER JOIN 
     TableD ON TableD.ForeignKey = TableAAlias.PrimaryKey
 GROUP BY 
     TableAAlias.PrimaryKey) AS Table3Join ON Table3Join.PrimaryKey = TableA.PrimaryKey
/* Scan 4th related table */
INNER JOIN 
(SELECT
     TableAAlias.PrimaryKey,
     MAX(TableE.PrimaryKey) AS TableEPrimaryKey
 FROM 
     TableA TableAAlias
 LEFT OUTER JOIN 
     TableE ON TableE.ForeignKey = TableAAlias.PrimaryKey
 GROUP BY 
     TableAAlias.PrimaryKey) AS Table4Join ON Table4Join.PrimaryKey = TableA.PrimaryKey
/* Scan 5th related table */
INNER JOIN 
(SELECT
     TableAAlias.PrimaryKey,
     MAX(TableF.PrimaryKey) AS TableFPrimaryKey
 FROM 
     TableA TableAAlias
 LEFT OUTER JOIN 
     TableF ON TableF.ForeignKey = TableAAlias.PrimaryKey
 GROUP BY 
     TableAAlias.PrimaryKey) AS Table5Join ON Table5Join.PrimaryKey = TableA.PrimaryKey
/* Scan 6th related table */
INNER JOIN 
(SELECT
     TableAAlias.PrimaryKey,
     MAX(TableG.PrimaryKey) AS TableGPrimaryKey
 FROM 
     TableA TableAAlias
 LEFT OUTER JOIN 
     TableG ON TableG.ForeignKey = TableAAlias.PrimaryKey
 GROUP BY 
     TableAAlias.PrimaryKey) AS Table6Join ON Table6Join.PrimaryKey = TableA.PrimaryKey
/* Scan 7th related table */
INNER JOIN 
(SELECT
     TableAAlias.PrimaryKey,
     MAX(TableH.PrimaryKey) AS TableHPrimaryKey
 FROM 
     TableA TableAAlias
 LEFT OUTER JOIN 
     TableH ON TableH.ForeignKey = TableAAlias.PrimaryKey
 GROUP BY 
     TableAAlias.PrimaryKey) AS Table7Join ON Table7Join.PrimaryKey = TableA.PrimaryKey
/* Scan 8th related table */
INNER JOIN 
(SELECT
     TableAAlias.PrimaryKey,
     MAX(TableI.PrimaryKey) AS TableIPrimaryKey
 FROM 
     TableA TableAAlias
 LEFT OUTER JOIN 
     TableI ON TableI.ForeignKey = TableAAlias.PrimaryKey
 GROUP BY 
     TableAAlias.PrimaryKey) AS Table8Join ON Table8Join.PrimaryKey = TableA.PrimaryKey

Most of these tables contain over 100,000 records. When I don’t use fields derived from the joins in the main SELECT statement, the query runs relatively fast (about 25 seconds). When I use some of the fields, it runs for about the same time, but as I add more fields derived from the JOINs, the query slows to a crawl and may run for hours. I can’t find any pattern in what causes this. I can add a field and it starts running slowly; then I remove another field that was there before, when it ran fast, keep the one that seemingly caused the issue, and it runs fast again. I can of course break this query down into 7 individual queries and create a temp table (which I have done while trying to find the cause), in which case it runs relatively fast, but I cannot use temp tables. I understand the query is probably not optimized, but I’m not a SQL guru, so I’m not sure where to begin optimizing its performance.

  • You showed us a query that works OK (<25 seconds), but you want to speed up another query (with additional fields) that you did not show in the question. Please show the query that runs slowly. Commented Nov 16, 2018 at 16:47
  • @krokodilko It's really the same thing with identical joins to different tables. Commented Nov 16, 2018 at 17:10
  • The same but completely different. So you posted a query that isn't the one that is the issue and you provided no details about the tables and expect us to help improve the performance. Give us the information to answer the question. Commented Nov 16, 2018 at 17:14
  • @SeanLange I just didn't want to complicate this and confuse others, but you're right...it's better to post the whole thing. I edited my question. Commented Nov 16, 2018 at 17:50

1 Answer

How does this perform compared to your original query?

SELECT 
    TableA.Field1,
    TableA.Field2,
    (
        select MAX(TableB.PrimaryKey)
        from TableB
        where TableB.ForeignKey = TableA.PrimaryKey
    ) TableBPrimaryKey,
    (
        select MAX(TableC.PrimaryKey)
        from TableC
        where TableC.ForeignKey = TableA.PrimaryKey
    ) TableCPrimaryKey
FROM TableA 

NB: From the code shared, this is functionally identical... If the code shared in the question doesn't match your real code, you'll need to share that code for us to help further.
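Each correlated subquery above looks up rows by ForeignKey, so it may help to check that those columns are indexed. SQL Server does not create an index automatically when a foreign key constraint is defined. A minimal sketch, assuming no suitable index exists yet; the index names are hypothetical and the column names follow the simplified schema from the question:

```sql
-- Hypothetical index names; adjust table and column names to the real schema.
-- Keying on ForeignKey (then PrimaryKey) lets MAX(PrimaryKey) for a given
-- ForeignKey value be resolved with an index seek instead of a table scan.
CREATE INDEX IX_TableB_ForeignKey ON TableB (ForeignKey, PrimaryKey);
CREATE INDEX IX_TableC_ForeignKey ON TableC (ForeignKey, PrimaryKey);
```

Comparing the execution plan before and after should show whether the indexes are actually used.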


Update per comments

If you just want some way of showing that any table contains a relation to the main table, try this:

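Select TableA.*
, (
    --outer top 1 keeps the scalar subquery to a single row even when several tables match
    select top 1 x.TableName
    from (
        select top 1 'TableB' TableName from TableB where TableB.ForeignKey = TableA.PrimaryKey

        union all

        select top 1 'TableC' from TableC where TableC.ForeignKey = TableA.PrimaryKey

        --etc
    ) x
) AnyTableHasValue --gives the name of a table with a match (NULL if none)
From TableA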

Another approach would be:

Select TableA.*,
case when x.ForeignKey is null then 0 else 1 end ExistInOtherTable
From TableA
left outer join 
(
    select ForeignKey from TableB 

    union --no all, so avoids duplicates / ensures at most 1 match

    select ForeignKey from TableC 
) x
on x.ForeignKey = TableA.PrimaryKey
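
If only a 0/1 flag is needed, EXISTS may also be worth trying, since it only has to find one matching row per table. This is a sketch against the same simplified schema, not something I have timed against the variants above; the alias ExistsInOtherTable is just illustrative:

```sql
-- Returns 1 when at least one related table references the TableA row, else 0.
SELECT
    TableA.Field1,
    TableA.Field2,
    CASE
        WHEN EXISTS (SELECT 1 FROM TableB WHERE TableB.ForeignKey = TableA.PrimaryKey)
          OR EXISTS (SELECT 1 FROM TableC WHERE TableC.ForeignKey = TableA.PrimaryKey)
          -- etc. for the remaining related tables
        THEN 1
        ELSE 0
    END AS ExistsInOtherTable
FROM TableA;
```

Swap the 1 and 0 if you want the flag to mean "not found" instead.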

7 Comments

Not only did this work beautifully, but it also cut the run time down to 15 sec from 25! Apologies for not describing the whole issue, but I also need a column in the output of this query that holds 0/1 values, calculated by checking whether TableBPrimaryKey is NULL AND TableCPrimaryKey is NULL AND TableDPrimaryKey is NULL etc. I just need to output a value that indicates whether or not any of the related tables hold values from the main table. Is this possible to do in the same query? Thank you very much!
Figured this out. Just wrapped 1 select into another one. Here's the final query: SELECT *, CASE WHEN d.TableBPrimaryKey IS NULL AND d.TableCPrimaryKey IS NULL THEN 1 ELSE 0 END AS 'NotFound' FROM (SELECT TableA.Field1, TableA.Field2, (SELECT MAX(TableB.PrimaryKey) FROM TableB WHERE TableB.ForeignKey = TableA.PrimaryKey) TableBPrimaryKey, (SELECT MAX(TableC.PrimaryKey) FROM TableC WHERE TableC.ForeignKey = TableA.PrimaryKey) TableCPrimaryKey FROM TableA) d
John, this one is a lot slower actually. I believe when you do TOP 1, it does a whole table scan for each related record even if it finds a match (I could be wrong).
John, yes, all of the fields in the related tables are defined as foreign keys. I assume that foreign keys are all indexed. Anyway, your 1st suggestion worked like a charm. Thank you again!
...and to add to that...without any of the joins (just straight dump of couple of fields from the master table) the query takes 3 sec. and 15 sec. after 'joining' 12 fairly large tables. 1 sec/table is a damn good performance to me :-)