SQL Server T-SQL query optimisation

Question

I have a T-SQL query and I want to make it faster.

I have Entity and Address tables, and wish to bring back an address if a mailing address exists.

Sometimes there are multiple addresses for a given entity. There is a primary mailing address tinyint flag that is sometimes set and sometimes not. There are no rules here: there could be 5 default mailing addresses all with the flag set, or none with the flag set.

This runs in around 20 seconds for 11k rows. I really need to get this time down; can anyone help?

SELECT 
   e.*, addr.*
FROM 
   [Entity] e
   --Address does not always exist
   --PrimaryAddress is a NOT NULL tinyint; sometimes this flag is enabled twice for a given entity.
LEFT OUTER JOIN 
   [Address] addr ON addr.[EntityID] = e.[EntityID] 
   AND addr.Code = 'MAILING'        
   AND addr.[AddressID] = (
       --This removes duplicates but adds a long delay (15 seconds) to execution time.
       SELECT Top 1 a.[AddressID]
       FROM [Address] AS a
       WHERE a.Code = 'MAILING'
         AND a.[EntityID] = e.[EntityID]    
       ORDER BY a.[PrimaryAddress] DESC)

It should also be noted that I can't add any indexes to the two tables either :(

Kind regards Simon Jackson

It's a 3rd party database and any modification is not "supported". — Simon
– Simon, Commented Oct 25, 2011 at 10:34
@marc_s, there are often many viable choices to performance tune without changing indexes. — HLGEM
– HLGEM, Commented Oct 25, 2011 at 13:37

Mikael Eriksson · Accepted Answer · 2011-10-25 10:12:25Z

1

This is a simplified version of your query that I think will return the same rows. (Not tested). I can't say if this will be faster than your version. You tell me.

SELECT 
    e.*,
    addr.*
FROM 
    [Entity] e
  OUTER APPLY (
                SELECT TOP(1) *
                FROM [Address] AS a
                WHERE a.Code = 'MAILING'
                AND a.[EntityID] = e.[EntityID] 
                ORDER BY a.[PrimaryAddress] DESC
              ) as addr
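
Because the question says several addresses can carry the same PrimaryAddress value, TOP(1) ordered by that flag alone may return any one of the tied rows. The sketch below adds a tie-breaker; it assumes [AddressID] is unique per address (the original query's equality test on it suggests so, but that is an assumption), and the choice of the lowest AddressID is arbitrary.

```sql
SELECT e.*, addr.*
FROM [Entity] AS e
OUTER APPLY (
    SELECT TOP (1) a.*
    FROM [Address] AS a
    WHERE a.Code = 'MAILING'
      AND a.[EntityID] = e.[EntityID]
    -- Flag first, then AddressID as a tie-breaker so repeated runs
    -- return the same row when several addresses share the flag value.
    ORDER BY a.[PrimaryAddress] DESC, a.[AddressID] ASC
) AS addr;
```

The extra sort key should not change which entities come back, only which of several equally flagged addresses is shown.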


6 Comments

Simon Over a year ago

Thank you, this has improved things noticeably. The first time it ran it took about 14 seconds; the second time round it was down to 2 seconds.

sll Over a year ago

@Simon : use DBCC FREEPROCCACHE and so on to cleanup cache before runs

Simon Over a year ago

DBCC FREEPROCCACHE, oh dear: 23 minutes and 20 seconds with the outer. I'll try my original one now. There are a lot of layered views.

HLGEM Over a year ago

You have layered views? Oh dear, the vendor you bought this program from was incompetent, weren't they? Have you considered buying a different program from a competent vendor?

Simon Over a year ago

OK, here are the stats with this change, using DBCC FREEPROCCACHE before each select. The original select is consistently taking around 17 minutes, while the OUTER APPLY has cut it down to an average of 8 seconds. It's a vast improvement.
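
The cold-cache comparison described in these comments could be set up roughly as follows. This is only a sketch: the thread mentions DBCC FREEPROCCACHE, while CHECKPOINT, DBCC DROPCLEANBUFFERS and the SET STATISTICS options are additions assumed here to make the timing easier to read. These commands affect the whole instance, so they belong on a test server rather than a live one.

```sql
-- Test/dev servers only: these commands affect the whole instance.
CHECKPOINT;                 -- write dirty pages so the next command can drop them
DBCC DROPCLEANBUFFERS;      -- empty the buffer pool (cold data cache)
DBCC FREEPROCCACHE;         -- discard cached execution plans

SET STATISTICS TIME ON;     -- report parse/compile and execution times
SET STATISTICS IO ON;       -- report reads per table

-- Query under test (the OUTER APPLY version from this answer).
SELECT e.*, addr.*
FROM [Entity] AS e
OUTER APPLY (
    SELECT TOP (1) a.*
    FROM [Address] AS a
    WHERE a.Code = 'MAILING'
      AND a.[EntityID] = e.[EntityID]
    ORDER BY a.[PrimaryAddress] DESC
) AS addr;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Repeating the same preamble before each candidate query keeps the comparison like for like; the timings then appear in the Messages output.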


roalz · Accepted Answer · 2017-03-30 08:50:20Z

1

You could stop using select *. You are returning the entity id twice, which wastes both server and network resources. And do you honestly need every single one of the other fields? Eliminate any you don't need. Select * should not be used in production code anyway.

You have a correlated subquery, which runs row by agonizing row. Try using joins instead:

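SELECT     e.*, addr.*
FROM     [Entity] e
LEFT JOIN   (SELECT a.*
            FROM  [Address] a
            JOIN
                -- Note: SQL Server does not allow a derived table in a plain JOIN
                -- to reference e.[EntityID]; this inner query needs APPLY or an
                -- uncorrelated (GROUP BY) rewrite before it will run.
                (SELECT Top 1 a.[AddressID]
                FROM [Address] AS a
                WHERE a.Code = 'MAILING'
                AND a.[EntityID] = e.[EntityID]
                ORDER BY a.[PrimaryAddress] DESC) dedup
                    ON a.[AddressID] = dedup.[AddressID]) addr
    ON addr.[EntityID] = e.[EntityID]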

And again don't use select *, I don't know your fields or I would have specified them above.
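
One caveat on the join version above: a derived table in a plain JOIN cannot refer to e.[EntityID], so the inner TOP 1 has to become uncorrelated before the query will run. The following is one possible sketch of that idea, not necessarily what was actually tested in this thread. It assumes [AddressID] is unique and of a type MIN() accepts (for example an integer), and it breaks ties by taking the lowest AddressID. The select list stays as * only because the real columns are unknown.

```sql
SELECT e.*, addr.*
FROM [Entity] AS e
LEFT JOIN (
    SELECT a.*
    FROM [Address] AS a
    JOIN (
        -- Step 2: among the rows carrying the top flag value,
        -- keep a single AddressID per entity.
        SELECT m.[EntityID], MIN(m.[AddressID]) AS [AddressID]
        FROM [Address] AS m
        JOIN (
            -- Step 1: highest PrimaryAddress value per entity.
            SELECT [EntityID], MAX([PrimaryAddress]) AS TopFlag
            FROM [Address]
            WHERE Code = 'MAILING'
            GROUP BY [EntityID]
        ) AS best
          ON best.[EntityID] = m.[EntityID]
         AND best.TopFlag = m.[PrimaryAddress]
        WHERE m.Code = 'MAILING'
        GROUP BY m.[EntityID]
    ) AS dedup
      ON dedup.[AddressID] = a.[AddressID]
) AS addr
  ON addr.[EntityID] = e.[EntityID];
```

Both grouping steps run once over [Address] rather than once per entity, which is the point of replacing the correlated subquery with joins.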

Of course the real way to fix this is to fix the badly designed database. It should not allow more than one primary address (we enforce this through a trigger), then you wouldn't need the expensive remove duplicates task. I realize in your case this isn't possible, but it might make someone else think about their design flaw. Since this is a third party product, I would request that they fix it to allow only one primary address. Eventually if enough people complain, they might.
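
For readers who do control their schema, a trigger of the kind mentioned above might look roughly like this. It is a sketch under assumptions: the trigger name is made up, PrimaryAddress = 1 is taken to mean "flagged", and the rule is applied only to 'MAILING' rows. It is not something that could be added to the third-party database in the question.

```sql
-- Hypothetical trigger name; CREATE TRIGGER must be the first statement in its batch.
CREATE TRIGGER trg_Address_OnePrimaryMailing
ON [Address]
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Reject the change if any affected entity now has more than one
    -- flagged mailing address (assumes 1 means "primary").
    IF EXISTS (
        SELECT 1
        FROM [Address] AS a
        WHERE a.Code = 'MAILING'
          AND a.[PrimaryAddress] = 1
          AND a.[EntityID] IN (SELECT i.[EntityID] FROM inserted AS i)
        GROUP BY a.[EntityID]
        HAVING COUNT(*) > 1
    )
    BEGIN
        RAISERROR('Only one primary mailing address is allowed per entity.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;
```

With a rule like this in place, the "pick one of several primaries" step in the queries above would no longer be needed.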

edited Mar 30, 2017 at 8:50

roalz

answered Oct 25, 2011 at 13:25

HLGEM

3 Comments

Simon Over a year ago

Thanks for the feedback. I tested your joins, and it's taking 6 seconds on average :)

Simon Over a year ago

I only added the * to keep things simple and focus on the key fields. Even then, the table and field names used here do not reflect the real ones. If you saw what I was working with, I fear the answers would be about conventions rather than the issue. Thanks for your time and help.

Simon Over a year ago

I've marked this as the answer as it gave the fastest performance. I do like @Mikael-Eriksson's answer as well, as its syntax is so simple, but it is a few seconds slower (in my query).

Andriy M · Accepted Answer · 2011-10-25 12:40:45Z

0

If you are on SQL Server 2005 or later version, you could try the following:

WITH ranked AS (
  SELECT
    *,
    rn = ROW_NUMBER() OVER (PARTITION BY EntityID ORDER BY [PrimaryAddress] DESC)
  FROM [Address]
  WHERE Code = 'MAILING'
)
SELECT
  e.*, a.*
FROM [Entity] e
  LEFT JOIN ranked a ON a.[EntityID] = e.[EntityID] AND a.rn = 1

The result of this query would differ from yours in one small way: there would be an additional column, rn, containing 1s and/or NULLs. I wouldn't consider that a problem, though, as SELECT * lists are not recommended in production queries in the first place, and in a non-production script one extra column will hardly be in the way.

References:

Ranking Functions (Transact-SQL)
- ROW_NUMBER (Transact-SQL)

WITH common_table_expression (Transact-SQL)
- Using Common Table Expressions


2 Comments

HLGEM Over a year ago

Or you could do this in a temp table instead of a CTE; a temp table can have the missing indexes put on it.

Simon Over a year ago

Tested this type of query; it averaged 9 seconds. Thanks for sharing.
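
The temp-table idea from the comment above could be sketched like this. The names #MailingAddress and IX_MailingAddress_EntityID are made up for illustration, and the sketch assumes [Address] has no column already called rn. The temp table lives in tempdb, so the third-party tables themselves are left untouched.

```sql
-- Keep only the best-ranked mailing address per entity in a temp table.
SELECT ranked.*
INTO #MailingAddress
FROM (
    SELECT a.*,
           ROW_NUMBER() OVER (PARTITION BY a.[EntityID]
                              ORDER BY a.[PrimaryAddress] DESC) AS rn
    FROM [Address] AS a
    WHERE a.Code = 'MAILING'
) AS ranked
WHERE ranked.rn = 1;

-- The temp table can carry the index the base table is missing.
CREATE UNIQUE CLUSTERED INDEX IX_MailingAddress_EntityID
    ON #MailingAddress ([EntityID]);

SELECT e.*, m.*
FROM [Entity] AS e
LEFT JOIN #MailingAddress AS m
       ON m.[EntityID] = e.[EntityID];

DROP TABLE #MailingAddress;
```

Whether this beats the CTE depends on the data; it would need timing against the real tables.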
