SQL query optimization in SQL Server

Question

In SQL Server, query execution plan, two operations (Parallelism and Hash Match) are getting 30 % and 45 % of the total cost.

What does it mean of Parallelism and Hash Match?

For Parallelism, I have checked on this link that Number of Parallelism can effect performance of query, how do I check what is number of degree of parallelism of server?

How to reduce this cost? I have no idea how can I reduce this cost.

My query is returning 42 million rows and joining 5 tables; no where conditions, no group by, order by clauses.

I have non-clustered indexes on join columns.

My query is:

SELECT
    [INV].SKU AS [INV_SKU],
    [INV].LOCATION_ID AS [INV_LOCATION_ID],
    [INV].DATE AS [INV_BALANCE_DATE],
    [INV].COST AS INV_COST,
    [ITEM].ITEM_ID,
    [ITEM].ITEM_NAME,
    [SPITEM].ITEMNumber As SP_ITEMID,
    [SPITEM].NAME,
    [SPITEM_DEPT].[SKEY],
    [SPITEM_DEPT].[DEPT_NAME],
    [Time].[DATE] AS [CAL_DATE],
    [Time].[CAL_NAME] AS [CAL_NAME],
    [Time].[YEAR_NAME] AS [YEAR_NAME],
    [Time].[YEAR_NUM] AS [YEAR_NUM],
    [Time].[YEAR_START_DT] AS [YEAR_START_DT],
    ISNUL(convert(INT, convert(varchar, [Time].[WEEK_END_DT], 112)), 0) AS [WEEK_END_DT_SKEY],
    CASE
       WHEN [ITEM].DEPARTMENT IS NULL
          THEN (CASE 
                   WHEN [SPITEM_DEPT].SPITEM_DEPT_NAME = 'UNSPECIFIED' 
                      THEN 0 
                   ELSE [SPITEM_DEPT].SPITEM_DEPT_NAME 
                END)
       ELSE [ITEM].DEPARTMENT 
    END AS [DEPARTMENTNUM],
    CASE 
       WHEN [ITEM].[DEPARTMENT_DESCRIPTION] IS NULL
          THEN [SPITEM_DEPT].[DESCRIPTION]   
          ELSE [ITEM].[DEPARTMENT_DESCRIPTION] 
    END AS [ITEM_DEPARTMENT_DESC],
    [LOCATION].LOCATION_NAME,
    [LOCATION].COUNTRY,
    [LOCATION].CURRENCY,
    [CURRENCY].BASE_CURRENCY
FROM 
    [dbo].[Table1] [INV]
LEFT JOIN 
    dbo.Table2 AS [ITEM] ON ([INV].SKU = [ITEM].SKU )
LEFT JOIN 
    dbo.Table3 AS [LOCATION] ON ([INV].LOCATION_ID = [LOCATION].LOCATION_ID)
INNER JOIN 
    dbo.Table4 [Time] ON ([INV].DATE = [Time].DATE)
LEFT JOIN 
    dbo.Table5 [SPITEM] ON ([INV].SKU = SPITEM.NAME)
LEFT JOIN 
    [dbo].[Table6] [SPITEM_DEPT] ON ([SPITEM].[WS_KPI_ITEM_MERCHANDISE_DEPARTMENT_SKEY] = [SPITEM_DEPT].[SKEY])
LEFT JOIN 
    [dbo].[Table7] [CURRENCY] ON ([INV].DATE BETWEEN [CURRENCY].BEGIN_DT AND [CURRENCY].END_DT AND [LOCATION].CURRENCY= [CURRENCY].LOCAL_CURRENCY)

and we have non-clustered index on joining columns too.

Please suggest possible solutions, I can try.

I am newbie to SQL Server query optimization.

Returning 42 milions of rows takes a lot of time, even if the query itself is super fast. You should ask yourself what do you need these rows for... — Marko Juvančič
– Marko Juvančič, Commented Dec 29, 2016 at 13:45
Don't fall into the trap of jumping straight to MAXDOP, excessive Parallelism is usually a symptom of Indexing problems. — pacreely
– pacreely, Commented Dec 29, 2016 at 14:00
Please use COALESCE() or ISNULL() instead of CASE WHEN ??? IS NULL THEN ... ;). — shA.t
– shA.t, Commented Dec 29, 2016 at 14:10

TheGameiswar · Accepted Answer · 2016-12-29 14:42:08Z

1

The number of rows is huge and for me the main problem is that you don't have any where conditions, even if you have indexes but not predicate is a bad scenario. The number of degree parallelism is set up by your DBA, it indicates the threshold for the operator will work in a parallel or serial way.

https://technet.microsoft.com/es-es/library/ms181007%28v=sql.105%29.aspx?f=255&MSPPError=-2147217396

I recommend you review this free book of Grant Fritchey about Execution Plan: https://www.red-gate.com/library/sql-server-execution-plans-2nd-edition

However, my pieces of advises are:
1) Adding filter in the query (Where)
2) Review your indexes strategy

edited Dec 29, 2016 at 14:42

TheGameiswar

29k9 gold badges67 silver badges106 bronze badges

answered Dec 29, 2016 at 14:06

Geovanny Hernandez

2371 silver badge10 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

pacreely · Accepted Answer · 2016-12-29 17:14:44Z

Do some tests with these indexes (not all at once).

Focus on your larger tables first.

Also in your Join to table7 try and change your BETWEEN statement to sample below

[INV].DATE >= [CURRENCY].BEGIN_DT 
AND 
[INV].DATE <= [CURRENCY].END_DT



CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table1](SKU)
CREATE NONCLUSTERED INDEX idx2 ON [dbo].[Table1](LOCATION_ID)
CREATE NONCLUSTERED INDEX idx3 ON [dbo].[Table1](DATE) INCLUDE(COST)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table2](SKU) INCLUDE(ITEM_ID,ITEM_NAME,DEPARTMENT,DEPARTMENT_DESCRIPTION)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table3](LOCATION_ID,CURRENCY) INCLUDE(LOCATION_NAME,COUNTRY)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table4](DATE) INCLUDE(CAL_NAME,YEAR_NAME,YEAR_NUM,YEAR_START_DT,WEEK_END_DT)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table5](NAME,WS_KPI_ITEM_MERCHANDISE_DEPARTMENT_SKEY) INCLUDE(ITEMNumber)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table6](SKEY) INCLUDE(DEPT_NAME,SPITEM_DEPT_NAME,DESCRIPTION)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table7](LOCAL_CURRENCY,BEGIN_DT,END_DT) INCLUDE(BASE_CURRENCY)

If you've exhausted all of your possible indexing options then you can always limit the number of processors your query uses by using the MAXDOP option at the base of your query. I'm not a fan of this because your systems can change over time and you'll have to keep an eye on it accordingly.

What would be interesting to see is if the HASH MATCH Actual Figures are massively different from the Estimated Figures, this might mean your statistics are out of date and your existing indexes need Rebuilding.

Collectives™ on Stack Overflow

SQL query optimization in SQL Server

2 Answers 2

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related