0

In SQL Server, query execution plan, two operations (Parallelism and Hash Match) are getting 30 % and 45 % of the total cost.

What does it mean of Parallelism and Hash Match?

For Parallelism, I have checked on this link that Number of Parallelism can effect performance of query, how do I check what is number of degree of parallelism of server?

How to reduce this cost? I have no idea how can I reduce this cost.

enter image description here

enter image description here

My query is returning 42 million rows and joining 5 tables; no where conditions, no group by, order by clauses.

I have non-clustered indexes on join columns.

My query is:

SELECT
    [INV].SKU AS [INV_SKU],
    [INV].LOCATION_ID AS [INV_LOCATION_ID],
    [INV].DATE AS [INV_BALANCE_DATE],
    [INV].COST AS INV_COST,
    [ITEM].ITEM_ID,
    [ITEM].ITEM_NAME,
    [SPITEM].ITEMNumber As SP_ITEMID,
    [SPITEM].NAME,
    [SPITEM_DEPT].[SKEY],
    [SPITEM_DEPT].[DEPT_NAME],
    [Time].[DATE] AS [CAL_DATE],
    [Time].[CAL_NAME] AS [CAL_NAME],
    [Time].[YEAR_NAME] AS [YEAR_NAME],
    [Time].[YEAR_NUM] AS [YEAR_NUM],
    [Time].[YEAR_START_DT] AS [YEAR_START_DT],
    ISNUL(convert(INT, convert(varchar, [Time].[WEEK_END_DT], 112)), 0) AS [WEEK_END_DT_SKEY],
    CASE
       WHEN [ITEM].DEPARTMENT IS NULL
          THEN (CASE 
                   WHEN [SPITEM_DEPT].SPITEM_DEPT_NAME = 'UNSPECIFIED' 
                      THEN 0 
                   ELSE [SPITEM_DEPT].SPITEM_DEPT_NAME 
                END)
       ELSE [ITEM].DEPARTMENT 
    END AS [DEPARTMENTNUM],
    CASE 
       WHEN [ITEM].[DEPARTMENT_DESCRIPTION] IS NULL
          THEN [SPITEM_DEPT].[DESCRIPTION]   
          ELSE [ITEM].[DEPARTMENT_DESCRIPTION] 
    END AS [ITEM_DEPARTMENT_DESC],
    [LOCATION].LOCATION_NAME,
    [LOCATION].COUNTRY,
    [LOCATION].CURRENCY,
    [CURRENCY].BASE_CURRENCY
FROM 
    [dbo].[Table1] [INV]
LEFT JOIN 
    dbo.Table2 AS [ITEM] ON ([INV].SKU = [ITEM].SKU )
LEFT JOIN 
    dbo.Table3 AS [LOCATION] ON ([INV].LOCATION_ID = [LOCATION].LOCATION_ID)
INNER JOIN 
    dbo.Table4 [Time] ON ([INV].DATE = [Time].DATE)
LEFT JOIN 
    dbo.Table5 [SPITEM] ON ([INV].SKU = SPITEM.NAME)
LEFT JOIN 
    [dbo].[Table6] [SPITEM_DEPT] ON ([SPITEM].[WS_KPI_ITEM_MERCHANDISE_DEPARTMENT_SKEY] = [SPITEM_DEPT].[SKEY])
LEFT JOIN 
    [dbo].[Table7] [CURRENCY] ON ([INV].DATE BETWEEN [CURRENCY].BEGIN_DT AND [CURRENCY].END_DT AND [LOCATION].CURRENCY= [CURRENCY].LOCAL_CURRENCY)

and we have non-clustered index on joining columns too.

Please suggest possible solutions, I can try.

I am newbie to SQL Server query optimization.

8
  • 4
    Returning 42 milions of rows takes a lot of time, even if the query itself is super fast. You should ask yourself what do you need these rows for... Commented Dec 29, 2016 at 13:45
  • 1
    Can you provide the execution plan of your query? Commented Dec 29, 2016 at 13:49
  • 3
    Share your execution plans using brentozar.com/pastetheplan Commented Dec 29, 2016 at 13:59
  • 1
    Don't fall into the trap of jumping straight to MAXDOP, excessive Parallelism is usually a symptom of Indexing problems. Commented Dec 29, 2016 at 14:00
  • 1
    Please use COALESCE() or ISNULL() instead of CASE WHEN ??? IS NULL THEN ... ;). Commented Dec 29, 2016 at 14:10

2 Answers 2

1

The number of rows is huge and for me the main problem is that you don't have any where conditions, even if you have indexes but not predicate is a bad scenario. The number of degree parallelism is set up by your DBA, it indicates the threshold for the operator will work in a parallel or serial way.

https://technet.microsoft.com/es-es/library/ms181007%28v=sql.105%29.aspx?f=255&MSPPError=-2147217396

I recommend you review this free book of Grant Fritchey about Execution Plan: https://www.red-gate.com/library/sql-server-execution-plans-2nd-edition

However, my pieces of advises are:
1) Adding filter in the query (Where)
2) Review your indexes strategy

Sign up to request clarification or add additional context in comments.

Comments

0

Do some tests with these indexes (not all at once).

Focus on your larger tables first.

Also in your Join to table7 try and change your BETWEEN statement to sample below

[INV].DATE >= [CURRENCY].BEGIN_DT 
AND 
[INV].DATE <= [CURRENCY].END_DT



CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table1](SKU)
CREATE NONCLUSTERED INDEX idx2 ON [dbo].[Table1](LOCATION_ID)
CREATE NONCLUSTERED INDEX idx3 ON [dbo].[Table1](DATE) INCLUDE(COST)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table2](SKU) INCLUDE(ITEM_ID,ITEM_NAME,DEPARTMENT,DEPARTMENT_DESCRIPTION)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table3](LOCATION_ID,CURRENCY) INCLUDE(LOCATION_NAME,COUNTRY)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table4](DATE) INCLUDE(CAL_NAME,YEAR_NAME,YEAR_NUM,YEAR_START_DT,WEEK_END_DT)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table5](NAME,WS_KPI_ITEM_MERCHANDISE_DEPARTMENT_SKEY) INCLUDE(ITEMNumber)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table6](SKEY) INCLUDE(DEPT_NAME,SPITEM_DEPT_NAME,DESCRIPTION)

CREATE NONCLUSTERED INDEX idx1 ON [dbo].[Table7](LOCAL_CURRENCY,BEGIN_DT,END_DT) INCLUDE(BASE_CURRENCY)

If you've exhausted all of your possible indexing options then you can always limit the number of processors your query uses by using the MAXDOP option at the base of your query. I'm not a fan of this because your systems can change over time and you'll have to keep an eye on it accordingly.

What would be interesting to see is if the HASH MATCH Actual Figures are massively different from the Estimated Figures, this might mean your statistics are out of date and your existing indexes need Rebuilding.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.