vw_project is a view that involves 20 CTEs, joins them multiple times, and returns 56 columns.

Many of these CTEs are self-joins (the classic "last row per group"; in our case we get the last related object, i.e. product / customer / manager, per Project).
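
For readers unfamiliar with the pattern, here is a minimal sketch of "last row per group" written with ROW_NUMBER() rather than a self-join. The table dbo.ProjectManager and its columns (ProjectId, ManagerId, AssignedOn) are hypothetical stand-ins, not the actual schema behind the view.

```sql
-- Hypothetical table: dbo.ProjectManager(ProjectId, ManagerId, AssignedOn).
WITH RankedManager AS (
    SELECT  pm.ProjectId,
            pm.ManagerId,
            pm.AssignedOn,
            -- Number the rows of each project, newest assignment first.
            ROW_NUMBER() OVER (PARTITION BY pm.ProjectId
                               ORDER BY pm.AssignedOn DESC) AS rn
    FROM    dbo.ProjectManager AS pm
)
SELECT  ProjectId, ManagerId, AssignedOn
FROM    RankedManager
WHERE   rn = 1;   -- keep only the most recent manager per project
```

Whether this form is faster than the self-join depends on the data and the available indexes, so it is worth comparing both plans.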

Most of the tables involved (maybe 40?) don't exceed 1000 rows; the view itself returns 634 rows.

We are trying to improve the very bad performance of this view. We denormalized (went from TPT, table per type, to TPH, table per hierarchy) and reduced the number of joins by half, with almost no impact.

But I don't understand the following results:

select * from vw_Project (TPT)
2 sec

select * from vw_Project (TPH)
2 sec

select Id from vw_Project (TPH; the same query is instant on TPT)
6 sec

select 1 from vw_Project (TPH; the same query is instant on TPT)
6 sec

select count(1) from vw_Project (TPH; the same query is instant on TPT)
6 sec

Execution plan for the last one (6 sec): https://www.brentozar.com/pastetheplan/?id=r1DqRciBW

Execution plan after sp_updatestats: https://www.brentozar.com/pastetheplan/?id=H1Cuwsor-

To me, that seems absurd. I don't understand what's happening, and it's hard to know whether my optimization strategies are relevant, since I have no idea what explains the apparently irrational behavior I'm observing.

Any clue?

  • Do you have an execution plan you can share with us? Commented Jul 18, 2017 at 14:39
  • You could try and paste the plan of one of those selects (for ex. the 6 sec ones) here for everyone to take a look. Commented Jul 18, 2017 at 14:40
  • What is TPT and TPH? Commented Jul 18, 2017 at 14:45
  • Could you try the same query with option(recompile)? What is the result? The estimates generally seem very very skewed. Like dbo.Circuit_Journal_Etape estimate is 1.6 rows and actual rows are 252K (?!) and of course there's a seek when maybe a scan would be better. Commented Jul 18, 2017 at 14:55
  • @Proviste Are you on a test environment? If so, you could maybe try to update statistics manually and see if that solves your problem. Commented Jul 18, 2017 at 15:07
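
The two suggestions in the comments above (recompiling the query and refreshing statistics) could be tried roughly as follows. This is only a sketch: the table name is the one mentioned in the comment about skewed estimates, and FULLSCAN is an assumed choice of sampling, not something the commenters specified.

```sql
-- Ask for a fresh plan compiled for this execution only.
SELECT COUNT(1)
FROM   vw_Project
OPTION (RECOMPILE);

-- Refresh statistics on the table whose estimate was far off (1.6 rows estimated vs. 252K actual).
-- FULLSCAN is an assumption here; it reads every row instead of sampling.
UPDATE STATISTICS dbo.Circuit_Journal_Etape WITH FULLSCAN;

-- Or refresh statistics across the whole database.
EXEC sp_updatestats;
```

Comparing estimated and actual row counts in the plan before and after shows whether stale statistics were the cause.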

1 Answer

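CTEs give no guarantee about the order in which their statements are evaluated, and 20 CTEs are far too many in my opinion. You can use OPTION (FORCE ORDER), which makes the optimizer keep the join order as written in the query, to force execution from top to bottom.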
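
As a minimal sketch of where the hint goes, assuming the view is queried as in the question: an OPTION clause is not allowed inside a view definition, so the hint is attached to the statement that selects from the view.

```sql
-- The hint is placed on the outer statement, not inside the view.
-- FORCE ORDER asks the optimizer to preserve the join order given by the query syntax.
SELECT COUNT(1)
FROM   vw_Project
OPTION (FORCE ORDER);
```

Because the hint fixes the join order as written, the result depends entirely on how the joins are ordered inside the view.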

For selecting a few thousand rows, however, anything more than 1 sec is not acceptable, regardless of complexity. I would go with a table function, so that I have the luxury of creating hash tables or table variables inside it and full control over each step. This way you limit the optimizer's scope to each step alone.
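
A rough sketch of that approach as a multi-statement table-valued function with one intermediate table variable. All object names (dbo.tvf_Project, dbo.Project, dbo.ProjectManager and their columns) are hypothetical and stand in for the real schema; the real function would need one such step per CTE being replaced.

```sql
-- Hypothetical names throughout: dbo.tvf_Project, dbo.Project, dbo.ProjectManager.
CREATE FUNCTION dbo.tvf_Project ()
RETURNS @result TABLE (
    ProjectId int PRIMARY KEY,
    ManagerId int NULL
)
AS
BEGIN
    -- Step 1: materialize the last manager per project in a table variable.
    DECLARE @lastManager TABLE (
        ProjectId int PRIMARY KEY,
        ManagerId int NOT NULL
    );

    INSERT INTO @lastManager (ProjectId, ManagerId)
    SELECT ranked.ProjectId, ranked.ManagerId
    FROM (
        SELECT pm.ProjectId,
               pm.ManagerId,
               ROW_NUMBER() OVER (PARTITION BY pm.ProjectId
                                  ORDER BY pm.AssignedOn DESC) AS rn
        FROM   dbo.ProjectManager AS pm
    ) AS ranked
    WHERE ranked.rn = 1;

    -- Step 2: join the small intermediate result to the main table.
    INSERT INTO @result (ProjectId, ManagerId)
    SELECT p.Id, lm.ManagerId
    FROM   dbo.Project AS p
    LEFT JOIN @lastManager AS lm
           ON lm.ProjectId = p.Id;

    RETURN;
END;
```

It would be queried with `SELECT * FROM dbo.tvf_Project();`. Note that #temp tables cannot be created inside a function, so if those are wanted, a stored procedure is the likely alternative.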

2 Comments

OPTION (FORCE ORDER) literally inverted the results (6 s became 2 s and 2 s became 6 s).
Interesting. FORCE ORDER won't work alone: there needs to be a logical order of CTEs, like WITH A, B, C... where B uses A, C uses B, and so on. If the top CTEs filter the data well, the lower CTEs will do better with less data. So try changing the order: put the CTEs that filter well as close to the top as possible.
