I am trying to benchmark SQL statements on SQL Server.

I found a good benchmark loop online: https://github.com/jOOQ/jOOQ/blob/master/jOOQ-examples/Benchmarks/SQLServer/Benchmarking%20SQL%20Server%20(absolute).sql

DECLARE @ts DATETIME;
DECLARE @repeat INT = 10000;
DECLARE @r INT;
DECLARE @i INT;
DECLARE @dummy VARCHAR;

DECLARE @s1 CURSOR;
DECLARE @s2 CURSOR;

SET @r = 0;
WHILE @r < 5
BEGIN
  SET @r = @r + 1

  SET @s1 = CURSOR FOR
  -- Paste statement 1 here
  SELECT 1 x;

  SET @s2 = CURSOR FOR
  -- Paste statement 2 here
  WITH t(v) AS (
    SELECT 1
    UNION ALL
    SELECT v + 1 FROM t WHERE v < 10
  )
  SELECT * FROM t

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s1;
    FETCH NEXT FROM @s1 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s1 INTO @dummy;
    END;

    CLOSE @s1;
  END;

  DEALLOCATE @s1;
  PRINT 'Run ' + CAST(@r AS VARCHAR) + ', Statement 1: ' + CAST(DATEDIFF(ms, @ts, current_timestamp) AS VARCHAR) + 'ms';

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s2;
    FETCH NEXT FROM @s2 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s2 INTO @dummy;
    END;

    CLOSE @s2;
  END;

  DEALLOCATE @s2;
  PRINT 'Run ' + CAST(@r AS VARCHAR) + ', Statement 2: ' + CAST(DATEDIFF(ms, @ts, current_timestamp) AS VARCHAR) + 'ms';
END;

PRINT '';
PRINT 'Copyright Data Geekery GmbH';
PRINT 'https://www.jooq.org/benchmark';

This works great when the statements I test return only one column. For example:

Select ID from Items Where ID=2;

But as soon as I try to select multiple columns, like

Select * from Items Where ID=2;

I get the error:

Msg 16924, Level 16, State 1, Line 135 Cursorfetch: The number of variables declared in the INTO list must match that of selected columns.

So the line this concerns is

FETCH NEXT FROM @s1 INTO @dummy;

As far as I understand, the issue is that I am trying to fit too many columns into the single @dummy variable. But how do I fix it? I have not been working with SQL for long, so any help would be appreciated.

  • The error message is pretty straightforward: you're pulling more columns than you have variables for. If you need to pull more columns, then you need a variable for each. Commented Jan 8, 2020 at 13:47
  • So for 3 columns I need dummy1, dummy2 and dummy3? Or is there a way to declare just one array with 3 items? I want to test a query with 63 columns, and that's kind of tedious. Commented Jan 8, 2020 at 13:49
  • You know that was meant as an example, right? Did you bother to google any of the things I said? Like "Client Statistics" or "Execution Plan"? You're literally in front of a computer :P Query optimization is a very broad topic and sometimes requires a broader perspective. It ranges from index analysis, SQL syntax, query structure, waits, blocks/locks, resource considerations and execution plan analysis to what I might consider common sense. Focusing purely on timing might be short-sighted at best, and it ignores all the fun parts. Commented Jan 8, 2020 at 14:34
  • Your approach to testing performance is critically flawed. You have introduced nested cursors. Commented Jan 8, 2020 at 14:48
  • Well geez @SeanLange I was trying to be nice about it LOL Commented Jan 8, 2020 at 15:04
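
For reference, the one-variable-per-column fix described in the first comment could look roughly like the sketch below. It does not use the asker's actual schema: the temp table #Items and its columns ID, Name and Price are assumptions made up for illustration, and each variable's type has to be compatible with the column it receives.

```sql
-- Hypothetical stand-in for the Items table; column names and types are assumptions.
DROP TABLE IF EXISTS #Items;
CREATE TABLE #Items (ID INT PRIMARY KEY, Name VARCHAR(50), Price DECIMAL(10, 2));
INSERT INTO #Items (ID, Name, Price) VALUES (1, 'Bolt', 0.25), (2, 'Nut', 0.10);

-- One variable per selected column, in the same order as the SELECT list.
DECLARE @id INT;
DECLARE @name VARCHAR(50);
DECLARE @price DECIMAL(10, 2);

DECLARE @s1 CURSOR;
SET @s1 = CURSOR FOR
  SELECT ID, Name, Price FROM #Items WHERE ID = 2;

OPEN @s1;
-- The INTO list now has exactly as many variables as the cursor has columns.
FETCH NEXT FROM @s1 INTO @id, @name, @price;
WHILE @@FETCH_STATUS = 0
BEGIN
  FETCH NEXT FROM @s1 INTO @id, @name, @price;
END;

CLOSE @s1;
DEALLOCATE @s1;
```

T-SQL has no array type that FETCH ... INTO can target, so a 63-column query needs 63 variables in the INTO list. Listing the columns explicitly instead of using SELECT * also keeps the INTO list from breaking when the table changes.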

1 Answer

That is not a useful or simple way to test queries.

It’s a lot of code, so it’s not particularly easy to use, and it processes the results with a cursor, so the measurement includes the server-side cost of cursor processing, which is not normally present.

Normally you just run the query in SSMS and look at the Actual Execution Plan and perhaps the time and IO statistics, and perhaps the client statistics. The query results should be returned to the client, because that's what happens in production, and you should consider the time needed to transmit results over the network when benchmarking.
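
A minimal sketch of that workflow, assuming the Items table from the question exists in the current database and that the script is run from SSMS or sqlcmd (go is a batch separator of those tools, not a T-SQL statement):

```sql
-- Report reads and CPU/elapsed time as messages alongside the results.
set statistics io on
set statistics time on
go
-- The rows are returned to the client, as they would be in production.
select * from Items where ID = 2
go
set statistics io off
set statistics time off
go
```

The Actual Execution Plan and the client statistics are switched on separately from the SSMS Query menu before running the batch.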

If you need to run a query without returning data to the client, you can use a pattern like

go
set statistics io on
set statistics time on
go
drop table if exists #foo;

with q as
(
  select ...
  from ...
)
select *
into #foo
from q
go 5
set statistics io off
set statistics time off
go

6 Comments

The point of such benchmarks is to run a query 100s or 1000s of times to get averages, standard deviations, etc. in order to reduce side effects of an otherwise busy server, or startup penalties, caching, etc.
Yes @LukasEder, this is exactly the goal I am trying to reach, and it works well. The only problem is when I want to select multiple columns as output.
Trying to learn something here. How is this approach significantly better than the OP's? I mean, the OP could add the set statistics calls to the script just the same... This approach here seems to include the overhead of managing the temporary table, and the associated I/O... Your answer isn't very specific with respect to "That is not a useful or simple way to test queries."
See update above. And the temp table is only there if you don’t want to return the results to the client. Normally you would return the results to the client because that’s what a real query does.
@DavidBrowne-Microsoft: I think you're missing the point of the benchmark. If transmission time were relevant, the same benchmark could of course be written from a client, e.g. in C# or Java, or whatever. But it is also interesting to benchmark execution time from within the database only. I mean, the "client" could just as well be implemented in T-SQL, in which case this approach would be perfectly valid. I'd love to learn what the big flaw really is here.
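
The averages and standard deviations mentioned in the first comment could be collected inside the database along the lines of the sketch below. It is only an illustration under stated assumptions: the @timings table variable, the @sink variable and the sample size of 1000 are made up here, the statement under test is statement 2 from the question, and assigning the column to a variable is swapped in for the cursor as a way to consume the rows without sending them to the client.

```sql
SET NOCOUNT ON;  -- suppress the "rows affected" messages for each iteration

-- One row per run; all names here are assumptions for illustration.
DECLARE @timings TABLE (run INT PRIMARY KEY, elapsed_us INT);
DECLARE @repeat INT = 1000;   -- assumed sample size
DECLARE @r INT = 0;
DECLARE @ts DATETIME2;
DECLARE @sink INT;            -- receives the column so no rows go to the client

WHILE @r < @repeat
BEGIN
  SET @r = @r + 1;
  SET @ts = SYSDATETIME();

  -- Statement under test (statement 2 from the question).
  WITH t(v) AS (
    SELECT 1
    UNION ALL
    SELECT v + 1 FROM t WHERE v < 10
  )
  SELECT @sink = v FROM t;

  -- Record the elapsed time of this run in microseconds.
  INSERT INTO @timings (run, elapsed_us)
  VALUES (@r, DATEDIFF(MICROSECOND, @ts, SYSDATETIME()));
END;

-- Summarize the individual runs.
SELECT
  COUNT(*)                       AS runs,
  AVG(CAST(elapsed_us AS FLOAT)) AS avg_us,
  STDEV(elapsed_us)              AS stdev_us,
  MIN(elapsed_us)                AS min_us,
  MAX(elapsed_us)                AS max_us
FROM @timings;
```

The effective resolution of SYSDATETIME() depends on the host, so very short statements may need several executions per timed run. A query with several columns would still need one variable per column in the assignment.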