Is tuple comparison in MySQL efficient?

Question

I have a table of books:

CREATE TABLE `books` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `nameOfBook` VARCHAR(32),
    `releaseDate` DATETIME NULL DEFAULT NULL,
    PRIMARY KEY (`id`),
    INDEX `Index 2` (`releaseDate`, `id`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=33029692;

I compared two SQL queries that paginate with a sort on releaseDate. Both queries return the same result.

(simple one)

SELECT SQL_NO_CACHE id, name, releaseDate
FROM books
WHERE releaseDate <= '2016-11-07'
  AND (releaseDate < '2016-11-07' OR id < 3338191)
ORDER BY releaseDate DESC, id DESC
LIMIT 50;

and

(tuple comparison, or row comparison)

SELECT SQL_NO_CACHE id, name, releaseDate
FROM books
WHERE (releaseDate, id) < ('2016-11-07', 3338191)
ORDER BY releaseDate DESC, id DESC
LIMIT 50;

When I run EXPLAIN on the two queries, I get this:

simple one :

"id";"select_type";"table";"type";"possible_keys";"key";"key_len";"ref";"rows";"Extra"
"1";"SIMPLE";"books";"range";"PRIMARY,Index 2";"Index 2";"9";"";"1015876";"Using where; Using index"

We can see it reports 1015876 rows.

The explain for the tuple comparison :

"id";"select_type";"table";"type";"possible_keys";"key";"key_len";"ref";"rows";"Extra"
"1";"SIMPLE";"books";"index";"";"Index 2";"13";"";"50";"Using where; Using index"

We can see it reports 50 rows.

But when I check the execution time, the simple one gives:

/* Affected rows: 0  Found rows: 50  Warnings: 0  Duration for 1 query: 0.031 sec. */

and the tuple one :

/* Affected rows: 0  Found rows: 50  Warnings: 0  Duration for 1 query: 3.682 sec. */

I don't understand why, according to EXPLAIN, the tuple comparison looks better, yet its execution time is so much worse.

The execution plan is the "path" the optimizer will choose; the number of rows parsed doesn't necessarily affect the outcome.
– sagi, Commented Nov 8, 2016 at 14:33
According to my test results, the tuple comparison performs worse than the simple one. Is it bad practice to use tuple comparison, then?
– dop, Commented Nov 8, 2016 at 15:57
I'm looking for an authoritative description of how tuple comparison uses matching compound indices, as well as its practical performance.
– Dima Tisnek, Commented Apr 16, 2017 at 11:50
@qarma For what it's worth, an index on (releaseDate, id) is effectively the same as an index on just releaseDate in this case. See the docs: "In InnoDB, each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index."
– Arth, Commented Apr 21, 2017 at 11:22
@Hannes - Alas, this is one of those half-implemented features in MySQL: the syntax and functionality are there, but the performance is not.
– Rick James, Commented Apr 23, 2017 at 18:45

Rick James · Accepted Answer · last edited 2021-10-07 19:29:28Z

I've been irritated by this for years. WHERE (a,b) > (1,2) has never been optimized, in spite of it being easily transformed into the other formulation. Even the other format was poorly optimized until a few years ago.

Using EXPLAIN FORMAT=JSON SELECT ... might give you some better clues.
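
As a rough illustration of that suggestion, the tuple query from the question could be inspected like this. It is only a sketch: the table and index names come from the question, the column list is trimmed to the indexed columns as an assumption, and the shape of the JSON output depends on the server version.

```sql
-- Sketch: JSON-format plan for the row-constructor query from the question.
-- Assumes the `books` table and its (releaseDate, id) index as defined there.
EXPLAIN FORMAT=JSON
SELECT id, releaseDate
FROM books
WHERE (releaseDate, id) < ('2016-11-07', 3338191)
ORDER BY releaseDate DESC, id DESC
LIMIT 50;
```

The JSON output carries more detail than the tabular form, so it is worth comparing against the same statement written without the row constructor.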

Meanwhile, EXPLAIN ignored the LIMIT and suggested 1015876. In many cases, EXPLAIN provides a "decent" row estimate, but not in either of these.

Feel free to file a bug report: http://bugs.mysql.com (and post the link here).

Another formulation was recently optimized, in spite of OR being historically un-optimizable.

where  releaseDate <  '2016-11-07'  
   OR (releaseDate  = '2016-11-07' AND id < 3338191)
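
Put into a complete statement, that formulation might look like the following. This is a minimal sketch against the `books` table from the question; the column list is limited to the indexed columns as an assumption, and whether the optimizer handles it well should be confirmed on your own version.

```sql
-- Sketch: keyset pagination with the tuple comparison expanded by hand.
-- Logically equivalent to (releaseDate, id) < ('2016-11-07', 3338191).
SELECT id, releaseDate
FROM books
WHERE releaseDate < '2016-11-07'
   OR (releaseDate = '2016-11-07' AND id < 3338191)
ORDER BY releaseDate DESC, id DESC
LIMIT 50;
```

For the next page, the two literals would be replaced by the releaseDate and id of the last row returned.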

For measuring query optimizations, I like to do:

FLUSH STATUS;
SELECT ...;
SHOW SESSION STATUS LIKE 'Handler%';

Small values, such as 50 in your case, indicate good optimization; large values (around 1M) indicate a scan. The Handler numbers are exact, unlike the estimates in EXPLAIN.
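
Applied to the two queries from the question, the measurement could be sketched as below. The queries and names are those of the question, with the column list trimmed to the indexed columns as an assumption. FLUSH STATUS needs the appropriate privilege, and which Handler counter moves depends on the plan (a descending index walk would normally show up in Handler_read_prev).

```sql
-- Sketch: compare exact handler counts for the two formulations.

-- 1) Row-constructor (tuple) form
FLUSH STATUS;
SELECT SQL_NO_CACHE id, releaseDate
FROM books
WHERE (releaseDate, id) < ('2016-11-07', 3338191)
ORDER BY releaseDate DESC, id DESC
LIMIT 50;
SHOW SESSION STATUS LIKE 'Handler_read%';

-- 2) Expanded form from the question
FLUSH STATUS;
SELECT SQL_NO_CACHE id, releaseDate
FROM books
WHERE releaseDate <= '2016-11-07'
  AND (releaseDate < '2016-11-07' OR id < 3338191)
ORDER BY releaseDate DESC, id DESC
LIMIT 50;
SHOW SESSION STATUS LIKE 'Handler_read%';
```

SQL_NO_CACHE is kept from the question; on servers without a query cache it has no effect and can be dropped.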

Update: MySQL 5.7.3 has improved handling of tuples, aka "row constructors".

Update: MySQL Bug#104128 covers this.

answered Apr 17, 2017 at 19:12 by Rick James; edited Oct 7, 2021 at 19:29 by jfg956

10 Comments

Paul Spiegel Over a year ago

This is the same as they recommend here. But the example there is a bit different. Reading it, one might assume that the query in this question should work fine, since the condition here does cover the index.

Rick James Over a year ago

I read that page as "here's some nifty syntax", not "this will perform well".

Dima Tisnek Over a year ago

Methinks there are two ways to rewrite (a,b) > (1,2): namely a > 1 OR (a = 1 AND b > 2), and also a >= 1 AND (a > 1 OR b > 2), where at least the index on a can always be utilised (under some assumptions about data density in a and b).

Rick James Over a year ago

@qarma - I previously did some experiments with 'Handler%'; it turned out that INDEX(a,b) worked well for either of the rewrites. Please try it on your version to confirm. (And state which MySQL/MariaDB version you are running.)

Dima Tisnek Over a year ago

I'm only interested in the newest open source MySQL version, which is 5.7.x.
