I have a huge table with a primary key and a btree_gist index. When I query for the columns in the btree_gist index, I would expect that the index is used and the query performs quite fast. However, the optimiser always performs an index scan on the primary key and the filters.
Example:
create table test1 (
id1 bigint not null,
id2 bigint not null,
validtime tstzrange not null,
data float);
alter table test1 add constraint pk_test1 primary key (id1, id2, validtime);
alter table test1 add constraint ex_test1_validtime exclude using gist (id1 with =, id2 with =, validtime with &&);
The table contains about 1.2 billion rows, the query I wonder about returns only some hundred rows, but takes a long time:
select * from test1 where id1=1 and id2=1 and validtime && '[2020-01-01,2020-02-01)';
(about 3s)
Query plan:
explain select * from test1 where id1=1 and id2=1 and validtime && '[2020-01-01,2020-02-01)';
QUERY PLAN
-------------------------------------------------------------------------------------------
Index Scan using pk_test1 on test1 (cost=0.70..24.68 rows=1 width=46)
Index Cond: ((id1 = 1) AND (id2 = 1))
Filter: (validtime && '["2020-01-01 00:00:00+00","2020-02-01 00:00:00+00")'::tstzrange)
The reason for the bad performance is obviously that ten thousands of rows are read and filtered in the time criteria.
I wonder why postgres does not use the btree_gist.
I have another, slightly different table where the btree_gist is used but in a very different way than I would expect. This table has about 160 million rows.
create table test2 (
id1 bigint not null,
validtime tstzrange not null);
alter table test2 add constraint pk_test2 primary key (id1, validtime);
alter table test2 add constraint ex_test2_validtime exclude using gist (id1 with =, validtime with &&);
Here, the execution plan looks like this:
select * from test2 where id1=1 and validtime && '[2020-01-01,2020-02-01)';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test2 (cost=1933.19..1937.20 rows=1 width=62)
Recheck Cond: ((id1 = 1) AND (validtime && '["2020-01-01 00:00:00+00","2020-02-01 00:00:00+00")'::tstzrange))
-> BitmapAnd (cost=1933.19..1933.19 rows=1 width=0)
-> Bitmap Index Scan on pk_test2 (cost=0.00..574.20 rows=11417 width=0)
Index Cond: (id1 = 1)
-> Bitmap Index Scan on ex_test2_validtime (cost=0.00..1358.74 rows=17019 width=0)
Index Cond: (validtime && '["2020-01-01 00:00:00+00","2020-02-01 00:00:00+00")'::tstzrange)
Why the two bitmap index scans, couldn't it all be done with one index scan using the btree_gist index?
EXPLAIN (ANALYZE, BUFFERS)output. About your first query: is the desired index used if you drop the other one?