PostgreSQL: Seq Scan instead of Index Scan

Question

I have the following table:

create table if not exists inventory
(
    expired_at timestamp(0),
    -- ...
);

create index if not exists inventory_expired_at_index
    on inventory (expired_at);

However, when I run the following query:

EXPLAIN UPDATE "inventory" SET "status" = 'expired' WHERE "expired_at" < '2020-12-08 12:05:00';

I get the following execution plan:

Update on inventory  (cost=0.00..4.09 rows=2 width=126)
  ->  Seq Scan on inventory  (cost=0.00..4.09 rows=2 width=126)
        Filter: (expired_at < '2020-12-08 12:05:00'::timestamp without time zone)

The same happens with a big dataset:

EXPLAIN SELECT * FROM "inventory"  WHERE "expired_at" < '2020-12-08 12:05:00';
-[ RECORD 1 ]---------------------------------------------------------------------------
QUERY PLAN | Seq Scan on inventory  (cost=0.00..58616.63 rows=1281058 width=71)
-[ RECORD 2 ]---------------------------------------------------------------------------
QUERY PLAN |   Filter: (expired_at < '2020-12-08 12:05:00'::timestamp without time zone)

The question is: why does the planner choose a Seq Scan instead of an Index Scan?

The important information is still missing: the number of rows that were not updated. We need EXPLAIN (ANALYZE, BUFFERS) output. But beware: ANALYZE actually executes the statement, so the rows are modified. Run it in a transaction that you roll back.
– Laurenz Albe, Commented Dec 8, 2020 at 12:43
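
A minimal sketch of what the comment asks for, using the UPDATE from the question. The statement is really executed by EXPLAIN (ANALYZE, BUFFERS), so it is wrapped in a transaction that is rolled back afterwards:

```sql
BEGIN;

-- ANALYZE runs the UPDATE for real and reports actual row counts and timings;
-- BUFFERS adds shared-buffer hit/read counts to each plan node.
EXPLAIN (ANALYZE, BUFFERS)
UPDATE "inventory" SET "status" = 'expired' WHERE "expired_at" < '2020-12-08 12:05:00';

-- Undo the modification made by the UPDATE above.
ROLLBACK;
```

Comparing the estimated `rows=` figure with the actual row count in this output shows whether the planner's estimate is close to reality.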

Gordon Linoff · Accepted Answer · 2020-12-08 14:03:39Z


This is a bit long for a comment.

The short answer is that you have two rows in the table, so it doesn't make a difference.

The longer answer is that you are using an update, so the data rows have to be retrieved anyway. Using an index requires loading both the index and the data rows, and then following the pointers from the index to the data rows. That is a little more complicated, and with two rows it is not worth the effort at all.

The power of indexes is to handle large amounts of data, not small amounts of data.

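To respond to the larger example: database optimizers are not required to use an index. They use some measure (often cost-based optimization) to determine whether or not an index is appropriate. In your larger example, the optimizer has determined that the index is not appropriate. The plan estimates that about 1.28 million rows match the filter; if that is a large share of the table, reading the whole table sequentially is cheaper than visiting that many rows through the index. This could also happen if the statistics are out of sync with the underlying data.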
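
The two possibilities for the larger example can be checked roughly as follows. This is a diagnostic sketch against the table and query from the question, not a tuning recommendation; `enable_seqscan` is a standard PostgreSQL planner setting, changed here only inside a transaction:

```sql
-- How selective is the filter? If most rows match, a Seq Scan is the expected choice.
SELECT count(*) FILTER (WHERE expired_at < '2020-12-08 12:05:00') AS matching_rows,
       count(*) AS total_rows
FROM inventory;

-- Refresh the planner statistics in case they are stale.
ANALYZE inventory;

-- Discourage sequential scans for this transaction only, to see the
-- estimated cost of the index-based plan the planner rejected.
BEGIN;
SET LOCAL enable_seqscan = off;
EXPLAIN SELECT * FROM "inventory" WHERE "expired_at" < '2020-12-08 12:05:00';
ROLLBACK;
```

If the index plan shown with `enable_seqscan = off` has a higher estimated cost than the original Seq Scan, the planner's choice is consistent with its cost model.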

edited Dec 8, 2020 at 14:03

answered Dec 8, 2020 at 12:24


1 Comment

Rudziankoŭ Over a year ago

added example with big dataset
