Why does PostgreSQL perform sequential scan on indexed column?

Question

Very simple example - one table, one index, one query:

CREATE TABLE book
(
  id bigserial NOT NULL,
  "year" integer,
  -- other columns...
);

CREATE INDEX book_year_idx ON book (year)

EXPLAIN
 SELECT *
   FROM book b
  WHERE b.year > 2009

gives me:

Seq Scan on book b  (cost=0.00..25663.80 rows=105425 width=622)
  Filter: (year > 2009)

Why it does NOT perform index scan instead? What am I missing?

user330315 · Accepted Answer · 2011-03-05 12:33:13Z

408

If the SELECT returns more than approximately 5-10% of all rows in the table, a sequential scan is much faster than an index scan.

This is because an index scan requires several IO operations for each row (look up the row in the index, then retrieve the row from the heap). Whereas a sequential scan only requires a single IO for each row - or even less because a block (page) on the disk contains more than one row, so more than one row can be fetched with a single IO operation.

Btw: this is true for other DBMS as well - some optimizations as "index only scans" taken aside (but for a SELECT * it's highly unlikely such a DBMS would go for an "index only scan")

answered Mar 5, 2011 at 12:33

user330315

Sign up to request clarification or add additional context in comments.

9 Comments

Alex Vayda Over a year ago

Interesting, that explains many things for me :) Indeed, when I select by year > 2010 it does index scan. Thank you!

araqnid Over a year ago

Also, a sequential scan can request several pages from the heap at a time, and ask the kernel to be fetching the next chunk while it works on the current one- an index scan fetches one page at once. (A bitmap scan does a compromise between the two, you usually see that appearing in a plan for queries that aren't selective enough for an index scan, but still not so unselective as to merit a full table scan)

Laurent Grégoire Over a year ago

The interesting question is how the database knows how many rows the query will return without doing it first? Does it store stats such as the number of different values vs table size somewhere?

user330315 Over a year ago

@LaurentGrégoire: yes, the database stores statistics about the number of rows and the distribution of values. See the manual for details: postgresql.org/docs/current/static/planner-stats.html

brauliobo Over a year ago

and what about the case where you are sure that the index scan is better? in local db it uses the index and is much faster, on production it prefers seq. scan

|

Frank Heikens · Accepted Answer · 2011-03-05 12:28:20Z

22

Did you ANALYZE the table/database? And what about the statistics? When there are many records where year > 2009, a sequential scan might be faster than an index scan.

answered Mar 5, 2011 at 12:28

Frank Heikens

129k26 gold badges157 silver badges153 bronze badges

Comments

Gaurav Neema · Accepted Answer · 2019-05-01 13:28:51Z

7

In index scan, read head jumps from one row to another which is 1000 times slower than reading the next physical block (in the sequential scan).

So, if the (number of records to be retrieved * 1000) is less than the total number of records, the index scan will perform better.

answered May 1, 2019 at 13:28

Gaurav Neema

3301 gold badge3 silver badges11 bronze badges

Comments

Shitij Goyal · Accepted Answer · 2020-01-31 14:27:41Z

7

@a_horse_with_no_name explained it quite well. Also if you really want to use an index scan, you should generally use bounded ranges in where clause. eg - year > 2019 and year < 2020.

A lot of the times statistics are not updated on a table and it may not be possible to do so due to constraints. In this case, the optimizer will not know how many rows it should take in year > 2019. Thus it selects a sequential scan in lieu of full knowledge. Bounded partitions will solve the problem most of the time.

answered Jan 31, 2020 at 14:27

Shitij Goyal

3112 silver badges8 bronze badges

Collectives™ on Stack Overflow

Why does PostgreSQL perform sequential scan on indexed column?

4 Answers 4

9 Comments

Comments

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

9 Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related