
https://planchecker.cfapps.io/plan/Edo2MMbv

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON) 
SELECT COUNT(*) AS "__count" 
FROM "juliet" 
WHERE ("juliet"."whiskey" IN ('F') AND "juliet"."three" <= '2001-04-30')

There is an index on the field juliet.three. How can I make sure this query uses the index?

Let's say the table has 10N rows and this query returns 3N rows, so it is counting 30% of a big table.

whiskey is an enumeration field that is kept as a charfield without an index. Maybe this is the problem; I'm not sure whether the problem is with the charfield or with the date field.

the table size is in the order of millions.

Also I got a warning like this: WARNING: Filter using function | Check if function can be avoided

How can I avoid functions? Is that possible?

  • Please run explain with the option format text then edit your question and add the execution plan generated with that option (as formatted text, no screen shots please). option json is for computers to read, not for humans. Commented May 6, 2019 at 13:29
  • The warning is pretty stupid assuming that it refers to the count() function - you can't avoid that in that query. An index on (three, whiskey) or (whiskey, three) might help - depending on which of the conditions is responsible for filtering out the majority of rows. Commented May 6, 2019 at 13:32
  • What does "10N" mean? 10 million? Commented May 6, 2019 at 13:33
  • The warning (it is stupid, yes) might well refer to the type cast. Commented May 6, 2019 at 13:45
  • Please edit your question to show us the definition of your table, including all indexes. Have you tried creating a compound BTREE index on (whiskey, three)? Commented May 6, 2019 at 13:56

3 Answers


Postgres has a good optimizer and chooses the most efficient execution plan it can, based on the statistics it has and the rules built into the optimizer. For this query, your best index is on juliet(whiskey, three).

This is a covering index for the query, so it does not need to access the data rows. Also, only 30% of the index should need to be scanned.
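Assuming the table and column names from the question, the suggested index could be created like this (a sketch, not tested against your actual schema):

```sql
-- Composite index matching the WHERE clause: the equality column first,
-- the range column second, so Postgres can scan just the matching slice.
-- The index name is made up for illustration.
CREATE INDEX juliet_whiskey_three_idx ON juliet (whiskey, three);

-- Refresh planner statistics so the new index is properly costed.
ANALYZE juliet;
```

Putting whiskey first works because the query uses equality on it; all qualifying entries for three then sit in one contiguous range of the index.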

Without the right index, it does not make sense to force an index scan.




If the query really returns 30% of the table, then PostgreSQL is probably choosing the fastest access path when it uses a sequential scan.

You can try to

SET enable_seqscan = off;

and then run the query again to see if the index can be used and if the index scan would actually be faster.
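A minimal session sketch, using the query from the question (the setting only affects the current session, so remember to reset it):

```sql
SET enable_seqscan = off;   -- strongly discourage sequential scans (session-local)

EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) AS "__count"
FROM "juliet"
WHERE "juliet"."whiskey" IN ('F') AND "juliet"."three" <= '2001-04-30';

RESET enable_seqscan;       -- restore the default planner behavior
```

Compare the execution time reported here with the plain sequential-scan plan; if the index plan is not actually faster, the optimizer's original choice was right.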



A query that reads 3 million rows is expected to be slow. I assume it's for an offline process, since using it in an online app is asking for trouble.

Even though what @LaurenzAlbe suggests is possible, I suspect forcing index usage may actually make your query slower than a sequential scan.

The only useful index usage I can see is what @GordonLinoff suggests: using it as a covering index.

But... why do you want to use an index in the first place? Any query that reads more than about 5% of a table's rows usually runs more efficiently as a sequential scan.

4 Comments

For what it's worth, this query does not read the rows. It counts them. That's entirely different, and can be done very quickly with the correct index.
@O.Jones You are right on that point. The covering index is the only option I can see that could be faster than a heap scan.
@O.Jones that's the wisdom I was looking for! I wouldn't have thought there would be a difference. I will research covering index.
I think we agree. I didn't mean that forcing an index scan will make it faster. count(*) is slow, because you have to go through the rows to count them.
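If you go the covering-index route discussed above, note that an index-only scan can skip the heap only for pages marked all-visible, so the visibility map must be reasonably current. A sketch, assuming the names from the question:

```sql
-- After creating the (whiskey, three) index, keep the visibility map
-- current; otherwise the "Index Only Scan" still fetches heap pages.
VACUUM (ANALYZE) juliet;

-- Then verify the plan: an efficient count shows "Index Only Scan"
-- with low "Heap Fetches" in the EXPLAIN (ANALYZE) output.
EXPLAIN (ANALYZE)
SELECT COUNT(*) FROM juliet
WHERE whiskey = 'F' AND three <= '2001-04-30';
```

On a heavily updated table, autovacuum may lag, which is why an explicit VACUUM can matter right after creating the index.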
