
https://planchecker.cfapps.io/plan/Edo2MMbv

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON) 
SELECT COUNT(*) AS "__count" 
FROM "juliet" 
WHERE ("juliet"."whiskey" IN ('F') AND "juliet"."three" <= '2001-04-30')

There is an index on the field juliet.three. How can I make sure this query uses the index?

Let's say the table has 10N rows and this query returns 3N rows, so it is counting 30% of a big table.

whiskey is an enumeration field that is kept as a charfield without an index. Maybe this is the problem; I'm not sure whether the problem is with the charfield or with the date field.

the table size is in the order of millions.

Also I got a warning like this: WARNING: Filter using function | Check if function can be avoided

How can I avoid functions? Is that possible?

  • Please run explain with the option format text then edit your question and add the execution plan generated with that option (as formatted text, no screen shots please). option json is for computers to read, not for humans. Commented May 6, 2019 at 13:29
  • The warning is pretty stupid assuming that it refers to the count() function - you can't avoid that in that query. An index on (three, whiskey) or (whiskey, three) might help - depending on which of the conditions is responsible for filtering out the majority of rows. Commented May 6, 2019 at 13:32
  • What does "10N" mean? 10 million? Commented May 6, 2019 at 13:33
  • The warning (it is stupid, yes) might well refer to the type cast. Commented May 6, 2019 at 13:45
  • Please edit your question to show us the definition of your table, including all indexes. Have you tried creating a compound BTREE index on (whiskey, three)? Commented May 6, 2019 at 13:56

3 Answers


Postgres has a good optimizer and chooses the most efficient execution plan it can, based on the statistics it has and the rules built into the optimizer. For this query, your best index is on juliet(whiskey, three).

This is a covering index for the query, so it does not need to access the data rows. Also, only 30% of the index should need to be scanned.
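Assuming the table and column names from the question, the suggested index could be created like this (a sketch, not tested against your actual schema):

```sql
-- Composite index matching the WHERE clause: the equality column first,
-- the range column second, so Postgres can scan just the matching slice.
-- The index name is made up for illustration.
CREATE INDEX juliet_whiskey_three_idx ON juliet (whiskey, three);

-- Refresh planner statistics so the new index is properly costed.
ANALYZE juliet;
```

Putting whiskey first works because the query uses equality on it; all qualifying entries for three then sit in one contiguous range of the index.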

Without the right index, it does not make sense to force an index scan.




If the query really returns 30% of the table, then PostgreSQL is probably choosing the fastest access path when it uses a sequential scan.

You can try to

SET enable_seqscan = off;

and then run the query again to see if the index can be used and if the index scan would actually be faster.
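A minimal session sketch, using the query from the question (the setting only affects the current session, so remember to reset it):

```sql
SET enable_seqscan = off;   -- strongly discourage sequential scans (session-local)

EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) AS "__count"
FROM "juliet"
WHERE "juliet"."whiskey" IN ('F') AND "juliet"."three" <= '2001-04-30';

RESET enable_seqscan;       -- restore the default planner behavior
```

Compare the execution time reported here with the plain sequential-scan plan; if the index plan is not actually faster, the optimizer's original choice was right.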



A query that reads 3 million rows is expected to be slow. I assume it's for an offline process, since using it in an online app is asking for trouble.

Even though what @LaurenzAlbe suggests is possible, I suspect forcing index usage may actually make your query slower than a sequential scan.

The only useful index usage I can see is what @GordonLinoff suggests: using it as a covering index.

But... why do you want to use an index in the first place? Any query that reads more than about 5% of a table's rows usually runs more efficiently as a sequential scan.

4 Comments

For what it's worth, this query does not read the rows. It counts them. That's entirely different, and can be done very quickly with the correct index.
@O.Jones You are right on that point. The covering index is the only option I can see that could be faster than a heap scan.
@O.Jones that's the wisdom I was looking for! I wouldn't have thought there would be a difference. I will research covering index.
I think we agree. I didn't mean that forcing an index scan will make it faster. count(*) is slow, because you have to go through the rows to count them.
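If you go the covering-index route discussed above, note that an index-only scan can skip the heap only for pages marked all-visible, so the visibility map must be reasonably current. A sketch, assuming the names from the question:

```sql
-- After creating the (whiskey, three) index, keep the visibility map
-- current; otherwise the "Index Only Scan" still fetches heap pages.
VACUUM (ANALYZE) juliet;

-- Then verify the plan: an efficient count shows "Index Only Scan"
-- with low "Heap Fetches" in the EXPLAIN (ANALYZE) output.
EXPLAIN (ANALYZE)
SELECT COUNT(*) FROM juliet
WHERE whiskey = 'F' AND three <= '2001-04-30';
```

On a heavily updated table, autovacuum may lag, which is why an explicit VACUUM can matter right after creating the index.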
