0

I'm using PostgreSQL 13 and having a performance issue.

I have a post table and an index:

create table post {
id int primary key,
is_deleted boolean default false,
status int default 1, -- status can be 1, 2, 3
commented_at timestamp
}

create index post_deleted_status_commented_id on post(is_deleted, status, commented, id);

When I run the following query, I hoped it would use the above index but actually it did not.

explain analyze
select p.id
from post as p 
where is_deleted = false and (p.status = 1 or p.status = 2)
order by newest desc, id desc limit 20;

Here is the query plan:

Limit  (cost=8767.81..8770.14 rows=20 width=12) (actual time=61.289..61.345 rows=20 loops=1)
  ->  Gather Merge  (cost=8767.81..18738.60 rows=85458 width=12) (actual time=61.287..61.341 rows=20 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Sort  (cost=7767.78..7874.61 rows=42729 width=12) (actual time=54.776..54.778 rows=19 loops=3)
              Sort Key: newest DESC, id DESC
              Sort Method: top-N heapsort  Memory: 26kB
              Worker 0:  Sort Method: top-N heapsort  Memory: 26kB
              Worker 1:  Sort Method: top-N heapsort  Memory: 26kB
              ->  Parallel Seq Scan on post p  (cost=0.00..6630.78 rows=42729 width=12) (actual time=0.011..40.473 rows=39998 loops=3)
                    Filter: ((NOT is_deleted) AND ((status = 1) OR (status = 2)))
                    Rows Removed by Filter: 8848
Planning Time: 0.149 ms
Execution Time: 61.370 ms

The post table has 146538 rows (120770 rows with is_deleted = false). If I just use p.status = 1, the index will be used. So is anyway to optimize the query? Thanks.

Update 1: Partial index does not help.

The newest field is updated frequently (when there is a new comment).

3
  • Which parts of this query change from execution to execution, and which parts are always the same? Commented Jun 18, 2021 at 19:00
  • I'm not sure what you mean. The only thing will be changed is offset (for pagination), so I will add offset 20, offset 40..., the rest parts will be the same. Commented Jun 19, 2021 at 0:25
  • Maybe the optimiser decided that when filtering out only rows that are deleted or status=3, most of the rows in the table need to be loaded anyway to sort them, and the index scan doesn't provide enough of an advantage? Commented Jun 19, 2021 at 2:19

1 Answer 1

1

If the query is always exactly that, then the optimal index would be a filtered/partial index which matches the WHERE clause and is indexed compatible with the ORDER BY clause.

create index on post (newest,id) WHERE
   is_deleted = false and (status = 1 or status = 2);

Now your query gives:

    QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..1.03 rows=20 width=12) (actual time=0.013..0.021 rows=20 loops=1)
   ->  Index Only Scan Backward using post_newest_id_idx on post p  (cost=0.42..2040.81 rows=66693 width=12) (actual time=0.012..0.017 rows=20 loops=1)
         Heap Fetches: 0
 Planning Time: 0.121 ms
 Execution Time: 0.039 ms
Sign up to request clarification or add additional context in comments.

2 Comments

I'm 100% sure that partial index does not work because I tried it before.
I tried it before posting, and it did work.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.