This is the schema of task_statuses table
Table "public.task_statuses"
Column | Type | Collation | Nullable | Default
----------------+-----------------------------+-----------+----------+---------
id | uuid | | not null |
updated_at | timestamp without time zone | | not null |
status | task_status | | not null |
status_details | json | | |
Indexes:
"task_statuses_id_updated_at_key" UNIQUE CONSTRAINT, btree (id, updated_at)
"idx_task_statuses_id_updated_at" btree (id, updated_at)
Foreign-key constraints:
"task_statuses_id_fkey" FOREIGN KEY (id) REFERENCES tasks(id) ON DELETE CASCADE
The table however is huge and has 150GB of data in production. I am trying to run an extremely simple query
SELECT
ts.id
FROM task_statuses ts
WHERE
ts.status IN ('succeeded', 'failed', 'cancelled')
ORDER BY ts.id, ts.updated_at desc LIMIT 1000
It keeps timing out in production. When I remove ORDER BY the query runs successfully. Since, I have index in id and udpated_at, I am not sure why order by is timing out.
explain analyse times out as well.
Here is the explain for the above query.
Limit (cost=10651159.84..10651276.51 rows=1000 width=24)
-> Gather Merge (cost=10651159.84..10744721.60 rows=801902 width=24)
Workers Planned: 2
-> Sort (cost=10650159.81..10651162.19 rows=400951 width=24)
Sort Key: id, updated_at DESC
-> Parallel Seq Scan on task_statuses ts (cost=0.00..10628176.10 rows=400951 width=24)
Filter: (status = ANY ('{succeeded,failed,cancelled}'::task_status[]))
Query plan without order by:
Helpful links:
Suggestions or help would be much appreciated.