I have this PostgreSQL query:

explain analyze SELECT  "facilities".* FROM "facilities" INNER JOIN 
resource_indices ON resource_indices.resource_id = facilities.uuid WHERE 
(client_id IS NULL OR (client_tag=NULL AND client_id=7)) 
AND (ARRAY['country:india']::varchar[] && resource_indices.tags) 
AND "facilities"."is_listed" = 't'  
ORDER BY resource_indices.name LIMIT 11 OFFSET 100;

Note the offset. When the offset is below roughly 200, the planner uses the index on name and the query runs fine. The query plan in that case is as follows:

             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=23416.57..24704.45 rows=11 width=1457) (actual time=41.951..43.035 rows=11 loops=1)
   ->  Nested Loop  (cost=0.71..213202.15 rows=1821 width=1457) (actual time=2.107..43.007 rows=211 loops=1)
         ->  Index Scan using index_resource_indices_on_name on resource_indices  (cost=0.42..190226.95 rows=12460 width=28) (actual time=2.096..40.790 rows=408 loops=1)
               Filter: ('{country:india}'::character varying[] && tags)
               Rows Removed by Filter: 4495
         ->  Index Scan using index_facilities_on_uuid on facilities  (cost=0.29..1.83 rows=1 width=1445) (actual time=0.005..0.005 rows=1 loops=408)
               Index Cond: (uuid = resource_indices.resource_id)
               Filter: ((client_id IS NULL) AND is_listed)
 Planning time: 1.259 ms
 Execution time: 43.121 ms
(10 rows)

Increasing the offset to around 400 makes the planner switch to a hash join followed by a sort, which performs much worse. Raising the offset further makes it worse still.

         QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=34508.62..34508.65 rows=11 width=1457) (actual time=136.288..136.291 rows=11 loops=1)
   ->  Sort  (cost=34507.62..34512.18 rows=1821 width=1457) (actual time=136.224..136.268 rows=411 loops=1)
         Sort Key: resource_indices.name
         Sort Method: top-N heapsort  Memory: 638kB
         ->  Hash Join  (cost=29104.96..34419.46 rows=1821 width=1457) (actual time=23.885..95.099 rows=6518 loops=1)
               Hash Cond: (facilities.uuid = resource_indices.resource_id)
               ->  Seq Scan on facilities  (cost=0.00..4958.39 rows=33790 width=1445) (actual time=0.010..48.732 rows=33711 loops=1)
                     Filter: ((client_id IS NULL) AND is_listed)
                     Rows Removed by Filter: 848
               ->  Hash  (cost=28949.21..28949.21 rows=12460 width=28) (actual time=23.311..23.311 rows=12601 loops=1)
                     Buckets: 2048  Batches: 1  Memory Usage: 814kB
                     ->  Bitmap Heap Scan on resource_indices  (cost=1048.56..28949.21 rows=12460 width=28) (actual time=9.369..18.710 rows=12601 loops=1)
                           Recheck Cond: ('{country:india}'::character varying[] && tags)
                           Heap Blocks: exact=7334
                           ->  Bitmap Index Scan on index_resource_indices_on_tags  (cost=0.00..1045.45 rows=12460 width=0) (actual time=7.680..7.680 rows=13889 loops=1)
                                 Index Cond: ('{country:india}'::character varying[] && tags)
 Planning time: 1.408 ms
 Execution time: 136.465 ms
(18 rows)

How do I resolve this? Thank you

1 Answer

That is unavoidable, because the only way to execute LIMIT 10 OFFSET 10000 is to fetch the first 10010 rows and throw away all but the last 10. Performance is bound to get worse as the offset grows.
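
To make the cost concrete, here is a minimal Python sketch of what LIMIT/OFFSET does to a sorted row stream. It is only a model, not PostgreSQL's executor: `ordered_rows` and `limit_offset` are hypothetical names, and the generator merely counts how many rows had to be produced.

```python
from itertools import islice

def ordered_rows(counter):
    """Hypothetical stand-in for a sorted result stream; counts rows produced."""
    n = 0
    while True:
        counter["produced"] += 1
        yield n
        n += 1

def limit_offset(rows, limit, offset):
    # Skip `offset` rows, then keep `limit` rows.
    # The skipped rows still have to be produced before they can be discarded.
    return list(islice(rows, offset, offset + limit))

counter = {"produced": 0}
page = limit_offset(ordered_rows(counter), limit=10, offset=10000)
print(len(page), counter["produced"])
```

The page holds 10 rows, but the counter shows that 10010 rows were generated to get there, so the work grows linearly with the offset.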

PostgreSQL switches to a different plan because it has to retrieve more result rows. “Fast start” plans, which are quick to return the first few rows and usually involve nested loop joins, perform worse than other plans when more result rows are needed.

OFFSET is evil and you should avoid it in most cases. Read what Markus Winand has to say about this topic, particularly how to paginate result sets without OFFSET.
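
As a rough illustration of pagination without OFFSET (often called keyset or "seek" pagination), here is a self-contained Python sketch. It uses an in-memory SQLite table as a stand-in for PostgreSQL so that it can be run as-is. The simplified `resource_indices` table, the `(name, id)` index, the generated data and the helper names `page_by_offset` and `page_after` are assumptions for the example, not your schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resource_indices (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO resource_indices (id, name) VALUES (?, ?)",
    # Names repeat on purpose, so id is needed as a tie-breaker.
    [(i, f"name-{i % 50:03d}") for i in range(1, 501)],
)
# An index matching the ORDER BY lets each page start with an index seek.
conn.execute("CREATE INDEX idx_name_id ON resource_indices (name, id)")

PAGE_SIZE = 11

def page_by_offset(offset):
    """Classic pagination: the database must walk past `offset` rows."""
    return conn.execute(
        "SELECT name, id FROM resource_indices "
        "ORDER BY name, id LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset),
    ).fetchall()

def page_after(last_name=None, last_id=None):
    """Keyset pagination: continue after the last row of the previous page."""
    if last_name is None:
        return conn.execute(
            "SELECT name, id FROM resource_indices ORDER BY name, id LIMIT ?",
            (PAGE_SIZE,),
        ).fetchall()
    return conn.execute(
        "SELECT name, id FROM resource_indices "
        "WHERE name > ? OR (name = ? AND id > ?) "
        "ORDER BY name, id LIMIT ?",
        (last_name, last_name, last_id, PAGE_SIZE),
    ).fetchall()

first = page_after()
second = page_after(*first[-1])  # pass the last (name, id) seen
print(second == page_by_offset(PAGE_SIZE))
```

The client remembers the last `(name, id)` it saw instead of a page number. In PostgreSQL the condition can be written as the row comparison `(name, id) > ($1, $2)`, which a b-tree index on `(name, id)` can support. The trade-off is that you can only step to the next page, not jump to an arbitrary one.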
