PostgreSQL SELECT too slow

Question

I am looking for ways to optimize my query.

I have a table with about 4 million rows, and I only want to retrieve the latest 1000 rows for a given reference:

SELECT * 
FROM customers_material_events 
WHERE reference = 'XXXXXX' 
ORDER BY date DESC 
LIMIT 1000;

This is the execution plan:

Limit  (cost=12512155.48..12512272.15 rows=1000 width=6807) (actual time=8953.545..9013.658 rows=1000 loops=1)
   Buffers: shared hit=16153 read=30342
   ->  Gather Merge  (cost=12512155.48..12840015.90 rows=2810036 width=6807) (actual time=8953.543..9013.613 rows=1000 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=16153 read=30342
         ->  Sort  (cost=12511155.46..12514668.00 rows=1405018 width=6807) (actual time=8865.186..8865.208 rows=632 loops=3)
               Sort Key: date DESC
               Sort Method: top-N heapsort  Memory: 330kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 328kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 330kB
               Buffers: shared hit=16153 read=30342
               ->  Parallel Seq Scan on customers_material_events  (cost=0.00..64165.96 rows=1405018 width=6807) (actual time=0.064..944.029 rows=1117807 loops=3)
                     Filter: ((reference)::text = 'FFFEEE'::text)
                     Rows Removed by Filter: 17188
                     Buffers: shared hit=16091 read=30342
 Planning Time: 0.189 ms
 Execution Time: 9013.834 ms
(18 rows)

The execution time is very slow: about 9 seconds.

Then you’ll probably benefit from adding an index on reference if the data is suitable, which it seems to be.
– Sami Kuhmonen, Commented Feb 26, 2019 at 10:35
Ideally the index should be a multicolumn one on (reference, date), so it can be used for both the search and the sort. PostgreSQL would still need to access the table to fetch the other columns.
– Raymond Nijland, Commented Feb 26, 2019 at 10:38
ORDER BY a_column DESC LIMIT N quite often benefits from an index on a_column. As mentioned above, you can also add an index on reference, on (reference, date), or on (date, reference). Just experiment: add an index, run ANALYZE customers_material_events, and measure the speed. One of these indexes may speed up the query, but whether and how much depends on the selectivity of both columns.
– Mikhail Puzanov, Commented Feb 26, 2019 at 11:01
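
The selectivity mentioned in the comments can be checked directly. This is a minimal sketch that reuses the placeholder reference value from the question; the aliases total_rows and matching_rows are made up for illustration:

```sql
-- Compare the number of rows for one reference with the table total.
-- 'XXXXXX' is the placeholder value from the question.
SELECT count(*) AS total_rows,
       count(*) FILTER (WHERE reference = 'XXXXXX') AS matching_rows
FROM customers_material_events;
```

In the plan above, each of the three processes kept about 1.1 million rows and removed only about 17,000, which suggests that nearly all rows share this reference. If that holds, an index on reference alone would probably help little, and most of the gain would come from reading rows already ordered by date.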

Laurenz Albe · Accepted Answer · 2019-02-26 11:15:31Z

The ideal index for this query would be:

CREATE INDEX ON customers_material_events (reference, date);

That would allow you to quickly find the rows for a given reference, already ordered by date. PostgreSQL can scan the index backward to satisfy ORDER BY date DESC, so no extra sort step is necessary and the scan can stop after 1000 rows.
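
To verify the effect, one option is to create the index, refresh the statistics, and rerun the query under EXPLAIN. This is only a sketch: the index name below is hypothetical (the statement above lets PostgreSQL choose one), and the reference value is the placeholder from the question.

```sql
-- Hypothetical index name, chosen here for illustration.
CREATE INDEX customers_material_events_reference_date_idx
    ON customers_material_events (reference, date);

-- Refresh planner statistics for the table.
ANALYZE customers_material_events;

-- Rerun the original query and inspect the new plan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM customers_material_events
WHERE reference = 'XXXXXX'
ORDER BY date DESC
LIMIT 1000;
```

If the planner picks the index, the plan should show an Index Scan Backward on it in place of the Parallel Seq Scan and Sort nodes. On a table that is being written to, CREATE INDEX CONCURRENTLY builds the index without blocking writes, at the cost of a slower build.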
