I have a table call_logs that contains id, device_id and timestamp columns along with some other fields. I am trying to write a query that returns, for each device, the last call and whether it is working. Currently my query is this:
SELECT DISTINCT ON (device_id) c.device_id, c.timestamp, c.working, c.id
FROM call_logs c
ORDER BY c.device_id, c.timestamp desc;
and it returns the information I want. But the table on my production server is now getting rather large: it holds around 6,000,000 records.
I have added an index to this table:
CREATE INDEX cl_device_timestamp
ON public.call_logs USING btree
(device_id, timestamp DESC, id, working)
TABLESPACE pg_default;
But the query is still what I consider very slow. Here is the EXPLAIN ANALYSE output for the query:
EXPLAIN ANALYSE SELECT DISTINCT ON (device_id) c.device_id, c.timestamp, c.working, c.id
FROM call_logs c
ORDER BY c.device_id, c.timestamp desc;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=0.56..363803.37 rows=120 width=25) (actual time=0.069..2171.201 rows=124 loops=1)
   ->  Index Only Scan using cl_device_timestamp on call_logs c  (cost=0.56..347982.87 rows=6328197 width=25) (actual time=0.067..1594.953 rows=6331024 loops=1)
         Heap Fetches: 8051
Planning time: 0.184 ms
Execution time: 2171.281 ms
(5 rows)
I only have 124 unique device_id values. I would not have expected this to be slow with the index in place. Any ideas what is going wrong, or why it is so slow?
DISTINCT? If you just want the last call, can't you add LIMIT 1 and make the DISTINCT unnecessary?
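The plan above reads all 6,331,024 index entries to produce 124 rows. One possible way to apply the LIMIT 1 idea per device is a recursive CTE that jumps from one device_id to the next instead of walking the whole index. This is only a sketch against the call_logs columns shown in the question: the CTE name latest and the aliases are made up for illustration, it assumes PostgreSQL 9.3 or later (for LATERAL), and it has not been measured on this data.

```sql
-- Sketch: emulate a "skip scan" over (device_id, timestamp DESC).
-- Each step fetches a single row with LIMIT 1, so the existing
-- cl_device_timestamp index should be probed roughly once per device.
WITH RECURSIVE latest AS (
    (
        -- Anchor: newest row of the smallest device_id.
        SELECT c.device_id, c.timestamp, c.working, c.id
        FROM call_logs c
        ORDER BY c.device_id, c.timestamp DESC
        LIMIT 1
    )
    UNION ALL
    -- Recursive step: newest row of the next larger device_id.
    -- Recursion stops when no larger device_id exists.
    SELECT n.device_id, n.timestamp, n.working, n.id
    FROM latest l
    CROSS JOIN LATERAL (
        SELECT c.device_id, c.timestamp, c.working, c.id
        FROM call_logs c
        WHERE c.device_id > l.device_id
        ORDER BY c.device_id, c.timestamp DESC
        LIMIT 1
    ) n
)
SELECT l.device_id, l.timestamp, l.working, l.id
FROM latest l
ORDER BY l.device_id;
```

Rows with a NULL device_id are skipped by the `>` comparison, and when two calls for a device share the same timestamp the choice between them is arbitrary, as it is with DISTINCT ON. Running this under EXPLAIN ANALYSE would show whether it actually helps here.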