
I've got a pretty large table with nearly 1 million rows and some of the queries are taking a long time (over a minute).

Here is one that's giving me a particularly hard time...

EXPLAIN ANALYZE SELECT "apps".* FROM "apps" WHERE "apps"."kind" = 'software' ORDER BY itunes_release_date DESC, rating_count DESC LIMIT 12;
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=153823.03..153823.03 rows=12 width=2091) (actual time=162681.166..162681.194 rows=12 loops=1)
   ->  Sort  (cost=153823.03..154234.66 rows=823260 width=2091) (actual time=162681.159..162681.169 rows=12 loops=1)
         Sort Key: itunes_release_date, rating_count
         Sort Method: top-N heapsort  Memory: 48kB
         ->  Seq Scan on apps  (cost=0.00..150048.41 rows=823260 width=2091) (actual time=0.718..161561.149 rows=808554 loops=1)
               Filter: (kind = 'software'::text)
 Total runtime: 162682.143 ms
(7 rows)

So, how would I optimize that? PG version is 9.2.4, FWIW.

There are already indexes on (kind) and on (kind, itunes_release_date).

  • This doesn't answer your question, but with 1 million records you would probably be better off creating an app_kind table with numeric references from apps, rather than repeating varchars such as 'software' all over. Commented Jun 3, 2013 at 14:16
  • @LukasEder: or he could use an enum, to keep existing queries untouched. Commented Jun 3, 2013 at 14:17
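
The enum idea from the comments could look roughly like the sketch below. The type name app_kind and the value 'ebook' are hypothetical; only 'software' appears in the question, so the real list of values should be read from the table first. Changing the column type rewrites the table and fails if any row holds a value missing from the enum.

```sql
-- List the values actually stored in the column before defining the enum.
SELECT DISTINCT kind FROM apps;

-- Hypothetical type name and value list; 'software' is the only value
-- known from the question, 'ebook' is a placeholder.
CREATE TYPE app_kind AS ENUM ('software', 'ebook');

-- Convert the existing text column in place (this rewrites the table).
ALTER TABLE apps
    ALTER COLUMN kind TYPE app_kind
    USING kind::app_kind;
```

Existing queries such as WHERE kind = 'software' should keep working, because the string literal is resolved to the enum type.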

3 Answers


Looks like you're missing an index, e.g. on (kind, itunes_release_date desc, rating_count desc).
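
A minimal sketch of that index, assuming a hypothetical index name; the column order and sort directions mirror the WHERE and ORDER BY clauses of the query in the question:

```sql
-- Hypothetical index name. The equality column comes first, followed by
-- the sort columns in the same order and direction as the ORDER BY.
CREATE INDEX index_apps_on_kind_release_rating
    ON apps (kind, itunes_release_date DESC, rating_count DESC);

-- Refresh planner statistics, then look at the plan again.
ANALYZE apps;

EXPLAIN ANALYZE
SELECT "apps".*
FROM "apps"
WHERE "apps"."kind" = 'software'
ORDER BY itunes_release_date DESC, rating_count DESC
LIMIT 12;
```

If the planner picks the new index, the Seq Scan and Sort nodes should give way to an index scan that stops after 12 rows. On a busy table, CREATE INDEX CONCURRENTLY avoids blocking writes while the index builds.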


6 Comments

Would an index on kind alone be enough? I'm not sure how much the additional columns will speed up the sort.
An index on kind can be useful but will still yield a top-n sort. To make use of an index to get the top-12 directly, OP will need to add (all of) the sort columns in the index too.
@AngerClown: The plan indicates that about 808k rows have kind = 'software', so an index on kind alone doesn't filter very selectively
@LukasEder It can still help as a part of a composite index.
@LukasEder The index will help to retrieve the limited rows, without sorting the whole table (or all of the roughly 808k matching rows).

How big is the apps table? Do you have at least this much memory allocated to postgres? If it's having to read from disk every time, query speed will be much slower.
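
To answer those two questions, something along these lines can be run in psql. The functions and settings are standard PostgreSQL; the table name comes from the question.

```sql
-- On-disk size of the table: total (heap + indexes + TOAST) and heap only.
SELECT pg_size_pretty(pg_total_relation_size('apps')) AS total_size,
       pg_size_pretty(pg_relation_size('apps'))       AS heap_size;

-- Memory PostgreSQL reserves for its own buffer cache.
SHOW shared_buffers;

-- Planner's estimate of total cache available (a hint, not an allocation).
SHOW effective_cache_size;
```

If the heap is much larger than shared_buffers plus the OS file cache, a sequential scan like the one in the plan will mostly be reading from disk.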

Another thing that may help is to cluster the table on an index on the 'kind' column. This may speed up disk access since all the software rows will be stored sequentially on disk.
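
Clustering is done against an existing index. A sketch, assuming the existing index on kind is named apps_kind_idx (a hypothetical name; the question does not give it):

```sql
-- apps_kind_idx is a hypothetical name for the existing index on kind.
-- CLUSTER rewrites the table in index order and holds an exclusive lock
-- on it for the duration.
CLUSTER apps USING apps_kind_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE apps;
```

The physical ordering is not maintained for rows inserted or updated later, so the command has to be repeated periodically to keep its effect.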

1 Comment

Clustering won't help, as the query requires a full scan and sort. Postgres memory allocation can help, but not much.

The only way to speed up this query is to create a composite index on (itunes_release_date, rating_count). It will allow Postgres to pick the first N rows from the index directly.
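
A sketch of that index, with a hypothetical index name. Because both ORDER BY columns are descending, a default ascending index can be scanned backward to produce the same order:

```sql
-- Hypothetical index name. Scanned backward, this yields rows ordered by
-- itunes_release_date DESC, rating_count DESC.
CREATE INDEX index_apps_on_release_rating
    ON apps (itunes_release_date, rating_count);

ANALYZE apps;

-- The kind filter is then applied to rows as they come off the index,
-- stopping once 12 matching rows have been found.
EXPLAIN ANALYZE
SELECT "apps".*
FROM "apps"
WHERE "apps"."kind" = 'software'
ORDER BY itunes_release_date DESC, rating_count DESC
LIMIT 12;
```

This relies on most rows matching the filter, which the plan in the question suggests is the case for kind = 'software'; for a rare kind, the scan could read many index entries before finding 12 matches.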
