0

I'm just getting into optimizing queries by logging slow queries and EXPLAINing them. I guess the thing is... I'm not sure exactly what kind of things I should be looking for.... I have the query

SELECT DISTINCT
       screenshot.id,
       screenshot.view_count
  FROM screenshot_udb_affect_assoc
INNER JOIN screenshot ON id = screenshot_id
     WHERE unit_id = 56 
  ORDER BY RAND() 
     LIMIT 0, 6;

Looking at these two elements.... where should I focus on optimization?

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  screenshot  ALL PRIMARY NULL    NULL    NULL    504 Using temporary; Using filesort
1   SIMPLE  screenshot_udb_affect_assoc ref screenshot_id   screenshot_id   8   source_core.screenshot.id,const 3   Using index; Distinct
2
  • What columns are from what table in your query? Commented Nov 7, 2010 at 4:28
  • I added an index on unit_id, it wasn't there and it should've been Commented Nov 7, 2010 at 4:38

3 Answers 3

3

To begin with please refrain using ORDER BY RAND(). This in particular degrades performance when the table size is large. For example, even with limit 1 , it generates number of random numbers equal to the row count, and would pick the smallest one. This might be inefficient if table size is large or bound to grow. Detailed discussion on this can be found at: http://www.titov.net/2005/09/21/do-not-use-order-by-rand-or-how-to-get-random-rows-from-table/

Lastly, also ensure that your join columns are indexed.

Sign up to request clarification or add additional context in comments.

2 Comments

I see his example and it's brilliant, except that I have to make sure the rows meet my specific criteria.....
Under 100K records, it's fine to use ORDER BY RAND() -- over that, and you want to start looking at options that scale better. For more info see this article
1

Try:

  SELECT s.id,
         s.view_count
    FROM SCREENSHOT s
   WHERE EXISTS(SELECT NULL
                  FROM SCREENSHOT_UDB_AFFECT_ASSOC x
                 WHERE x.screenshot_id = s.id)
ORDER BY RAND()
   LIMIT 6

Under 100K records, it's fine to use ORDER BY RAND() -- over that, and you want to start looking at alternatives that scale better. For more info, see this article.

1 Comment

I did that, it yields the same results except it also has Using where; listed in addition to Using temporary; Using filesort
1

I agree with kuriouscoder, refrain from using ORDER BY RAND(), and make sure each of the following fields are indexed in a single index:

screenshot_udb_affect_assoc.id

screenshot.id

screenshot.unit_id

do this using code like:

create index Index1 on screenshot(id):

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.