5

This query worked fine when the database was small, but now that there are millions of rows I realize I should have looked at optimizing it earlier. It examines over 600,000 rows, and EXPLAIN shows Using where; Using temporary; Using filesort, which leads to an execution time of 5-10 seconds. It is using an index on the field 'battle_type'.

SELECT username, SUM( outcome ) AS wins, COUNT( * ) - SUM( outcome ) AS losses
FROM tblBattleHistory
WHERE battle_type =  '0' && outcome <  '2'
GROUP BY username
ORDER BY wins DESC , losses ASC , username ASC 
LIMIT 0 , 50
2 Comments

  • Run EXPLAIN on it. It'll tell you where you should tighten things up (see the sketch just below these comments). Commented May 27, 2010 at 17:20
  • One thing you could do is cache the results to a file or something and re-cache every X minute(s). Commented May 27, 2010 at 17:22
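A minimal sketch of that suggestion: the query from the question with EXPLAIN in front of it. The possible_keys, key, rows, and Extra columns of the output show which index is chosen, how many rows are examined, and whether a temporary table or filesort is involved.

EXPLAIN SELECT username, SUM( outcome ) AS wins, COUNT( * ) - SUM( outcome ) AS losses
FROM tblBattleHistory
WHERE battle_type =  '0' && outcome <  '2'
GROUP BY username
ORDER BY wins DESC , losses ASC , username ASC
LIMIT 0 , 50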

3 Answers

5

It appears you need an index on username, battle_type, outcome or username, outcome, battle_type.
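For example, a sketch of adding those indexes (the index names here are made up; you would add one or the other, not both):

CREATE INDEX idx_user_type_outcome ON tblBattleHistory (username, battle_type, outcome);
-- or, alternatively:
CREATE INDEX idx_user_outcome_type ON tblBattleHistory (username, outcome, battle_type);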


8 Comments

I tried both and then ran EXPLAIN, but it is still just using battle_type as the index. Any idea why that would be? It doesn't show it as a possible key either.
What's the difference in indexing in different orders?
Unfortunately, neither index will be useful here, since the grouping happens after the WHERE conditions.
@Kerry: Generally, you specify the columns with the most granularity first, then the columns with lesser granularity.
@Gilbert Le Blanc: miss once again =), see my answer below.
3

First thing would be to make sure you have good indexes (as others have mentioned).

However, it looks like you're creating a leaderboard of some sort for a web page. My first question would be - do you really need to execute this query in real time? Could you create a table in your database (or add a wins and losses column to the users table) with the results of this query and simply refresh it periodically?
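A rough sketch of that approach; the summary table name, column sizes, and refresh statements below are illustrative rather than taken from the question:

CREATE TABLE tblLeaderboard (
    username VARCHAR(64) NOT NULL PRIMARY KEY,
    wins     INT NOT NULL,
    losses   INT NOT NULL
);

-- refresh from a cron job, scheduled event, or application task
TRUNCATE TABLE tblLeaderboard;
INSERT INTO tblLeaderboard (username, wins, losses)
SELECT username, SUM(outcome), COUNT(*) - SUM(outcome)
FROM tblBattleHistory
WHERE battle_type = 0 AND outcome < 2
GROUP BY username;

The page then reads the top 50 directly from tblLeaderboard with a plain ORDER BY wins DESC, losses ASC, username ASC LIMIT 50, which stays cheap no matter how large tblBattleHistory grows.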


3

Let's see what you are doing:

  1. Find rows where battle_type = 0 and outcome < 2
  2. Order by username for grouping
  3. Compute aggregations and fold rows to distinct usernames
  4. Order by dynamically calculated fields

You have no influence on steps 3 and 4. Step 2 in its current form cannot benefit from any index, since outcome < 2 is a range condition (index columns after the first range predicate cannot be used to resolve the GROUP BY); an index on (battle_type, outcome, username) looks very tempting, though.

Assuming that outcome is an enumeration of 0, 1, 2, 3, ..., you can change the range condition to an equality comparison and benefit from an index on (battle_type, outcome, username):

SELECT username, SUM( outcome ) AS wins, COUNT( * ) - SUM( outcome ) AS losses
FROM tblBattleHistory
WHERE battle_type = 0 AND outcome IN (0, 1)
GROUP BY username
ORDER BY wins DESC , losses ASC , username ASC 
LIMIT 0 , 50

If outcome is not an enumeration, an index on (battle_type, outcome) will do. The index on (battle_type) alone is now redundant, since battle_type is a prefix of the compound index.
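A sketch of those index changes; the new index name is made up, and the existing single-column index is assumed to be named battle_type (check SHOW INDEX FROM tblBattleHistory for its real name):

ALTER TABLE tblBattleHistory
    ADD INDEX idx_type_outcome_user (battle_type, outcome, username),
    DROP INDEX battle_type;  -- assumed name of the old single-column index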

1 Comment

This helped some, but it was still looking at over 500,000 rows, so I'm just going to make a table to cache the results and update every 10 minutes.
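A sketch of that kind of timed refresh using MySQL's event scheduler (available since 5.1), assuming a summary table like the hypothetical tblLeaderboard sketched above and that the scheduler is enabled:

-- requires: SET GLOBAL event_scheduler = ON;
CREATE EVENT ev_refresh_leaderboard
ON SCHEDULE EVERY 10 MINUTE
DO
  REPLACE INTO tblLeaderboard (username, wins, losses)
  SELECT username, SUM(outcome), COUNT(*) - SUM(outcome)
  FROM tblBattleHistory
  WHERE battle_type = 0 AND outcome IN (0, 1)
  GROUP BY username;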
