SQL Query performance

Question

I have this table set up in a database with results of games:

Table Players
 id   name
 1    Alice
 2    Bob
 3    Charlie
 ... etc

Table Games
 Player1  Player2  myscore  oppscore  result
 1        3        25       18        W
 3        2        15       20        L
 2        1        17       17        T

myscore refers to Player1, oppscore refers to Player2

I want a query that returns a player's most frequent opponents, along with the win-loss record between them. (I get the win-loss record with a second query on each opponent.)

So I use this:

SELECT count( * ), p2.name "Opponent"
FROM games, players p1, players p2
WHERE p1.name = ?
AND games.gametype = ?
AND games.player1 = p1.id
AND games.player2 = p2.id 
GROUP BY player2, gametype
ORDER BY count( * ) DESC

In order to pick up all games (regardless of who is player1 and who is player2) I store every game TWICE: i.e. I really have:

 Player1  Player2  myscore  oppscore  result
 1        3        25       18        W
 3        1        18       25        L
 3        2        15       20        L
 2        3        20       15        W
 2        1        17       17        T
 1        2        17       17        T

I would like to eliminate that redundancy of the data, thereby reducing the database size by half.

I tried this (where g1 is a table like games, but with the redundant rows eliminated).

create view gv as
   select * from g1
union
   select 
   player2 player1,
   player1 player2,
   (case when result = 'T' then 'T'
         when result = 'W' then 'L'
         when result = 'L' then 'W'
           end) result,
   oppscore myscore,
   myscore oppscore
   from g1

I then run my query against gv instead of against games.

This works, except that it takes (based on one example) more than 10 times as long: 0.10 seconds for games vs 1.4 seconds for gv.

Is there a better way to do this?

It's the same query - just using the view name (gv) instead of the table name (games).
– T G, commented Jul 11, 2015 at 4:12

Drew · Accepted Answer · 2015-07-11 03:10:08Z

1

I think of views as convenience, and unions as slow. Add them together, and you get conveniently slow. OK, that's an overgeneralization.

What performance can you live with?

Denormalized data (redundant and flipped, in your case) certainly has its benefits, namely speed at the expense of wasted space. It's a juggling act.

One thing about your view is that it does a union of two table scans as there is no filter. This gets worse as you add scores. You utilize no index.

Do you really need to look at all the data, when you could have a stored procedure with IN parameters that focuses on indexed player IDs, with a self join or the like?

Indexes can be your best friend with this. Running your queries through MySQL's EXPLAIN can help.
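As a rough sketch of that check, the statements below add the two composite indexes suggested in the comments and then ask MySQL how it plans to run the opponent query against the view. The index names are made up for illustration, and the column names are assumed to match the question's g1 and players tables.

```sql
-- Hypothetical index names; columns are assumed from the question's schema.
CREATE INDEX idx_p1_p2 ON g1 (player1, player2);
CREATE INDEX idx_p2_p1 ON g1 (player2, player1);

-- Show the execution plan instead of running the query.
EXPLAIN
SELECT COUNT(*), p2.name AS Opponent
FROM gv
JOIN players p1 ON gv.player1 = p1.id
JOIN players p2 ON gv.player2 = p2.id
WHERE p1.name = 'Alice'
GROUP BY gv.player2, p2.name
ORDER BY COUNT(*) DESC;
```

In the EXPLAIN output, a row with type = ALL and key = NULL means that table is being scanned in full rather than read through an index.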

Anyway I hope this was helpful in some small way.

8 Comments

Drew Over a year ago

Also, two composite indexes, (player1, player2) and (player2, player1), could yield good results.

T G Over a year ago

I had forgotten about indexes when I created the new non-redundant table for testing, but even after I added them on both (player1, player2) and (player2, player1) the times remained the same.

Drew Over a year ago

What index is your query using when it runs?

T G Over a year ago

As best as I can tell, the query using the view is doing a table scan, which accounts for the slowness. If I do the UNION query directly, which means that the results are not merged between the two parts, then it is using the indexes and it is fast. I guess the tradeoff then is: to reduce the database size, I can put more effort into processing the queries, ... or just waste the space and keep the code simpler. Thanks for your replies.

Drew Over a year ago

stackoverflow.com/questions/10908064/…

Vincent Charette · Accepted Answer · 2015-07-11 21:25:31Z

1

Use UNION ALL instead of UNION in your view. It's faster because UNION ALL does not check for duplicate rows, whereas plain UNION removes them (it behaves like UNION DISTINCT).
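A minimal sketch of the view rewritten this way, assuming g1 has the columns player1, player2, myscore, oppscore, result and gametype (the last one is inferred from the question's query, not from its table listing). Both branches list their columns explicitly, because a union matches columns by position rather than by name.

```sql
-- Assumed g1 columns: player1, player2, myscore, oppscore, result, gametype.
CREATE OR REPLACE VIEW gv AS
SELECT player1, player2, myscore, oppscore, result, gametype
FROM g1
UNION ALL
-- Mirror image of each game: swap the players and scores, flip W and L.
SELECT player2, player1, oppscore, myscore,
       CASE result WHEN 'W' THEN 'L'
                   WHEN 'L' THEN 'W'
                   ELSE result END,
       gametype
FROM g1;
```

UNION ALL also keeps two genuinely identical games as two rows, which matters when counting games per opponent. Note that MySQL cannot use the MERGE algorithm for a view that contains UNION or UNION ALL, so the view may still be materialized before the outer WHERE clause is applied.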

3 Comments

T G Over a year ago

Thanks for the tip - UNION ALL seems to be about 20% faster. On repeated trials of the exact same query, creating the temporary table took 65, 50, 55, 49, 59 ms (avg 56 ms) with UNION and 45, 44, 48, 44, 49 ms (avg 46 ms) with UNION ALL.

Vincent Charette Over a year ago

For more speed, do not create a view. Do not pull the results, only the scores. Try filtering on player1 from g1 in your first query and on player2 in your second query before the union all, then apply a case statement to the resulting table. The case statement may have been forcing the full table scan.
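One way that suggestion might look is sketched below. It assumes g1 stores each game once with the columns player1, player2, myscore, oppscore and gametype, and that the result can be derived from the scores (higher score wins), which holds for the sample rows in the question. The user variables and the gametype value are placeholders for illustration.

```sql
SET @player_id = 1;          -- Alice in the sample data
SET @gametype = 'standard';  -- hypothetical value

SELECT p.name AS Opponent,
       COUNT(*) AS games_played,
       SUM(CASE WHEN t.myscore > t.oppscore THEN 1 ELSE 0 END) AS wins,
       SUM(CASE WHEN t.myscore < t.oppscore THEN 1 ELSE 0 END) AS losses,
       SUM(CASE WHEN t.myscore = t.oppscore THEN 1 ELSE 0 END) AS ties
FROM (
    -- Games where the player is stored as player1.
    SELECT player2 AS opponent, myscore, oppscore
    FROM g1
    WHERE player1 = @player_id AND gametype = @gametype
    UNION ALL
    -- Games where the player is stored as player2: swap the scores.
    SELECT player1 AS opponent, oppscore AS myscore, myscore AS oppscore
    FROM g1
    WHERE player2 = @player_id AND gametype = @gametype
) AS t
JOIN players p ON p.id = t.opponent
GROUP BY t.opponent, p.name
ORDER BY games_played DESC;
```

Because each branch filters on a leading index column before the union, the (player1, player2) and (player2, player1) indexes can each serve one branch. The conditional sums also return the win-loss record in the same pass, instead of a second query per opponent.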

T G Over a year ago

Right - I have junked the view; instead I use a TEMPORARY TABLE with only the rows for player1 and with an index on player2 ... it now runs pretty fast. I think it was the view itself that was causing the table scan.
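The exact statements are not shown in the thread, so the following is only a guess at what such a temporary table could look like when built from the non-redundant g1. The table name, index name and user variable are hypothetical, and the gametype filter is left out for brevity.

```sql
SET @player_id = 1;  -- Alice in the sample data

-- Rows where the player is stored as player1.
CREATE TEMPORARY TABLE player_games AS
SELECT player2 AS opponent, myscore, oppscore
FROM g1
WHERE player1 = @player_id;

-- Rows where the player is stored as player2, with the scores swapped.
INSERT INTO player_games (opponent, myscore, oppscore)
SELECT player1, oppscore, myscore
FROM g1
WHERE player2 = @player_id;

CREATE INDEX idx_opponent ON player_games (opponent);

SELECT p.name AS Opponent, COUNT(*) AS games_played
FROM player_games pg
JOIN players p ON p.id = pg.opponent
GROUP BY pg.opponent, p.name
ORDER BY games_played DESC;

DROP TEMPORARY TABLE player_games;
```

The temporary table holds only one player's games, so any follow-up query per opponent works on a small, indexed set instead of the whole games table.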
