
I have these tables set up in a database holding the results of games:

Table Players
 id   name
 1    Alice
 2    Bob
 3    Charlie
 ...  etc.

Table Games
 Player1  Player2  myscore  oppscore  result
 1        3        25       18        W
 3        2        15       20        L
 2        1        17       17        T

myscore refers to Player1; oppscore refers to Player2.

I want a query that returns a player's most frequent opponents, along with the win-loss record between them. (I get the win-loss record with a second query on each opponent.)

So I use this:

SELECT count( * ), p2.name "Opponent"
FROM games, players p1, players p2
WHERE p1.name = ?
AND games.gametype = ?
AND games.player1 = p1.id
AND games.player2 = p2.id 
GROUP BY player2, gametype
ORDER BY count( * ) DESC

To pick up all games, regardless of who is player1 and who is player2, I store every game TWICE, so the table really contains:

 Player1  Player2  myscore  oppscore  result
 1        3        25       18        W
 3        1        18       25        L
 3        2        15       20        L
 2        3        20       15        W
 2        1        17       17        T
 1        2        17       17        T

I would like to eliminate that redundancy of the data, thereby reducing the database size by half.

I tried this (where g1 is a table like games, but with the redundant rows eliminated).

create view gv as
   select * from g1
union
   select 
   player2 player1,
   player1 player2,
   (case when result = 'T' then 'T'
         when result = 'W' then 'L'
         when result = 'L' then 'W'
           end) result,
   oppscore myscore,
   myscore oppscore
   from g1

I then run my query against gv instead of against games.

This works, except that, based on one example, it takes more than 10 times as long: 0.10 seconds against games versus 1.4 seconds against gv.

Is there a better way to do this?
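
For readers who want to reproduce the setup, here is a minimal sketch of the deduplicated table and the flipped view. It uses Python's built-in sqlite3 as a stand-in for MySQL, so it only checks that the view returns the right rows and says nothing about MySQL's timing. The gametype column is left out, and the column lists are spelled out explicitly (rather than `select *`) because UNION matches columns by position, not by name; both choices are assumptions of this sketch.

```python
import sqlite3

# SQLite stands in for MySQL; table and column names follow the question.
# gametype is omitted for brevity (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE g1 (player1 INT, player2 INT, myscore INT, oppscore INT, result TEXT);
INSERT INTO g1 VALUES (1, 3, 25, 18, 'W'), (3, 2, 15, 20, 'L'), (2, 1, 17, 17, 'T');

-- Both branches list their columns in the same order, because UNION
-- pairs columns up by position.
CREATE VIEW gv AS
    SELECT player1, player2, myscore, oppscore, result FROM g1
UNION
    SELECT player2, player1, oppscore, myscore,
           CASE result WHEN 'W' THEN 'L' WHEN 'L' THEN 'W' ELSE 'T' END
    FROM g1;
""")

# Every game now appears once from each player's point of view.
view_rows = sorted(conn.execute("SELECT * FROM gv").fetchall())
for row in view_rows:
    print(row)
```

With the three sample games, the view yields the same six rows as the doubled table shown in the question.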

  • Can you post the query you're running against that view? Commented Jul 11, 2015 at 3:14
  • It's the same query - just using the view name (gv) instead of the table name (games). Commented Jul 11, 2015 at 4:12

2 Answers


I think of views as convenient, and unions as slow. Add them together, and you get conveniently slow. OK, that's an overgeneralization.

What performance can you live with?

Denormalized data (redundant and flipped, in your case) certainly has its benefits, namely speed at the expense of wasted space. It's a juggling act.

One problem with your view is that it unions two full table scans, because neither branch has a filter. This gets worse as you add scores, and no index is used.

Do you really need to look at all the data, when you could have a stored procedure with IN parameters that focuses on indexed player ids, with a self-join or the like?

Indexes can be your best friend here. Running your queries through MySQL's EXPLAIN can help.

Anyway I hope this was helpful in some small way.


8 Comments

Also, two composite indexes, (player1, player2) and (player2, player1), could yield good results.
I had forgotten about indexes when I created the new non-redundant table for testing, but even after I added them on both (player1, player2) and (player2, player1), the times remained the same.
What index is your query using when it runs?
As best as I can tell, the query using the view is doing a table scan, which accounts for the slowness. If I do the UNION query directly, which means that the results are not merged between the two parts, then it is using the indexes and it is fast. I guess the tradeoff then is: to reduce the database size, I can put more effort into processing the queries, ... or just waste the space and keep the code simpler. Thanks for your replies.
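
The "direct UNION" approach from the last comment might look roughly like the sketch below: each branch filters on the player's id before the rows are combined, so each branch can use one of the two composite indexes. This is an illustration under stated assumptions, not the asker's actual query. It runs on Python's built-in sqlite3 rather than MySQL, omits gametype, and uses hypothetical names (g1_p1_p2, g1_p2_p1, frequent_opponents). The last sample row is a made-up extra game, and folding the win-loss counts into the same query is an optional extra.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO players VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie');

CREATE TABLE g1 (player1 INT, player2 INT, myscore INT, oppscore INT, result TEXT);
-- The question's three games plus one hypothetical extra game (last row).
INSERT INTO g1 VALUES
    (1, 3, 25, 18, 'W'), (3, 2, 15, 20, 'L'), (2, 1, 17, 17, 'T'), (3, 1, 10, 21, 'L');

-- The two composite indexes suggested in the comments (index names are made up).
CREATE INDEX g1_p1_p2 ON g1 (player1, player2);
CREATE INDEX g1_p2_p1 ON g1 (player2, player1);
""")

# Filter inside each branch, flip the result in the second branch,
# and only then aggregate per opponent.
OPPONENTS_SQL = """
SELECT p.name AS opponent,
       COUNT(*) AS games,
       SUM(CASE WHEN t.result = 'W' THEN 1 ELSE 0 END) AS wins,
       SUM(CASE WHEN t.result = 'L' THEN 1 ELSE 0 END) AS losses
FROM (
    SELECT player2 AS opp, result FROM g1 WHERE player1 = :me
    UNION ALL
    SELECT player1 AS opp,
           CASE result WHEN 'W' THEN 'L' WHEN 'L' THEN 'W' ELSE 'T' END AS result
    FROM g1 WHERE player2 = :me
) AS t
JOIN players p ON p.id = t.opp
GROUP BY t.opp, p.name
ORDER BY games DESC, p.name
"""

def frequent_opponents(conn, name):
    """Return (opponent, games, wins, losses) rows for the named player."""
    row = conn.execute("SELECT id FROM players WHERE name = ?", (name,)).fetchone()
    if row is None:
        return []
    return conn.execute(OPPONENTS_SQL, {"me": row[0]}).fetchall()

rows = frequent_opponents(conn, "Alice")
print(rows)
```

Whether MySQL actually picks the indexes for each branch is something to confirm with EXPLAIN on the real table.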

Use UNION ALL instead of UNION in your view. It's typically faster, because UNION ALL does not check for duplicate rows, whereas plain UNION (which removes duplicates by default) does.
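
Besides speed, the two operators can return different row counts. The small sketch below (Python's built-in sqlite3 standing in for MySQL, with made-up data) assumes the games table has no distinguishing column such as a game id or date. In that case two games with identical players and scores collapse into one under UNION, which would undercount games played.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE g1 (player1 INT, player2 INT, myscore INT, oppscore INT, result TEXT);
-- Hypothetical data: players 2 and 1 tie 17-17 on two separate occasions.
INSERT INTO g1 VALUES (2, 1, 17, 17, 'T'), (2, 1, 17, 17, 'T');
""")

# Same flipped query as the view, with the set operator left as a placeholder.
FLIPPED = """
SELECT player1, player2, myscore, oppscore, result FROM g1
{op}
SELECT player2, player1, oppscore, myscore,
       CASE result WHEN 'W' THEN 'L' WHEN 'L' THEN 'W' ELSE 'T' END
FROM g1
"""

union_count = len(conn.execute(FLIPPED.format(op="UNION")).fetchall())
union_all_count = len(conn.execute(FLIPPED.format(op="UNION ALL")).fetchall())
print(union_count, union_all_count)  # 2 rows with UNION, 4 with UNION ALL
```

If every row carries a unique key, the two operators return the same rows and only the cost of the duplicate check differs.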

3 Comments

Thanks for the tip - UNION ALL seems to be about 20% faster. On repeated trials of the exact same query, UNION (which creates a temporary table) took 65, 50, 55, 49, 59 ms (avg 56 ms) and UNION ALL took 45, 44, 48, 44, 49 ms (avg 46 ms).
For more speed, do not create a view, and do not pull the results, only the scores. Try filtering on player1 from g1 in your first query and on player2 in your second query before the UNION ALL, then apply a CASE expression to the resulting table. The CASE statement may have been forcing the full table scan.
Right - I have junked the view; instead I use a TEMPORARY TABLE with only the rows for player1 and with an index on player2 ... it now runs pretty fast. I think it was the view itself that was causing the table scan.
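
The temporary-table approach in the last comment is not spelled out in the thread, so the following is only a guess at its shape: copy one player's games into a temporary table, index it by opponent, and aggregate from there. It uses Python's built-in sqlite3 instead of MySQL, and the names my_games, my_games_opponent and frequent_opponents are hypothetical, as is the last sample row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO players VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie');

CREATE TABLE g1 (player1 INT, player2 INT, myscore INT, oppscore INT, result TEXT);
-- The question's three games plus one hypothetical extra game (last row).
INSERT INTO g1 VALUES
    (1, 3, 25, 18, 'W'), (3, 2, 15, 20, 'L'), (2, 1, 17, 17, 'T'), (3, 1, 10, 21, 'L');
""")

def frequent_opponents(conn, player_id):
    """Count games per opponent via a per-player temporary table."""
    conn.execute("DROP TABLE IF EXISTS my_games")
    conn.execute("CREATE TEMPORARY TABLE my_games (opponent INT, result TEXT)")
    # Only this player's games are copied, seen from this player's side.
    conn.execute(
        """
        INSERT INTO my_games
        SELECT player2, result FROM g1 WHERE player1 = :me
        UNION ALL
        SELECT player1, CASE result WHEN 'W' THEN 'L' WHEN 'L' THEN 'W' ELSE 'T' END
        FROM g1 WHERE player2 = :me
        """,
        {"me": player_id},
    )
    conn.execute("CREATE INDEX my_games_opponent ON my_games (opponent)")
    return conn.execute(
        """
        SELECT p.name, COUNT(*) AS games
        FROM my_games m
        JOIN players p ON p.id = m.opponent
        GROUP BY m.opponent, p.name
        ORDER BY games DESC, p.name
        """
    ).fetchall()

opponents = frequent_opponents(conn, 1)
print(opponents)
```

The per-opponent win-loss queries mentioned in the question could then run against my_games instead of the full table.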
