I have created an application to track League of Legends progress for me and my friends. For this purpose, I record each player's current rank in my MySQL database several times a day. To fetch the results and show them in a graph, I use the following queries:

SELECT 
    lol_summoner.name as name, grid.series + ? as timestamp, 
    AVG(NULLIF(lol.points, 0)) as points
FROM 
    series_tmp grid
JOIN 
    lol ON lol.timestamp >= grid.series AND lol.timestamp < grid.series + ?
JOIN 
    lol_summoner ON lol.summoner = lol_summoner.id
GROUP BY
    lol_summoner.name, grid.series
ORDER BY
    name, timestamp ASC

SELECT 
    lol_summoner.name as name, grid.series + ? as timestamp, 
    AVG(NULLIF(lol.points, 0)) as points
FROM 
    series_tmp grid
JOIN 
    lol ON lol.timestamp >= grid.series AND lol.timestamp < grid.series + ?
JOIN 
    lol_summoner ON lol.summoner = lol_summoner.id
WHERE 
    lol_summoner.name IN (". str_repeat('?, ', count($names) - 1) ."?)
GROUP BY
    lol_summoner.name, grid.series
ORDER BY
    name, timestamp ASC

The first query is used when I want to retrieve all players saved in the database. The grid table is a temporary table that holds timestamps generated at a fixed interval, so that the information can be retrieved in chunks of that interval. The two parameters in this query are derived from the interval. The second query is used when I want to retrieve information for specific players only.

The grid table is produced by the following stored procedure, which is called with three parameters (n_first - first timestamp, n_last - last timestamp, n_increment - increment between two timestamps):

BEGIN
    -- Create tmp table
    DROP TEMPORARY TABLE IF EXISTS series_tmp;
    CREATE TEMPORARY TABLE series_tmp (
        series bigint
    ) engine = memory;

    WHILE n_first <= n_last DO
        -- Insert in tmp table
        INSERT INTO series_tmp (series) VALUES (n_first);

        -- Advance to the next timestamp in the series
        SET n_first = n_first + n_increment; 
    END WHILE;
END

The query works and finishes in a reasonable time (~10 seconds), but I would be grateful for any help improving it, either by rewriting the query or by adding indexes to the database.

/Edit:

After reviewing @Rick James's answer, I modified the queries as follows:

SELECT lol_summoner.name as name, (lol.timestamp div :range) * :range + :half_range as timestamp, AVG(NULLIF(lol.points, 0)) as points
  FROM lol
    JOIN lol_summoner ON lol.summoner = lol_summoner.id
  GROUP by lol_summoner.name, lol.timestamp div :range
  ORDER by name, timestamp ASC

SELECT lol_summoner.name as name, (lol.timestamp div :range) * :range + :half_range as timestamp, AVG(NULLIF(lol.points, 0)) as points
  FROM lol
    JOIN lol_summoner ON lol.summoner = lol_summoner.id
  WHERE lol_summoner.name IN (<NAMES>)
  GROUP by lol_summoner.name, lol.timestamp div :range
  ORDER by name, timestamp ASC

This improves the query execution time by a wide margin (it now finishes in well under 1 second).

  • Please add your definition of the grid object, since it sounds like it's a query or view rather than just a table holding data. If it really is just a temp table, add the schema and show how you populate it. Commented Jan 7, 2016 at 15:00
  • @EsotericScreenName: added information about this table Commented Jan 7, 2016 at 15:12
  • In your query, what value goes into the parameter where you say "grid.series + ?" Is that the same as the increment you are using when you generate the grid table? If it is, it might be better to calculate the second number and save that to the grid table itself so that you're not using the expression in the other query. Also, if the grid table is largish, it's probably worth adding an index to it. Can't really hurt. Commented Jan 7, 2016 at 15:30
  • The value is half the increment, which gives a timestamp that represents the whole range. The problem with adding an index is that the table is created on demand, so building the index takes time as well. Would it really be faster in this case? Commented Jan 7, 2016 at 15:36

1 Answer

Problem 1 and Solution

You need a series of integers between two values? And they differ by 1? Or by some larger value?

First, create a permanent table of the numbers from 0 to some large enough value:

CREATE TABLE Num10 ( n INT );
INSERT INTO Num10 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
CREATE TABLE Nums ( n INT, PRIMARY KEY(n))
    SELECT a.n*1000 + b.n*100 + c.n*10 + d.n AS n  -- alias must match the column name
        FROM Num10 AS a
        JOIN Num10 AS b  -- note "cross join"
        JOIN Num10 AS c
        JOIN Num10 AS d;

Now Nums has 0..9999. (Make it bigger if you might need more.)

To get a sequence of consecutive numbers from 123 through 234:

 SELECT 123 + n FROM Nums WHERE n < 234-123+1;

To get a sequence of consecutive numbers from 12345 through 23456, in steps of 15:

 SELECT 12345 + 15*n FROM Nums WHERE n < (23456-12345+1)/15;

JOIN to a SELECT like one of those instead of to series_tmp.
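
As a rough sketch of that idea, the first query from the question could join to a derived table built from Nums in place of series_tmp. The placeholders :first, :last, :range and :half_range are assumptions standing in for the first timestamp, the last timestamp, the interval and half the interval; they are not names from the original post.

```sql
-- Sketch only: replaces the temporary table with a derived table over Nums.
-- :first, :last, :range and :half_range are hypothetical placeholder names.
SELECT lol_summoner.name AS name,
       grid.series + :half_range AS timestamp,
       AVG(NULLIF(lol.points, 0)) AS points
FROM (
        -- Same values the stored procedure would insert:
        -- :first, :first + :range, ... up to at most :last
        SELECT :first + :range * n AS series
        FROM Nums
        WHERE n < (:last - :first + 1) / :range
     ) AS grid
JOIN lol
  ON lol.timestamp >= grid.series
 AND lol.timestamp <  grid.series + :range
JOIN lol_summoner
  ON lol.summoner = lol_summoner.id
GROUP BY lol_summoner.name, grid.series
ORDER BY name, timestamp ASC;
```

Nums has to hold at least (:last - :first + 1) / :range rows, otherwise the generated series stops short of :last.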

Barring other issues, that should significantly speed things up.

Problem 2

You are GROUPing BY series, but ORDERing by timestamp. They are related, so you might get the 'right' answer. But think about it.

Problem 3

You seem to be building "buckets" (called "series"?) from "timestamps". Is this correct? If so, let's work backwards -- Turn a "timestamp" into a "bucket" number:

bucket_number = (timestamp - start) / bucket_size

By doing that throughout, you can avoid 'Problem 1' and eliminate my solution to it. That is, reformulate the queries entirely in terms of buckets.
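
A minimal sketch of such a bucket-based query, assuming integer division (DIV) is what is wanted and that only a bounded time window is of interest. The placeholders :start, :end and :bucket_size are assumptions, not names from the question.

```sql
-- Sketch only: each row is assigned to a bucket by integer division,
-- so no series table is needed at all.
-- :start, :end and :bucket_size are hypothetical placeholder names.
SELECT lol_summoner.name AS name,
       -- first timestamp of the bucket this row falls into
       :start + ((lol.timestamp - :start) DIV :bucket_size) * :bucket_size
           AS bucket_start,
       AVG(NULLIF(lol.points, 0)) AS points
FROM lol
JOIN lol_summoner
  ON lol.summoner = lol_summoner.id
WHERE lol.timestamp >= :start
  AND lol.timestamp <  :end
GROUP BY lol_summoner.name, bucket_start
ORDER BY lol_summoner.name, bucket_start;
```

For example, with :start = 0 and :bucket_size = 3600, the timestamp 1452178923 falls into bucket number 403383, which starts at 1452178800. Grouping and ordering by the same bucket_start expression also sidesteps the mismatch described in Problem 2.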

2 Comments

Thanks for your answer. I changed the query according to your third suggested solution (see above). Is 2 still a problem?
Probably works ok as you have posted in your Edit. Test it.
