
I have a MySQL table like this one:

day     int(11) 
hour    int(11)    
amount  int(11)   

day is an integer ranging from 0 to 365; assume hour is a timestamp and amount is a plain integer. I want to select the value of the amount field for a range of days (for example, from 0 to 10), but I only need the last value of amount available for each day, which in practice is the row where hour has its maximum value within that day. This doesn't sound too hard, but the solution I came up with is very inefficient.

Here it is:

SELECT q.day, q.amount 
    FROM amt_table q 
    WHERE q.day >= 0 AND q.day <= 4 AND q.hour = (
        SELECT MAX(p.hour) FROM amt_table p WHERE p.day = q.day
    ) GROUP BY day

That query takes 5 seconds to execute on an 11k-row table, and it only covers a span of 5 days; I may need to select a span of an entire month or year, so this is not a viable solution.

Any help finding another solution or optimizing this one would be really appreciated.

EDIT

No indexes are set, but (day, hour, amount) could be a PRIMARY KEY if needed

  • What indexes are set? Irrespective, please post your table CREATE statement (or schema dump); it'll make it easier for people to make suggestions. Commented Jan 20, 2011 at 23:27
  • Actually, no indexes are set, since I don't know how to set them properly to help overall performance. Anyway, as I said, (day, hour, amount) could be a primary key. Commented Jan 20, 2011 at 23:32

2 Answers

Use:

SELECT a.day, 
       a.amount
  FROM amt_table a
  JOIN (SELECT t.day,
               MAX(t.hour) AS max_hour
          FROM amt_table t
      GROUP BY t.day) b ON b.day = a.day
                       AND b.max_hour = a.hour
 WHERE a.day BETWEEN 0 AND 4

I think you're using the GROUP BY day just to get a single amount value per day, but that's not reliable: in MySQL, the value returned for a non-aggregated column that isn't listed in the GROUP BY is arbitrary, so it could change between runs. Sadly, MySQL doesn't yet support analytic functions (ROW_NUMBER, etc.), which is what you'd typically use for cases like these.
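For readers on MySQL 8.0 or later, where window functions are available, the analytic approach mentioned above could be sketched roughly as follows. This assumes the amt_table name and columns from the question; the alias ranked and the column rn are illustrative names only.

```sql
-- Assumes MySQL 8.0+ (window functions are not available in older versions).
-- Numbers the rows within each day by hour, latest first,
-- then keeps only the first row of each day.
SELECT day, amount
  FROM (SELECT t.day,
               t.amount,
               ROW_NUMBER() OVER (PARTITION BY t.day
                                      ORDER BY t.hour DESC) AS rn
          FROM amt_table t
         WHERE t.day BETWEEN 0 AND 4) ranked
 WHERE rn = 1;
```

One difference from the JOIN version: if two rows in a day share the same maximum hour, ROW_NUMBER keeps exactly one of them (chosen arbitrarily), whereas the JOIN returns both.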

Look at indexes on the primary keys first, then add indexes on the columns used to join tables together. Composite indexes (more than one column to an index) are an option too.
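As a rough sketch of the composite-index suggestion, assuming the table is named amt_table as in the question: an index on (day, hour) should let MySQL find MAX(hour) per day, and match the join condition, from the index rather than by scanning the table. The index name idx_day_hour is hypothetical.

```sql
-- Hypothetical index name. The column order (day, hour) matches the
-- GROUP BY day / MAX(hour) in the derived table and the join on (day, hour).
CREATE INDEX idx_day_hour ON amt_table (day, hour);

-- Alternative mentioned in the question: use all three columns as the primary key.
-- This only works if every (day, hour, amount) combination is unique and non-NULL.
-- ALTER TABLE amt_table ADD PRIMARY KEY (day, hour, amount);
```

Running EXPLAIN on the query before and after adding the index is a simple way to confirm whether MySQL actually uses it.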


2 Comments

Thank you, this is way faster than my solution. Now I just need to understand what you did :)
@Jack Duluoz: I changed your correlated subquery into a derived table, and used a JOIN to access it. The aggregate is performed once, vs potentially multiple times due to the correlation.

I think the problem is the subquery in the WHERE clause. MySQL will first evaluate "SELECT MAX(p.hour) FROM amt_table p WHERE p.day = q.day" for the whole table and only afterwards select the days. Not very efficient :-)
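One way to check how MySQL treats that subquery, as a sketch: prefix the query with EXPLAIN. A correlated subquery is typically reported with select_type DEPENDENT SUBQUERY, meaning it may be re-evaluated for each candidate row of the outer query. The exact plan depends on the MySQL version and on which indexes exist.

```sql
-- Shows the execution plan instead of running the query.
-- The GROUP BY from the question is left out here to keep the focus
-- on how the subquery in the WHERE clause is evaluated.
EXPLAIN
SELECT q.day, q.amount
  FROM amt_table q
 WHERE q.day >= 0 AND q.day <= 4
   AND q.hour = (SELECT MAX(p.hour)
                   FROM amt_table p
                  WHERE p.day = q.day);
```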
