Help needed optimizing MySQL SELECT query

Question

I have a MySQL table like this one:

day     int(11) 
hour    int(11)    
amount  int(11)

Day is an integer ranging from 0 to 365; assume hour is a timestamp and amount is just a simple integer. What I want to do is select the value of the amount field for a certain range of days (for example, from 0 to 10), but I only need the last value of amount available for each day, which in practice is the row where the hour field has its maximum value within that day. This doesn't sound too hard, but the solution I came up with is completely inefficient.

Here it is:

SELECT q.day, q.amount 
    FROM amt_table q 
    WHERE q.day >= 0 AND q.day <= 4 AND q.hour = (
        SELECT MAX(p.hour) FROM amt_table p WHERE p.day = q.day
    ) GROUP BY day

It takes 5 seconds to execute that query on an 11k-row table, and that is for a span of just 5 days; I may need to select a span of an entire month or year, so this is not a viable solution.

Any help finding another solution or optimizing this one is really appreciated.

EDIT

No indexes are set, but (day, hour, amount) could be a PRIMARY KEY if needed

What indexes are set? Irrespective, please post your table CREATE statement (or schema dump); it'll make it easier for people to make suggestions.
– John Parker, Commented Jan 20, 2011 at 23:27
Actually, no indexes are set, since I don't know how to set them properly to help overall performance. Anyway, as I said, (day, hour, amount) could be a primary key.
– Ailef, Commented Jan 20, 2011 at 23:32

OMG Ponies · Accepted Answer · 2011-01-20 23:30:20Z

4

Use:

SELECT a.day, 
       a.amount
  FROM amt_table a
  JOIN (SELECT t.day,
               MAX(t.hour) AS max_hour
          FROM amt_table t
      GROUP BY t.day) b ON b.day = a.day
                       AND b.max_hour = a.hour
 WHERE a.day BETWEEN 0 AND 4

I think you're using the GROUP BY day just to get a single amount value per day, but that's not reliable: in MySQL, the values of columns that are neither in the GROUP BY nor aggregated are arbitrary, so the value could change. Sadly, MySQL doesn't yet support analytic functions (ROW_NUMBER, etc.), which is what you'd typically use for cases like these.
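For readers on MySQL 8.0 or later, which did add window functions, the ROW_NUMBER approach mentioned above might look roughly like the sketch below. It assumes the table name amt_table from the question; the alias ranked and the column rn are made up for illustration.

```sql
-- Assumes MySQL 8.0+; window functions are not available in 5.x.
SELECT ranked.day,
       ranked.amount
  FROM (SELECT t.day,
               t.amount,
               -- Number each day's rows from the latest hour to the earliest.
               ROW_NUMBER() OVER (PARTITION BY t.day
                                      ORDER BY t.hour DESC) AS rn
          FROM amt_table t
         WHERE t.day BETWEEN 0 AND 4) ranked
 WHERE ranked.rn = 1;  -- keep only the latest row of each day
```

One difference from the JOIN version: if two rows of the same day share the maximum hour, the JOIN returns both, while this returns exactly one of them (which one is not defined unless you add a tie-breaker to the ORDER BY).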

Look at indexes on the primary keys first, then add indexes on the columns used to join tables together. Composite indexes (more than one column to an index) are an option too.
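As a minimal sketch of a composite index for this particular query, assuming the table is named amt_table as in the question (the index name idx_day_hour is hypothetical):

```sql
-- day comes first because it is both the range filter and the join key;
-- hour comes second so MAX(hour) per day can be read from the index.
ALTER TABLE amt_table
  ADD INDEX idx_day_hour (day, hour);
```

Whether the optimizer actually uses it depends on your MySQL version and data, so it is worth checking with EXPLAIN before and after.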


2 Comments

Ailef Over a year ago

Thank you, this is way faster than my solution. Now I just need to understand what you did :)

OMG Ponies Over a year ago

@Jack Duluoz: I changed your correlated subquery into a derived table, and used a JOIN to access it. The aggregate is performed once, vs potentially multiple times due to the correlation.
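One way to see this difference yourself is to prefix both queries with EXPLAIN and compare the select_type column. This is only a sketch, again assuming the table name amt_table from the question; the exact plan depends on the MySQL version and optimizer settings.

```sql
-- Correlated form: the inner SELECT typically shows up as DEPENDENT SUBQUERY,
-- meaning it may be re-evaluated for each candidate outer row.
EXPLAIN
SELECT q.day, q.amount
  FROM amt_table q
 WHERE q.day BETWEEN 0 AND 4
   AND q.hour = (SELECT MAX(p.hour)
                   FROM amt_table p
                  WHERE p.day = q.day);

-- Derived-table form: the grouped SELECT typically shows up as DERIVED,
-- meaning it is materialized once and then joined.
EXPLAIN
SELECT a.day, a.amount
  FROM amt_table a
  JOIN (SELECT t.day, MAX(t.hour) AS max_hour
          FROM amt_table t
      GROUP BY t.day) b ON b.day = a.day
                       AND b.max_hour = a.hour
 WHERE a.day BETWEEN 0 AND 4;
```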

Marko · Answer · 2011-01-20 23:31:26Z

0

I think the problem is the subquery in the WHERE clause. MySQL will first evaluate "SELECT MAX(p.hour) FROM amt_table p WHERE p.day = q.day" for the whole table and only afterwards select the days. Not very efficient :-)
