Below is my locations table:

    CREATE TABLE locations (
        `id` int(11),
        `user_id` int(11),
        `timestamp` datetime,
        PRIMARY KEY (`id`),
        KEY `index_on_timestamp` (`timestamp`),
        KEY `index_on_user_id` (`user_id`)
    );

I run the following query to retrieve a given user’s location for the past week:

    SELECT * FROM locations
    WHERE user_id = 20
    AND timestamp > NOW() - INTERVAL 7 DAY;

We have separate indexes on user_id and on timestamp, but this query still takes a long time to run. I need help fixing this.

  • Post the output of the EXPLAIN statement here. That will give us some indication of what the query is doing. Commented Mar 21, 2014 at 2:01
  • Sorry, I won't be able to run EXPLAIN on the prod DB. Commented Mar 21, 2014 at 2:04

2 Answers

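You want a single composite index on both user_id and timestamp, in that order:

    create index user_locations_user_id_timestamp on user_locations(user_id, timestamp);

This will fully satisfy the where clause: the equality on user_id narrows the index to one user's entries, and the range condition on timestamp is then resolved within those entries. Two separate single-column indexes cannot do this as efficiently, because MySQL generally has to pick one of them and filter the remaining rows.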
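
Since EXPLAIN cannot be run on the production database, one way to check the effect of the index is to run it on a staging copy. The sketch below assumes such a copy exists and that the table is named user_locations, as in the index statement above. If the composite index is picked up, the `key` column of the output should name it, and `type` would typically be `range`.

```sql
-- Sketch only: run against a staging copy, not production.
-- Assumes the composite index user_locations_user_id_timestamp already exists.
EXPLAIN
SELECT *
FROM user_locations
WHERE user_id = 20
  AND `timestamp` > NOW() - INTERVAL 7 DAY;
```

A `rows` estimate far below the table size is a further hint that only one user's recent entries are being read.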

2 Comments

I need help with one more thing: which of these two queries is better? SELECT COUNT(*) FROM user_locations WHERE YEAR(timestamp)=2013 AND MONTH(timestamp)=11; or SELECT COUNT(*) FROM user_locations WHERE timestamp >= '2013-11-01' AND timestamp < '2013-12-01';
@Raghu . . . The second is much better, in terms of using an index on timestamp. Functions on columns typically prevent an index from being used.
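
The half-open range in the second query generalizes to any month. A minimal sketch, assuming a hypothetical session variable `@month_start` that holds the first day of the month of interest:

```sql
-- @month_start is an illustrative name, not part of the original schema.
SET @month_start = '2013-11-01';

-- The column is compared directly against constants, so an index on
-- `timestamp` remains usable; no function is applied to the column itself.
SELECT COUNT(*)
FROM user_locations
WHERE `timestamp` >= @month_start
  AND `timestamp` <  @month_start + INTERVAL 1 MONTH;
```

Using `>=` on the lower bound and `<` on the upper bound avoids having to work out the last instant of the month.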
You need to look into partitioning your table by date. That should help a tremendous amount. http://dev.mysql.com/doc/refman/5.1/en/partitioning-types.html

This is untested, but try something similar to this:

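    CREATE TABLE user_locations (
        `id` int(11),
        `user_id` int(11),
        `state` varchar(255),
        `country` varchar(255),
        `timestamp` datetime,
        -- The partitioning column must be part of every unique key,
        -- so `timestamp` is added to the primary key here.
        PRIMARY KEY (`id`, `timestamp`),
        KEY `index_on_timestamp` (`timestamp`),
        KEY `index_on_user_id` (`user_id`)
    )
    -- PARTITION BY KEY takes column names, not index names.
    PARTITION BY KEY(`timestamp`)
    PARTITIONS 6;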

To modify your table you would use something like this:

    -- Assumes `timestamp` is already part of the primary key.
    ALTER TABLE user_locations
        PARTITION BY KEY(`timestamp`)
        PARTITIONS 6;
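
KEY partitioning spreads rows across partitions by a hash of the column, so a date-range query may still touch every partition. If the goal is to partition by date, RANGE partitioning is the usual alternative. The following is an untested sketch; the partition names and monthly boundaries are illustrative assumptions, not taken from the question.

```sql
-- MySQL requires every unique key, including the primary key, to contain
-- the partitioning column, so the primary key is widened first.
ALTER TABLE user_locations
    DROP PRIMARY KEY,
    ADD PRIMARY KEY (`id`, `timestamp`);

-- One partition per month; names and boundaries are placeholders.
ALTER TABLE user_locations
    PARTITION BY RANGE (TO_DAYS(`timestamp`)) (
        PARTITION p2014_01 VALUES LESS THAN (TO_DAYS('2014-02-01')),
        PARTITION p2014_02 VALUES LESS THAN (TO_DAYS('2014-03-01')),
        PARTITION p2014_03 VALUES LESS THAN (TO_DAYS('2014-04-01')),
        PARTITION p_future VALUES LESS THAN MAXVALUE
    );
```

With this layout, a query restricted to the last seven days should only need the most recent one or two partitions, though both statements rebuild the table and would need to be scheduled accordingly.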

5 Comments

But I'm querying data for just 7 days for a specific user. I don't see how partitioning will help the query run faster.
The whole point of partitioning is to give a smaller dataset to seek through. If you limit what you must go through, then the query may find what it needs quicker. The query is very time-specific and there are a lot of rows in the database. A perfect candidate for partitioning.
Partitioning may not be the optimal solution, but I don't think the suggestion deserves a downvote. Partitioning could help, but not as much as an appropriate index.
@GordonLinoff: Would both of our answers combined provide the 'best' solution? Proper index AND partitioning? If you think so, would you still suggest partitioning by date, or by the new 'appropriate' index instead?
@CenterOrbit . . . That would depend on several factors. If all queries used only the most recent handful of partitions, then only the indexes for those partitions would need to be loaded into memory. So, if memory availability limits the performance of queries, then this is a good thing, but it might be a negligible effect. In general, though, it sounds like the OP might benefit from partitioning, just not so much for this particular query.
