
I've noticed a serious problem recently, now that my database has grown to over 620000 records. The following query:

SELECT *,UNIX_TIMESTAMP(`time`) AS `time` FROM `log` WHERE (`projectname`="test" OR `projectname` IS NULL)  ORDER BY `time` DESC LIMIT 0, 20

has an execution time of about 2.5 s on a local database. How can I speed it up?

The EXPLAIN command produces the following output:

ID: 1
select type: SIMPLE
TABLE: log
type: ref_or_null
possible_keys: projectname
key: projectname
key_len: 387
ref: const
rows: 310661
Extra: Using where; Using filesort

I've got indexes set on the projectname and time columns.

Any help?

EDIT: Thanks to ypercube's response, I was able to decrease the query execution time. But as soon as I add another condition to the WHERE clause (AND severity="changes"), it takes 2 s again. Is it a good solution to include all of the possible WHERE columns in my merged index?

ID: 1 
select type: SIMPLE 
TABLE: log 
type: ref_or_null 
possible_keys: projectname 
key: projectname 
key_len: 419 
ref: const, const 
rows: 315554 
Extra: Using where; Using filesort

Table structure:

CREATE TABLE `log` (
  `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `projectname` VARCHAR(128) DEFAULT NULL,
  `time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `master` VARCHAR(128) NOT NULL,
  `itemName` VARCHAR(128) NOT NULL,
  `severity` VARCHAR(10) NOT NULL DEFAULT 'info',
  `message` VARCHAR(255) NOT NULL,
  `more` TEXT NOT NULL,
  PRIMARY KEY (`id`),
  KEY `projectname` (`severity`,`projectname`,`time`)
) ENGINE=INNODB AUTO_INCREMENT=621691 DEFAULT CHARSET=utf8
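
(For illustration of the merged-index question: MySQL uses composite index columns left to right, so an index with the equality-filtered columns (projectname, severity) before the ORDER BY column (time) would let the severity-filtered query both filter and sort from the index. This is a sketch only, and the index name is made up:)

    -- Sketch, not the existing schema: equality columns first,
    -- the ORDER BY column last, so the filesort can be avoided.
    ALTER TABLE log
      ADD INDEX projectname_severity_time_IX
        (projectname, severity, time);
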
  • it has to sort all 620000 records before it can take the top 20. If you have that index on time, maybe you can add a time > _sometime_ condition to the WHERE clause to limit the result set before ordering? Commented Nov 27, 2013 at 12:23
  • Instead of *, use the actual column names that are going to be used ... that will speed up your query... Commented Nov 27, 2013 at 12:24
  • How many rows pass the condition (projectname = 'test' OR projectname IS NULL) ? Commented Nov 27, 2013 at 12:27
  • It appears that a projectname of test or null covers most rows in the table. As it appears to be dealing with about half the rows in the table, I am a bit surprised MySQL has used the index, but that still leaves it sorting 300k records. Can you narrow the rows down further (i.e. maybe by looking at a limited time range)? By the way, when you say you have the indexes, do you have a covering index on both projectname and time, or just individual indexes on each? Not sure if an index on time helps when the original column and the alias you are sorting on share the same name. Commented Nov 27, 2013 at 12:28
  • Using SELECT * probably won't slow things down if you are using all the columns, although it might a bit if you are only using some of them. However, it is generally a bad idea for maintenance reasons (and in this case you have two returned columns that want to have the same name, both called time). Commented Nov 27, 2013 at 12:34

1 Answer


Add an index on (projectname, time):

ALTER TABLE log
  ADD INDEX projectname_time_IX            -- choose a name for the index
    (projectname, time) ;

And then use the original column in the ORDER BY:

SELECT *, UNIX_TIMESTAMP(time) AS unix_time 
FROM log 
WHERE (projectname = 'test' OR projectname IS NULL)  
ORDER BY time DESC 
LIMIT 0, 20 ;

or this variation - to make sure that the index is used effectively:

  ( SELECT *, UNIX_TIMESTAMP(time) AS unix_time 
    FROM log 
    WHERE projectname = 'test'
    ORDER BY time DESC 
    LIMIT 20 
  )
  UNION ALL 
  ( SELECT *, UNIX_TIMESTAMP(time) AS unix_time 
    FROM log 
    WHERE projectname IS NULL
    ORDER BY time DESC 
    LIMIT 20 
  ) 
  ORDER BY time DESC
  LIMIT 20 ;

14 Comments

When I set a single index on both columns (projectname, time), the time was reduced to 1 s.
And with the last version? Still 1 second? That sounds like too much for just 20 rows.
Well, that is strange. I checked the results in two places and yes, your solution significantly improved the query time. The first solution takes about 0.039 s; the second one is better, about 0.023 s. But still, in my browser this AJAX response runs for ~1 s (2.5 s in the beginning).
My second query uses the index for sorting the two internal subqueries, and then the external query has only 40 rows to sort.
Have you added another index on projectname, severity, time, or just amended the existing index? If you have added a new one, I would expect MySQL to choose the better index depending on whether you are checking severity or not. If you have just changed the existing index, then I would think it would help when severity is specified but would render the index useless when it is not.
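(To sketch that suggestion concretely: one way to cover both query shapes is to keep two indexes rather than amend one, so the optimizer has a suitable index whether or not severity is filtered. Index names here are hypothetical.)

    -- Sketch only; names are illustrative.
    ALTER TABLE log
      ADD INDEX projectname_time_IX (projectname, time),
          -- serves: WHERE projectname ... ORDER BY time
      ADD INDEX projectname_severity_time_IX (projectname, severity, time);
          -- serves: WHERE projectname ... AND severity = ? ORDER BY time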
