
I've noticed a serious problem recently, now that my database has grown to over 620000 records. The following query:

SELECT *,UNIX_TIMESTAMP(`time`) AS `time` FROM `log` WHERE (`projectname`="test" OR `projectname` IS NULL)  ORDER BY `time` DESC LIMIT 0, 20

has an execution time of about 2.5 s on a local database. How can I speed it up?

The EXPLAIN command produces the following output:

ID: 1
select type: SIMPLE
TABLE: log
type: ref_or_null
possible_keys: projectname
key: projectname
key_len: 387
ref: const
rows: 310661
Extra: Using where; Using filesort

I've got indexes set on the projectname and time columns.

Any help?

EDIT: Thanks to ypercube's response, I was able to decrease the query execution time. But as soon as I add another condition to the WHERE clause (AND severity="changes"), it takes 2 s again. Is it a good solution to include all of the possible WHERE columns in my merged index?

ID: 1 
select type: SIMPLE 
TABLE: log 
type: ref_or_null 
possible_keys: projectname 
key: projectname 
key_len: 419 
ref: const, const 
rows: 315554 
Extra: Using where; Using filesort

Table structure:

CREATE TABLE `log` (
  `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `projectname` VARCHAR(128) DEFAULT NULL,
  `time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `master` VARCHAR(128) NOT NULL,
  `itemName` VARCHAR(128) NOT NULL,
  `severity` VARCHAR(10) NOT NULL DEFAULT 'info',
  `message` VARCHAR(255) NOT NULL,
  `more` TEXT NOT NULL,
  PRIMARY KEY (`id`),
  KEY `projectname` (`severity`,`projectname`,`time`)
) ENGINE=INNODB AUTO_INCREMENT=621691 DEFAULT CHARSET=utf8
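
(For illustration of the merged-index question: MySQL uses composite index columns left to right, so an index with the equality-filtered columns (projectname, severity) before the ORDER BY column (time) would let the severity-filtered query both filter and sort from the index. This is a sketch only, and the index name is made up:)

    -- Sketch, not the existing schema: equality columns first,
    -- the ORDER BY column last, so the filesort can be avoided.
    ALTER TABLE log
      ADD INDEX projectname_severity_time_IX
        (projectname, severity, time);
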
  • it has to sort all 620000 records before it can take the top 20. If you have that index on time, maybe you can add a time > _sometime_ condition to the WHERE clause to limit the result set before ordering? Commented Nov 27, 2013 at 12:23
  • Instead of *, use the actual column names that are going to be used ... that will speed up your query... Commented Nov 27, 2013 at 12:24
  • How many rows pass the condition (projectname = 'test' OR projectname IS NULL) ? Commented Nov 27, 2013 at 12:27
  • It appears that a projectname of test or null covers most rows in the table. As it appears to be dealing with about half the rows in the table, I am a bit surprised MySQL has used the index, but that still leaves it sorting 300k records. Can you narrow the rows down further (i.e. maybe by looking at a limited time range)? By the way, when you say you have the indexes, do you have a covering index on both projectname and time, or just individual indexes on each? Not sure if an index on time helps when the original column and the alias you are sorting on share the same name. Commented Nov 27, 2013 at 12:28
  • Using SELECT * probably won't slow things down if you are using all the columns, although it might a bit if you are only using some of them. However, it is generally a bad idea for maintenance reasons (and in this case you have two returned columns that want to have the same name, both called time). Commented Nov 27, 2013 at 12:34

1 Answer


Add an index on (projectname, time):

ALTER TABLE log
  ADD INDEX projectname_time_IX            -- choose a name for the index
    (projectname, time) ;

And then use the original column in the ORDER BY:

SELECT *, UNIX_TIMESTAMP(time) AS unix_time 
FROM log 
WHERE (projectname = 'test' OR projectname IS NULL)  
ORDER BY time DESC 
LIMIT 0, 20 ;

or this variation - to make sure that the index is used effectively:

  ( SELECT *, UNIX_TIMESTAMP(time) AS unix_time 
    FROM log 
    WHERE projectname = 'test'
    ORDER BY time DESC 
    LIMIT 20 
  )
  UNION ALL 
  ( SELECT *, UNIX_TIMESTAMP(time) AS unix_time 
    FROM log 
    WHERE projectname IS NULL
    ORDER BY time DESC 
    LIMIT 20 
  ) 
  ORDER BY time DESC
  LIMIT 20 ;

14 Comments

When I set a single index on both columns (projectname, time), the time was reduced to 1 s.
And with the last version? Still 1 second? That sounds like too much for just 20 rows.
Well, that is strange. I checked the results in two places and yes, your solution significantly improved the query time. The first solution takes about 0.039 s; the second one is better, about 0.023 s. But still, in my browser this AJAX response runs for ~1 s (2.5 s in the beginning).
My second query uses the index for sorting the two internal subqueries, and then the external query has only 40 rows to sort.
Have you added another index on projectname, severity, time, or just amended the existing index? If you have added a new one, I would expect MySQL to choose the better index depending on whether you are checking severity or not. If you have just changed the existing index, then I would think it would help when severity is specified but would render the index useless when it is not.
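(To sketch that suggestion concretely: one way to cover both query shapes is to keep two indexes rather than amend one, so the optimizer has a suitable index whether or not severity is filtered. Index names here are hypothetical.)

    -- Sketch only; names are illustrative.
    ALTER TABLE log
      ADD INDEX projectname_time_IX (projectname, time),
          -- serves: WHERE projectname ... ORDER BY time
      ADD INDEX projectname_severity_time_IX (projectname, severity, time);
          -- serves: WHERE projectname ... AND severity = ? ORDER BY time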
