4

I'm trying to optimize this slow query (it takes more than 2 seconds):

SELECT COUNT(*)
FROM crmentity c, mdcalls_trans_activity_update mtu, mdcalls_trans mt
WHERE (mtu.dept = 'GUN' OR  mtu.dept = 'gun') AND
      mtu.trans_code = mt.trans_code AND
      mt.activityid = c.crmid AND
      MONTH(mtu.ts) = 2 AND
      YEAR(mtu.ts) = YEAR(NOW()) AND
      c.deleted = 0 AND
      c.smownerid = 28

This is the output when I use EXPLAIN:

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|-------------|-------|------|---------------|-----|---------|-----|------|-------|
| 1 | SIMPLE | c | index_merge | PRIMARY, crmentity_smownerid_idx, crmentity_deleted_smownerid_idx, crmentity_smownerid_deleted_idx | crmentity_smownerid_idx, crmentity_deleted_smownerid_idx | 4,8 | NULL | 91 | Using intersect(crmentity_smownerid_idx,crmentity_deleted_smownerid_idx); Using where; Using index |
| 1 | SIMPLE | mt | ref | activityid | activityid | 4 | pharex.c.crmid | 60 | |
| 1 | SIMPLE | mtu | ref | dept_idx | dept_idx | 5 | const | 1530 | Using where |

It's using the index I created (dept_idx) but it still takes more than 2 seconds to run the query against a dataset of 1,380,384 records. Is there another way of expressing this query in an optimal fashion?

UPDATE: Using the suggestions of David, the query is now down to a few milliseconds instead of it running more than 2 seconds (actually, 51 seconds on version 5.0 of MySQL).

4
  • I would write WHERE lower(mtu.dept) = 'gun' AND ... but I assume your DB will already optimize it in that way. Commented Feb 15, 2010 at 8:59
  • I found, in Oracle at least, using lower on the lhs of the query caused massive slowdown. Whether it causes more slowdown than an additional string compare... Commented Feb 15, 2010 at 9:02
  • Using lower() on a column is a good way to NOT use any indices. That would explain your slowdowns. Commented Feb 15, 2010 at 9:10
  • Graham, David, you're so right, of course. I won't delete my comment, so that the anti-pattern stays visible ;-) Commented Feb 15, 2010 at 10:17

5 Answers

6

What is the most selective part of the WHERE clause? That is, which condition removes the most potential items from the result set?

I'd guess it's the mtu.ts filter. If that's true, you should also index the mtu.ts column and try to constrain on this in a way that the index can be used; for example by using the BETWEEN operator.
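A minimal sketch of that suggestion, assuming mtu.ts is a DATETIME or TIMESTAMP column. The index name ts_idx is made up for illustration, and the year is hard-coded to 2010 only as an example:

```sql
-- Hypothetical index name; assumes ts is DATETIME or TIMESTAMP.
CREATE INDEX ts_idx ON mdcalls_trans_activity_update (ts);

-- Not index-friendly: the column is wrapped in functions,
-- so the functions have to be evaluated row by row.
--   WHERE MONTH(mtu.ts) = 2 AND YEAR(mtu.ts) = YEAR(NOW())

-- Index-friendly: the bare column is compared against constants.
-- BETWEEN is inclusive on both ends, so the upper bound is the
-- last second of the month.
SELECT COUNT(*)
FROM mdcalls_trans_activity_update mtu
WHERE mtu.ts BETWEEN '2010-02-01 00:00:00' AND '2010-02-28 23:59:59';
```

Running EXPLAIN on the second form should show whether the optimizer actually picks the new index; whether it does depends on how selective the date range is for your data.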

Other tips:

  • Attach join conditions directly to the join with JOIN ... ON (). This makes the query much easier to read, both for humans and for the optimizer.
  • Avoid calculating constants in the query, like YEAR(NOW()).
  • Avoid applying functions to columns in the WHERE clause, like MONTH(mtu.ts). This severely limits the optimizer's ability to use indexes.
  • Normalize your data to avoid casing problems like mtu.dept = 'GUN' OR mtu.dept = 'gun'; a single UPDATE mtu SET dept = lower(dept) and an appropriate CHECK dept = lower(dept) on the table will help avoid such madness.
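The normalization tip could look roughly like this. It is a sketch that assumes dept uses a case-sensitive collation (which the original OR condition implies); the constraint name dept_lowercase_chk is invented for illustration:

```sql
-- One-off cleanup: store every department code in lower case.
UPDATE mdcalls_trans_activity_update
SET dept = LOWER(dept);

-- Guard against mixed case creeping back in.
-- Caveat: MySQL only enforces CHECK constraints from 8.0.16 on;
-- older versions parse the clause but ignore it.
ALTER TABLE mdcalls_trans_activity_update
  ADD CONSTRAINT dept_lowercase_chk CHECK (dept = LOWER(dept));

-- The filter then shrinks to a single equality test:
--   WHERE mtu.dept = 'gun'
```

On a MySQL version that ignores CHECK, the same rule would have to be enforced elsewhere, for example in the application code that writes to the table.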

2
  1. I would rewrite the query using explicit joins. It is clearer and gives the optimizer a better chance.
  2. Instead of MONTH(mtu.ts) = 2 AND YEAR(mtu.ts) = YEAR(NOW()), it is better to use mtu.ts between .. and ..

3 Comments

How would you rewrite this? Thanks again.
select count(*) from crmentity c inner join mdcalls_trans mt on mt.activityid = c.crmid inner join mdcalls_trans_activity_update mtu on mtu.trans_code = mt.trans_code where mtu.ts between '20100201' and '20100228' and mtu.dept in ('GUN', 'gun') and c.deleted = 0 and c.smownerid = 28
Thanks for this example. I created a function in PHP to get the starting date of the month and the end date of the month and used it in the 'BETWEEN' statement.
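One thing to watch with computed month boundaries: if ts carries a time of day, an upper bound of '20100228' is read as midnight at the start of that day, so rows from later on the 28th would be missed. A hedged variant of the query above uses a half-open range instead; the two date literals stand in for whatever the application code computes:

```sql
SELECT COUNT(*)
FROM crmentity c
INNER JOIN mdcalls_trans mt
        ON mt.activityid = c.crmid
INNER JOIN mdcalls_trans_activity_update mtu
        ON mtu.trans_code = mt.trans_code
WHERE mtu.ts >= '2010-02-01'   -- first day of the month (inclusive)
  AND mtu.ts <  '2010-03-01'   -- first day of the next month (exclusive)
  AND mtu.dept IN ('GUN', 'gun')
  AND c.deleted = 0
  AND c.smownerid = 28;
```

With this form the application only needs the first day of the month and the first day of the following month, so month lengths and leap years need no special handling.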
0

Could you change the text string (mtu.dept) to a number?

0

The most obvious solution I can see would be to change COUNT(*) to count just a single field; otherwise your index might be next to useless!

0

As a general principle, a good approach to analysing problems like this is to understand the data you're matching on and to appreciate its cardinality.

That is to say, order your query so that the most selective things happen first. Which is more likely in your data: that dept = 'GUN', or that the user ID is 28?
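One rough way to compare the selectivity of the individual conditions is to count what each of them keeps on its own. This is only a sketch using the table and column names from the question:

```sql
-- Rows kept by the department filter alone.
SELECT COUNT(*)
FROM mdcalls_trans_activity_update
WHERE dept IN ('GUN', 'gun');

-- Rows kept by the owner/deleted filter alone.
SELECT COUNT(*)
FROM crmentity
WHERE deleted = 0 AND smownerid = 28;

-- Per-index cardinality estimates as MySQL sees them
-- (see the Cardinality column of the output).
SHOW INDEX FROM mdcalls_trans_activity_update;
```

The condition that leaves the fewest rows is the one most worth supporting with an index.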

Lastly, have you considered joining to MT and MTU instead of filtering? It might make your query a lot faster, as you'll be limiting the amount of data that needs the date comparisons.

1 Comment

Posted too fast, basically what David Schmitt and Burnall are saying!
