
Given the following SQL table:

Employee(ssn, name, dept, manager, salary)

You discover that the following query is significantly slower than expected. There is an index on salary, and you have verified that the query plan is using it.

SELECT * 
FROM Employee
WHERE salary = 48000

Please give a possible reason why this query is slower than expected, and provide a tuning solution that addresses that reason.

I have two ideas for why this query is slower than expected. One is that we use SELECT * instead of SELECT Employee.salary, which would slow the query down because every column has to be fetched instead of just one. The other is that the index on salary is non-clustered and we want a clustered index instead: the company could be very large, so it would make sense to organize the table by the salary field.

Would either of those two solutions speed up this query? That is, either change SELECT * to SELECT Employee.salary, or explicitly make the index on salary clustered?

  • Given that the index is being used, my first thought would be contention on the table. Commented Apr 22, 2017 at 20:39
  • What if 48000 is the entry-level salary at some hypothetical company that has a million employees? That could return a staggering number of rows. A "Select universe from..." type query is always concerning. A reasonable LIMIT should always be applied. Commented Apr 22, 2017 at 21:59
  • What @tadman said. What if 99% of one million employees have salary = 48000? Commented Apr 23, 2017 at 9:45
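
The selectivity concern raised in the comments above can be checked directly by counting how many rows match before blaming the index. The following is only a rough sketch: it uses Python's built-in `sqlite3` module as a stand-in for the real database, and the row counts, the index name `idx_salary`, and the sample data are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    "ssn TEXT PRIMARY KEY, name TEXT, dept TEXT, manager TEXT, salary INTEGER)"
)

# Hypothetical data: 990 of 1000 employees earn exactly 48000.
rows = [
    (f"{i:09d}", f"emp{i}", "ops", None, 48000 if i < 990 else 60000 + i)
    for i in range(1000)
]
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?, ?)", rows)
conn.execute("CREATE INDEX idx_salary ON Employee (salary)")

# How selective is the predicate? An index helps little if most rows match.
total = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
matching = conn.execute(
    "SELECT COUNT(*) FROM Employee WHERE salary = 48000"
).fetchone()[0]
print(f"{matching} of {total} rows match ({matching / total:.0%})")

# Capping the result set, as suggested in the comments.
limited = conn.execute(
    "SELECT * FROM Employee WHERE salary = 48000 LIMIT 10"
).fetchall()
```

If nearly every row matches, the index lookup still has to visit nearly every row, so the query cannot be much faster than a full scan regardless of how the index is organized.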

2 Answers

What indexes do you have now?

Is it really "slow"? What evidence do you have?

Comments on "SELECT * instead of SELECT Employee.salary" --

  • * is bad form because tomorrow you might add a column, thereby breaking any code that is expecting a certain number of columns in a certain order.
  • The difference between * and salary does not come into play until after the row(s) have been located.
  • Locating the row(s) is the costly part.
  • On the other hand, if you have INDEX(salary) and only look at salary, then the index is "covering". That means the "data" (the other columns) does not need to be fetched, so the query is faster. But this is probably beyond what your teacher has covered so far.
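
To see the "covering" effect described in the last bullet, one can compare the query plans of the two SELECT forms. This is a minimal sketch, not the MySQL setup from the question: it uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN` as a stand-in for MySQL's `EXPLAIN`, and the index name `idx_salary` and the helper `plan` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    "ssn TEXT PRIMARY KEY, name TEXT, dept TEXT, manager TEXT, salary INTEGER)"
)
conn.execute("CREATE INDEX idx_salary ON Employee (salary)")


def plan(sql):
    """Return the human-readable plan text SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The fourth column of each plan row holds the description.
    return " | ".join(row[3] for row in rows)


# SELECT * needs the other columns, so each index hit is followed by a row fetch.
wide_plan = plan("SELECT * FROM Employee WHERE salary = 48000")

# SELECT salary can be answered from the index alone.
narrow_plan = plan("SELECT salary FROM Employee WHERE salary = 48000")

print(wide_plan)
print(narrow_plan)
```

In SQLite the second plan is reported as using a covering index, while the first uses the same index without the covering label. In MySQL the equivalent signal is "Using index" in the Extra column of EXPLAIN.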

Comments on "the index on salary is non-clustered, and we want to use a clustered index" --

  • In MySQL (not necessarily in other RDBMSs), an InnoDB table has exactly one PRIMARY KEY, and it is always UNIQUE and "clustered".
  • That is, in InnoDB "clustered" implies "unique", which seems inappropriate for "salary".
  • In InnoDB a "secondary key" implicitly includes the column(s) of the PK (ssn?), which it uses to reach over into the data.

"verified that the query plan" -- Have you learned about EXPLAIN SELECT ...?

More Tips on creating the optimal index for a given SELECT.



I will try to keep this as simple as I can.

You cannot simply make salary a clustered index unless you make it unique or the primary key, which makes no sense here because two people can have the same salary.

According to the MySQL documentation, there can be only one clustered index per table. By default, the database uses the primary key as the clustered index.

If you do not define a PRIMARY KEY for your table, MySQL locates the first UNIQUE index where all the key columns are NOT NULL and InnoDB uses it as the clustered index.

To speed up your query I have a few suggestions, all based on secondary indexes.

If you only search for a salary by exact value, a hash-based index is the better option, provided your storage engine supports it (in MySQL, the MEMORY engine does; InnoDB builds B-tree indexes).

If you search using greater than, less than, or some other range, a B-tree index is the better choice.

The first option is faster for exact matches than the second one, but it is limited to the equality operator.
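
The trade-off between the two index types can be illustrated with plain Python data structures. This is only an analogy under stated assumptions, not how MySQL implements its indexes: a `dict` stands in for a hash index, a sorted list searched with `bisect` stands in for a B-tree, and the sample rows and the helper names `equals` and `between` are made up.

```python
import bisect
from collections import defaultdict

# Hypothetical (ssn, salary) rows.
employees = [("111", 48000), ("222", 52000), ("333", 48000), ("444", 61000)]

# Hash-style index: salary -> list of ssn. Supports equality lookups only.
hash_index = defaultdict(list)
for ssn, salary in employees:
    hash_index[salary].append(ssn)

# Ordered index: entries sorted by salary, like the leaf level of a B-tree.
ordered_index = sorted((salary, ssn) for ssn, salary in employees)
ordered_keys = [salary for salary, _ in ordered_index]


def equals(salary):
    """Equality lookup: a single hash probe on average."""
    return hash_index.get(salary, [])


def between(low, high):
    """Range lookup: two binary searches, then a contiguous slice."""
    start = bisect.bisect_left(ordered_keys, low)
    end = bisect.bisect_right(ordered_keys, high)
    return [ssn for _, ssn in ordered_index[start:end]]


print(equals(48000))
print(between(50000, 65000))
```

The hash structure has no notion of order, so it cannot answer `between` without scanning every key, whereas the ordered structure handles both kinds of lookup at a logarithmic cost.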

Hope it helps.
