I am trying to run a query on my MySQL database which is taking 70+ seconds to run, and I'm scratching my head as to why the index isn't being used.

Here's the query:

SELECT PriceId, InstrumentId, Date, Open, High, Low, Close, Volume, UnadjustedClose
FROM price
ORDER BY InstrumentId, Date DESC

The price table has an index with InstrumentId, Date (amongst other indexes). The table itself has 80 million rows, and is made up of 2 ints, a date, a long and 5 decimals.

The explain command has type ALL, Null for possible keys, key and ref, and tells me the system is using filesort.

Is this the best I can get from the system? I expected the index to be used to make the sort faster.

Added:

Here's the table definition:

PriceId int PK, NN, AI
InstrumentId int NN
Date Date NN
Open Decimal(12,4)
High Decimal(12,4)
Low Decimal(12,4)
Close Decimal(12,4)
UnadjustedClose Decimal(12,4)
Volume BigInt

Indexes:

Primary -> PriceId
IX_InstrumentId -> InstrumentId
IX_Date -> Date
IX_InstrumentDate -> InstrumentId, Date

Explain output is:

id: 1
select_type: Simple
table: price
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 77926335
Extra: using filesort
  • You have 80 million rows, so that's normal. Your bandwidth can have an impact too, as can the number of cores and the processor in the datacenter. Commented Jun 6, 2014 at 8:05
  • Can you post the table definition and EXPLAIN output? Commented Jun 6, 2014 at 8:22
  • Do you really need to fetch all 80 million rows at once? Commented Jun 6, 2014 at 13:34

2 Answers

The optimizer will not use the index because you are retrieving all rows and the index does not contain all the columns you are selecting. In other words, the index is not a covering index.

When you retrieve everything, it is usually less efficient to walk the index and look up each record to fetch the additional columns than to scan the whole table.
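
A quick way to test this reasoning is to force the index and compare the plans. This is only a sketch: it assumes the index name IX_InstrumentDate from the question, and the second statement sorts both columns ascending on purpose, because MySQL versions before 8.0 cannot resolve an ORDER BY that mixes ASC and DESC from an index.

```sql
-- Plan with the optimizer's own choice
-- (the question reports type: ALL plus a filesort for this one).
EXPLAIN
SELECT PriceId, InstrumentId, Date, Open, High, Low, Close, Volume, UnadjustedClose
FROM price
ORDER BY InstrumentId, Date DESC;

-- Plan with the index forced and a uniform sort direction,
-- so the only remaining cost is the per-row lookup of the non-indexed columns.
EXPLAIN
SELECT PriceId, InstrumentId, Date, Open, High, Low, Close, Volume, UnadjustedClose
FROM price FORCE INDEX (IX_InstrumentDate)
ORDER BY InstrumentId, Date;
```

If the second plan reads the index without a filesort but the query is still slow in practice, that would be consistent with the row lookups being the expensive part.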

You have some options:

  • Include all the necessary columns in your index: this requires more space and slows down the write operations.
  • Add a filter to the query based on the first column in the index. If the filter is selective enough (shrinks the required amount of rows to a reasonable level), the server will use your index.
  • Filter your data to a reasonable size
  • Do the sorting in the application
  • Modify the primary key (the clustering key) to (InstrumentID ASC, Date DESC)
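
The first two options could look roughly like this. It is a sketch, not a tuned recommendation: the index name IX_Price_Covering and the value 42 are made up for illustration, and it assumes the table uses InnoDB, where every secondary index already carries the primary key (PriceId) implicitly.

```sql
-- Option 1: a covering index (hypothetical name).
-- Every selected column is in the index, so no row lookups are needed.
-- On 80 million rows, expect this to take a long time to build and a lot of disk space.
ALTER TABLE price
  ADD INDEX IX_Price_Covering
    (`InstrumentId`, `Date`, `Open`, `High`, `Low`, `Close`, `Volume`, `UnadjustedClose`);

-- Option 2: filter on the leading index column.
-- With InstrumentId fixed, the sort reduces to Date DESC, which an ascending
-- index on (InstrumentId, Date) can serve by being read backwards.
SELECT PriceId, InstrumentId, Date, Open, High, Low, Close, Volume, UnadjustedClose
FROM price
WHERE InstrumentId = 42  -- illustrative value
ORDER BY Date DESC;
```

Running EXPLAIN on the second query should show whether the optimizer picks IX_InstrumentDate for your data.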

EDIT: More about the last option

Your table looks like a log table. In log tables it is often considered good practice to add a unique integer ID to each record to eliminate duplicates (although in most cases it is not actually good practice), yet that ID is rarely used in queries. In MySQL the primary key is also the clustering key, which means the data is stored on disk in that order (more or less, ignoring fragmentation for now).

In log tables it is a good idea to use the logged entity's ID and a timestamp (InstrumentID and Date in your case) as the clustered index (the primary key in MySQL). When you do this, the physical order of your data matches the common business queries, so those queries perform better.

If the combination of InstrumentID and Date is unique (I think it should be: an instrument cannot have multiple prices at the same time, and it is really rare for a price to change in less than a second), a composite primary key could be better. It also gives you a better option for partitioning the table than the auto-generated integer values do.

Side note: you can change the order of the columns in the PK if you are filtering or sorting by date more frequently than you do by the instrument ID.
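
As a rough sketch of that change, assuming InnoDB and assuming you want to keep PriceId as an AUTO_INCREMENT column (the key name UX_PriceId is hypothetical): check for duplicates first, then swap the keys in a single ALTER so that the auto-increment column is never left without an index. Note that on MySQL versions before 8.0 a DESC in a key definition is parsed but ignored, so the key below is declared without it.

```sql
-- Step 1: confirm that (InstrumentId, Date) really is unique.
-- Any row returned here would make the new primary key fail.
SELECT InstrumentId, `Date`, COUNT(*) AS n
FROM price
GROUP BY InstrumentId, `Date`
HAVING n > 1
LIMIT 10;

-- Step 2: recluster the table on (InstrumentId, Date).
-- This rebuilds the whole table, so plan for downtime or a long-running operation.
ALTER TABLE price
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (`InstrumentId`, `Date`),
  ADD UNIQUE KEY UX_PriceId (`PriceId`);  -- AUTO_INCREMENT columns must stay indexed
```

After this, IX_InstrumentDate and IX_InstrumentId duplicate the leading columns of the primary key and are candidates for dropping, but check your other queries before removing them.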

END OF EDIT

Some questions you should answer in order to find a better way to achieve your goal:

  • Why do you need to retrieve all the 80M records from the table?
  • Does your application really use all of them?
  • If yes, is it possible to do the sorting at the application level instead of the database level?
  • Does the order of the records really matter?

1 Comment

Thanks - I think you've hit the nail on the head that my primary key is wrong. I'll make the change and have another go.
You can't speed it up because of the large number of rows. Create a materialized view from this query; once it is created, access will be faster.

MySQL doesn't support materialized views, so you have to implement one yourself using the tutorial here.