
I have a query:

SELECT * FROM `trades`
WHERE `symbol` = 'ICX/BTC' AND `timestamp` >= :since AND `timestamp` <= :until
ORDER BY `timestamp`
LIMIT 50000

It takes a long time to execute (more than 5 minutes). I have an index on the symbol and timestamp columns.

How can I optimize this query?

  • Do you need all the columns of trades? One way to cut execution time is to select only the columns you need. In other words, replace SELECT * with an explicit column list. Commented May 6, 2018 at 15:11
  • ORDER BY is really the thing slowing it down; ordering all 50K rows is the bulk of the work in the query. Commented May 6, 2018 at 15:18
  • Put it in a (cachable) View. Commented May 6, 2018 at 15:26
  • Please post the table definition (SHOW CREATE TABLE trades) and the EXPLAIN result. This is the minimum of information for SQL performance questions. Commented May 6, 2018 at 16:04

2 Answers


For this query:

SELECT t.*
FROM trades AS t
WHERE t.symbol = 'ICX/BTC' AND t.timestamp >= :since AND t.timestamp <= :until
ORDER BY t.timestamp
LIMIT 50000;

(which I just rewrote a bit so I can follow it more easily)

You want an index on trades(symbol, timestamp).

However, you appear to be selecting a very large number of rows, so this might still take a long time. The index should be used both for the WHERE clause and the ORDER BY.
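As a rough sketch of the suggestion above, the composite index could be created and then checked with EXPLAIN. The index name idx_symbol_timestamp and the @since / @until values are made up for illustration, and the literal format assumes timestamp is a DATETIME or TIMESTAMP column, which the question does not confirm.

```sql
-- Hypothetical index name. The equality column (symbol) comes first,
-- the range and sort column (timestamp) second.
CREATE INDEX idx_symbol_timestamp ON trades (symbol, `timestamp`);

-- Example bounds only; adjust them to the column's actual type.
SET @since = '2018-05-01 00:00:00';
SET @until = '2018-05-06 00:00:00';

-- Ask the optimizer how it would run the query from the question.
EXPLAIN
SELECT t.*
FROM trades AS t
WHERE t.symbol = 'ICX/BTC'
  AND t.`timestamp` >= @since
  AND t.`timestamp` <= @until
ORDER BY t.`timestamp`
LIMIT 50000;
```

If the key column of the EXPLAIN output names the composite index and the Extra column does not mention "Using filesort", the index is likely serving both the WHERE clause and the ORDER BY.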


8 Comments

Is the query in this response different from the one in the question? Just asking out of curiosity because they look the same to me except yours is using the abbreviated t ? Also where is t defined in your query? thx
There are over 40 million rows in the table. Yes, I tried using symbol in the ORDER BY clause.
Yes, the query with the t table alias executes the same as yours. But you should make a habit of using table aliases or fully qualified table names in MySQL, so that MySQL can never mix up the columns if you use correlated subqueries or inner queries. @Dan
But why are you editing this answer to "fix" it? Seems a bit liberal as an edit, should be up to OP to make that type of edit. Anyway I was mostly just asking initially because I didn't understand the answer fully, trying to get clarification to learn.
For what it's worth, Gordon Linoff and Raymond Nijland are both long-time participants here. I think it's good they have each other's backs.

In your query, you are retrieving data from just one table and your filter criteria are ...

  1. equality on symbol

  2. range scan low-to-high on timestamp.

Therefore, (as Gordon mentioned) an index on two columns (symbol, timestamp) can satisfy your query, both the filtering and the ordering, quite efficiently. The query planner will do a random access operation on the index to the correct symbol and the starting timestamp, then read the index sequentially until the ending timestamp. That's efficient.

But, your SELECT * may be holding you back on performance. Why? If you used, for example, SELECT symbol, timestamp, cusip, name then you could create a so-called covering index on (symbol, timestamp, cusip, name). In that case, the entire query would be satisfied by scanning the index. That can be very efficient indeed.
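A minimal sketch of that covering-index idea follows. The columns cusip and name are only the example columns from the paragraph above and may not exist in the real trades table, and the index name idx_trades_covering is hypothetical. The :since and :until placeholders are the ones from the question and are assumed to be bound by the client library.

```sql
-- Hypothetical covering index: filter and sort columns first,
-- then the remaining columns the query selects.
CREATE INDEX idx_trades_covering
    ON trades (symbol, `timestamp`, cusip, name);

-- Every selected column is in the index, so MySQL can answer the
-- query from the index without reading the table rows.
SELECT symbol, `timestamp`, cusip, name
FROM trades
WHERE symbol = 'ICX/BTC'
  AND `timestamp` >= :since
  AND `timestamp` <= :until
ORDER BY `timestamp`
LIMIT 50000;
```

Running EXPLAIN on the SELECT should show "Using index" in the Extra column when the index covers the query.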

Pro tip: Avoid SELECT *, both for software stability and performance reasons.

Pro tip: Don't add extra indexes to a table unless you know they will help particular queries. MySQL usually uses a single index for each table in a query or subquery (the index merge optimization is an exception). Neither an index on just timestamp nor one on just symbol will help much: MySQL still has to examine a lot of rows to satisfy your filtering criteria.
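To see which indexes the table already has before adding or removing any, something like the following could be used. The index name idx_symbol in the commented-out statement is purely hypothetical; take the real names from the SHOW INDEX output.

```sql
-- List the existing indexes on trades, one row per indexed column.
SHOW INDEX FROM trades;

-- If a single-column index turns out to be redundant, it could be
-- dropped like this (hypothetical name, left commented out on purpose):
-- ALTER TABLE trades DROP INDEX idx_symbol;
```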

2 Comments

And, a 50K row result set will definitely consume a lot of resources, both on the DBMS and your client program. Can you cut it down to a smaller number?
I tried limits of 5K, 1K, and 500, with no change.
