Suppose I have the following table:

CREATE TABLE `spend_daily_level` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `active` tinyint(1) NOT NULL,
  `created_time` datetime(6) NOT NULL,
  `updated_time` datetime(6) NOT NULL,
  `date` date NOT NULL,
  `system_value` decimal(16,2) NOT NULL,
  `checked_value` decimal(16,2) NOT NULL,
  `account_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `spend_daily_level_date_account_id_f38b1186_uniq` (`date`,`account_id`),
  KEY `spend_daily_level_account_id_f6df4f99_fk_account_id` (`account_id`),
  CONSTRAINT `spend_daily_level_ibfk_1` FOREIGN KEY (`account_id`) REFERENCES `account` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;



+---------------+---------------+------+-----+---------+----------------+
| Field         | Type          | Null | Key | Default | Extra          |
+---------------+---------------+------+-----+---------+----------------+
| id            | int(11)       | NO   | PRI | NULL    | auto_increment |
| active        | tinyint(1)    | NO   |     | NULL    |                |
| created_time  | datetime(6)   | NO   |     | NULL    |                |
| updated_time  | datetime(6)   | NO   |     | NULL    |                |
| date          | date          | NO   | MUL | NULL    |                |
| system_value  | decimal(16,2) | NO   |     | NULL    |                |
| checked_value | decimal(16,2) | NO   |     | NULL    |                |
| account_id    | int(11)       | NO   | MUL | NULL    |                |
+---------------+---------------+------+-----+---------+----------------+


The data in this table is obtained from a third-party API and saved daily. I want to optimize queries on this table, but partitioning does not seem to be supported together with foreign keys.

My question is: how can I optimize in this case, given that the amount of data grows daily?

These are the two main queries I run:

1

SELECT `spend_daily_level`.`account_id`, 
       `spend_daily_level`.`sale_leader_id`, 
       SUM(`spend_daily_level`.`system_value`) AS `sum_value` 
FROM `spend_daily_level` 
WHERE `spend_daily_level`.`active` = True 
  AND EXTRACT(MONTH FROM `spend_daily_level`.`date`) = 7 
  AND `spend_daily_level`.`date` BETWEEN '2020-01-01' AND '2020-12-31'
GROUP BY `spend_daily_level`.`account_id`, 
         `spend_daily_level`.`sale_leader_id`

2

SELECT sale_leader_id, 
       SUM(s.`system_value`) 
FROM `spend_daily_level` s 
WHERE DATE = "2020-05-29"
GROUP BY sale_leader_id  

Thanks

  • Speaking of performance mainly makes sense in the context of the queries you run against your table. What are the queries? Commented Sep 21, 2020 at 8:01
  • @TimBiegeleisen Thanks for your reply, I've added my most-used queries above. Commented Sep 21, 2020 at 8:13
  • The first query may be optimized by adding an extra column holding the month number and creating an index on [active, month, date]. You can even try encoding month and active into a single column for more performance. Commented Sep 21, 2020 at 8:21
  • You can try indexing the active column and changing the date column to timestamp, but it may not make much difference. Commented Sep 21, 2020 at 8:25
  • Query #1. Replace AND EXTRACT(MONTH FROM `spend_daily_level`.`date`) = 7 AND `spend_daily_level`.`date` BETWEEN '2020-01-01' AND '2020-12-31' with AND `spend_daily_level`.`date` BETWEEN '2020-07-01' AND '2020-07-31'. Test an index on (active, date). Query #2. spend_daily_level_date_account_id_f38b1186_uniq is enough. You may also try an index on (date, sale_leader_id). Commented Sep 21, 2020 at 8:34

1 Answer

For the second query you'll want

create index idx1 on spend_daily_level (date, sale_leader_id, system_value);
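
One way to check whether the second query is actually answered from this index is to run it through EXPLAIN. This is only a sketch: it assumes that `sale_leader_id` exists on the table, as the queries imply, although it does not appear in the posted CREATE TABLE. If the index covers the query, the `key` column should show idx1 and the Extra column should mention "Using index".

```sql
-- Assumes idx1 has been created and that sale_leader_id is a real column.
EXPLAIN
SELECT sale_leader_id,
       SUM(system_value) AS sum_value
FROM spend_daily_level
WHERE `date` = '2020-05-29'
GROUP BY sale_leader_id;
```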

The first query could benefit from an index starting with active or date, whichever is more selective. I would simply provide two indexes and see which one is used and which one is not. (The query would of course benefit much more from an index on date if it were written WHERE date BETWEEN DATE '2020-07-01' AND DATE '2020-07-31' or WHERE date >= DATE '2020-07-01' AND date < DATE '2020-08-01'.)

create index idx2 on spend_daily_level (date, active, sale_leader_id, account_id, system_value);
create index idx3 on spend_daily_level (active, date, sale_leader_id, account_id, system_value);
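
To see which of the two indexes the optimizer picks, the first query can be rewritten with a plain date range and passed to EXPLAIN. This is a sketch under the same assumption as the queries in the question, namely that `sale_leader_id` is a column of the table. The half-open range replaces both the EXTRACT(MONTH ...) condition and the year-wide BETWEEN.

```sql
-- Assumes idx2 and idx3 exist; the `key` column of the output names the index chosen.
EXPLAIN
SELECT account_id,
       sale_leader_id,
       SUM(system_value) AS sum_value
FROM spend_daily_level
WHERE active = TRUE
  AND `date` >= DATE '2020-07-01'
  AND `date` <  DATE '2020-08-01'
GROUP BY account_id, sale_leader_id;

-- Once it is clear which index is never chosen, it can be dropped, for example:
-- DROP INDEX idx3 ON spend_daily_level;
```

The optimizer's choice depends on the data distribution, so it is worth rerunning this on realistic data volumes before dropping either index.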

Partitions are rarely needed. You'll want them when the table has billions of rows and your queries work almost exclusively within one partition. I don't think you need them for your database.
