0

I have a table with close to 7 million rows. Here is the table structure:

CREATE TABLE `ERS_SALES_TRANSACTIONS` (
  `saleId` int(12) NOT NULL AUTO_INCREMENT,
  `ERS_COMPANY_CODE` int(3) DEFAULT NULL,
  `SALE_SECTION` varchar(128) DEFAULT NULL,
  `SALE_DATE` date DEFAULT NULL,
  `SALE_STOCKAGE_EXACT` int(4) DEFAULT NULL,
  `SALE_NET_AMOUNT` decimal(11,2) DEFAULT NULL, 
  `SALE_ABSOLUTE_CDATE` date DEFAULT NULL,
  PRIMARY KEY (`saleId`),
  KEY `index_location` (`ERS_COMPANY_CODE`),
  KEY `idx-erscode-salesec` (`SALE_SECTION`,`ERS_COMPANY_CODE`) USING BTREE,
  KEY `idx-saledate-section` (`SALE_DATE`,`SALE_SECTION`) USING BTREE
  KEY `idx_quick_sales_transactions` (`ERS_COMPANY_CODE`,`SALE_SECTION`,`SALE_DATE`,`SALE_STOCKAGE_EXACT`,`SALE_NET_AMOUNT`)
) ENGINE=InnoDB;

This query takes more than 7 seconds to execute. Is there any way to speed it up?

SELECT 
   A.SALE_SECTION,  
   SUM(IF(A.SALE_DATE BETWEEN '2016-01-16' AND '2016-04-30'
          AND A.SALE_STOCKAGE_EXACT BETWEEN 0 AND 90, A.SALE_NET_AMOUNT, 0)) AS fs1_pd1_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-01-16' AND '2016-04-30'
          AND A.SALE_STOCKAGE_EXACT BETWEEN 91 AND 180, A.SALE_NET_AMOUNT, 0)) AS fs2_pd1_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-01-16' AND '2016-04-30'
          AND A.SALE_STOCKAGE_EXACT BETWEEN 181 AND 365, A.SALE_NET_AMOUNT, 0)) AS os1_pd1_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-01-16' AND '2016-04-30'
          AND A.SALE_STOCKAGE_EXACT BETWEEN 366 AND 9999, A.SALE_NET_AMOUNT, 0)) AS os2_pd1_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-01-16' AND '2016-04-30', A.SALE_NET_AMOUNT, 0)) AS TOTAL_PD1_SALE,
   SUM(IF(A.SALE_DATE BETWEEN '2016-04-01' AND '2016-04-30'
          AND A.SALE_STOCKAGE_EXACT BETWEEN 0 AND 90, A.SALE_NET_AMOUNT, 0)) AS fs1_pd2_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-04-01' AND '2016-04-30'
          AND A.SALE_STOCKAGE_EXACT BETWEEN 91 AND 180, A.SALE_NET_AMOUNT, 0)) AS fs2_pd2_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-04-01' AND '2016-04-30'
          AND A.SALE_STOCKAGE_EXACT BETWEEN 181 AND 365, A.SALE_NET_AMOUNT, 0)) AS os1_pd2_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-04-01' AND '2016-04-30'
          AND A.SALE_STOCKAGE_EXACT BETWEEN 366 AND 9999, A.SALE_NET_AMOUNT, 0)) AS os2_pd2_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-04-01' AND '2016-04-30', A.SALE_NET_AMOUNT, 0)) AS TOTAL_PD2_SALE,
   SUM(IF(A.SALE_DATE BETWEEN '2016-05-01' AND '2016-05-31'
          AND A.SALE_ABSOLUTE_CDATE BETWEEN '2016-03-01' AND '2016-05-31', A.SALE_NET_AMOUNT, 0)) AS fs1_achived_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-05-01' AND '2016-05-31'
          AND A.SALE_ABSOLUTE_CDATE BETWEEN '2015-12-01' AND '2016-02-29', A.SALE_NET_AMOUNT, 0)) AS fs2_achived_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-05-01' AND '2016-05-31'
          AND A.SALE_ABSOLUTE_CDATE BETWEEN '2015-06-01' AND '2015-11-30', A.SALE_NET_AMOUNT, 0)) AS os1_achived_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-05-01' AND '2016-05-31'
          AND A.SALE_ABSOLUTE_CDATE BETWEEN '2006-12-26' AND '2015-05-31', A.SALE_NET_AMOUNT, 0)) AS os2_achived_sale,
   SUM(IF(A.SALE_DATE BETWEEN '2016-05-01' AND '2016-05-31', A.SALE_NET_AMOUNT, 0)) AS Total_ACHIVED_SALE
   FROM ERS_SALES_TRANSACTIONS A
   WHERE A.ERS_COMPANY_CODE = 48
   GROUP BY A.SALE_SECTION

Here is the EXPLAIN output for the query:

{
"data":
[
    {
        "id": 1,
        "select_type": "SIMPLE",
        "table": "A",
        "type": "ref",
        "possible_keys": "index_location,idx-erscode-salesec,idx-saledate-section",
        "key": "index_location",
        "key_len": "5",
        "ref": "const",
        "rows": 1411944,
        "Extra": "Using where; Using temporary; Using filesort"
    }
]
}

After adding the composite index, the time decreased to 4.03 seconds. Here is the plan:

{
"data":
[
    {
        "id": 1,
        "select_type": "SIMPLE",
        "table": "A",
        "type": "ref",
        "possible_keys": "index_location,idx-erscode-salesec,idx-saledate-section,idx_quick_sales_transactions",
        "key_len": "5",
        "key": "idx_quick_sales_transactions",
        "ref": "const",
        "rows": 1306058,
        "Extra": "Using where"
    }
]

}

2
  • Try to get rid of all those SUM(IF(...)) expressions; try (outer) self joins instead. Commented May 10, 2016 at 10:12
  • The key index_location is the index on ERS_COMPANY_CODE. Commented May 10, 2016 at 10:20

3 Answers

2

I don't know whether this can be sped up much, but you can try an index. I would recommend one on ERS_SALES_TRANSACTIONS(ERS_COMPANY_CODE, SALE_SECTION, SALE_DATE, SALE_NET_AMOUNT).

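The aim is a covering index for the query, meaning that all columns used by the query are in the index, so the database engine does not need to access the original data pages. Note that the query also reads SALE_STOCKAGE_EXACT and SALE_ABSOLUTE_CDATE, so those columns would have to be added for the index to cover it completely.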

However, performance still depends on the number of rows that match the company code and, in particular, on the cost of the filesort used for the aggregation.
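As a minimal sketch, the index suggested above could be created and checked like this. The index name `idx_company_section_date_amount` is a placeholder that does not appear in the original post, and the EXPLAIN runs on a reduced version of the query that touches only the four indexed columns. If the `Extra` column of the plan contains `Using index`, MySQL is answering from the index alone; the optimizer may still pick a different index.

```sql
-- Hypothetical index name; the column list is the one suggested above.
CREATE INDEX idx_company_section_date_amount
    ON ERS_SALES_TRANSACTIONS (ERS_COMPANY_CODE, SALE_SECTION, SALE_DATE, SALE_NET_AMOUNT);

-- Reduced query that reads only columns contained in the index.
EXPLAIN
SELECT A.SALE_SECTION,
       SUM(IF(A.SALE_DATE BETWEEN '2016-05-01' AND '2016-05-31',
              A.SALE_NET_AMOUNT, 0)) AS Total_ACHIVED_SALE
FROM ERS_SALES_TRANSACTIONS A
WHERE A.ERS_COMPANY_CODE = 48
GROUP BY A.SALE_SECTION;
```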


4 Comments

Okay, after adding the composite index the query time decreased to 4.03 sec, but that's still a lot.
Do you have a table with one row per ERS_COMPANY_CODE and SALE_SECTION?
How did you figure out the column order for the composite index? I need to do this for other queries too.
WHERE A.ERS_COMPANY_CODE = 48 GROUP BY A.SALE_SECTION -- do the WHERE = column(s) first, then range(s), then GROUP BY. See cookbook.
1

I don't agree with Jimmy B here. Your query looks perfect in my opinion.

Depending on how many records there are for company 48, either the full table should be read sequentially (when there are many, say, 50% of all table records) or an index on ERS_COMPANY_CODE should be used (when there are not that many, say, only 1% of all records).

As the DBMS decided to use the index on ERS_COMPANY_CODE, the latter should be the case.
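To check which of the two cases applies, one could compare the row count for the company with the size of the table. This is only a diagnostic sketch, and it scans the table (or an index) itself, so it is meant to be run once rather than repeatedly. For reference, the EXPLAIN output in the question estimates about 1.4 million rows for company 48 out of close to 7 million, roughly 20%, though those figures are optimizer estimates.

```sql
-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the matching rows.
SELECT COUNT(*)                              AS total_rows,
       SUM(ERS_COMPANY_CODE = 48)            AS company_rows,
       SUM(ERS_COMPANY_CODE = 48) / COUNT(*) AS company_share
FROM ERS_SALES_TRANSACTIONS;
```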

You can try to speed up the query further by creating a composite index. Make it at least (ERS_COMPANY_CODE, SALE_SECTION), to make the GROUP BY quicker. Even better, add all the columns the query uses, so that all data can be gathered from the index and the table itself no longer has to be accessed.

CREATE INDEX idx_quick_sales_transactions ON ERS_SALES_TRANSACTIONS
  (ERS_COMPANY_CODE, SALE_SECTION, SALE_DATE, SALE_STOCKAGE_EXACT, SALE_NET_AMOUNT);

8 Comments

How do I figure out the column order for a composite index? I need to do this for other queries too.
I always use columns in the order of 1. columns in WHERE clause, 2. columns in GROUP BY clause, 3. columns in HAVING clause, 4. columns in SELECT clause. So retrieving the data has first priority, aggregating second, and displaying results last.
Alright, I'll make a note of it. Thanks for the help.
I just noticed that I neglected the column SALE_ABSOLUTE_CDATE, which is also used in the SELECT clause. Add it to the index so that the index completely covers the query. (It's a very big index, but it makes access as fast as possible.)
Okay, but is there any order that I need to follow for the SELECT clause columns?
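The suggestion in the comments to add SALE_ABSOLUTE_CDATE could be applied roughly as follows. This is a sketch, not a tested migration: it rebuilds the existing index under the same name, and placing the new column last is an assumption. Columns that are only read, and not used for filtering or grouping, just need to be present in the index for it to cover the query, so their relative order at the end generally should not matter here.

```sql
-- Rebuild the composite index with SALE_ABSOLUTE_CDATE appended
-- (its position at the end is an assumption).
ALTER TABLE ERS_SALES_TRANSACTIONS
    DROP INDEX idx_quick_sales_transactions,
    ADD INDEX idx_quick_sales_transactions
        (ERS_COMPANY_CODE, SALE_SECTION, SALE_DATE,
         SALE_STOCKAGE_EXACT, SALE_NET_AMOUNT, SALE_ABSOLUTE_CDATE);
```

Re-running EXPLAIN on the full query afterwards and looking for `Using index` in the `Extra` column is one way to confirm whether the index is now used as a covering index.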
0
SELECT 
   sales.SALE_SECTION,
   SUM( fs1_pd1.SALE_NET_AMOUNT ) AS fs1_pd1_sale,
   SUM( fs2_pd1.SALE_NET_AMOUNT ) AS fs2_pd1_sale,
...
FROM ERS_SALES_TRANSACTIONS sales

LEFT OUTER JOIN ERS_SALES_TRANSACTIONS fs1_pd1 ON sales.ERS_COMPANY_CODE = fs1_pd1.ERS_COMPANY_CODE AND sales.SALE_SECTION = fs1_pd1.SALE_SECTION
  AND fs1_pd1.SALE_DATE BETWEEN '2016-01-16' AND '2016-04-30'
  AND fs1_pd1.SALE_STOCKAGE_EXACT BETWEEN 0 AND 90

LEFT OUTER JOIN ERS_SALES_TRANSACTIONS fs2_pd1 ON sales.ERS_COMPANY_CODE = fs2_pd1.ERS_COMPANY_CODE AND sales.SALE_SECTION = fs2_pd1.SALE_SECTION
  AND fs2_pd1.SALE_DATE BETWEEN '2016-01-16' AND '2016-04-30'
  AND fs2_pd1.SALE_STOCKAGE_EXACT BETWEEN 91 AND 180
...
   WHERE sales.ERS_COMPANY_CODE = 48
   GROUP BY sales.SALE_SECTION

This way, the optimizer can use more than one index for the query.

I suggest, however, first trying the composite index that @Thorsten Kettner recommends, as it may achieve the same effect with much less complexity.
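One caveat with joining the raw rows several times: every additional join multiplies the rows per section, so the sums can come out inflated. A possible variation, sketched here for the first two buckets only, aggregates each bucket in its own derived table before joining. The aliases `sections` and `amount` are illustrative and not from the original query, and no claim is made that this runs faster than the single-pass SUM(IF(...)) version.

```sql
SELECT sections.SALE_SECTION,
       COALESCE(fs1_pd1.amount, 0) AS fs1_pd1_sale,
       COALESCE(fs2_pd1.amount, 0) AS fs2_pd1_sale
FROM (
    -- One row per section of the company, so sections without sales still appear.
    SELECT DISTINCT SALE_SECTION
    FROM ERS_SALES_TRANSACTIONS
    WHERE ERS_COMPANY_CODE = 48
) sections
LEFT JOIN (
    -- Bucket 1: aggregated to one row per section before the join.
    SELECT SALE_SECTION, SUM(SALE_NET_AMOUNT) AS amount
    FROM ERS_SALES_TRANSACTIONS
    WHERE ERS_COMPANY_CODE = 48
      AND SALE_DATE BETWEEN '2016-01-16' AND '2016-04-30'
      AND SALE_STOCKAGE_EXACT BETWEEN 0 AND 90
    GROUP BY SALE_SECTION
) fs1_pd1 ON fs1_pd1.SALE_SECTION <=> sections.SALE_SECTION
LEFT JOIN (
    -- Bucket 2: same period, next stock-age range.
    SELECT SALE_SECTION, SUM(SALE_NET_AMOUNT) AS amount
    FROM ERS_SALES_TRANSACTIONS
    WHERE ERS_COMPANY_CODE = 48
      AND SALE_DATE BETWEEN '2016-01-16' AND '2016-04-30'
      AND SALE_STOCKAGE_EXACT BETWEEN 91 AND 180
    GROUP BY SALE_SECTION
) fs2_pd1 ON fs2_pd1.SALE_SECTION <=> sections.SALE_SECTION;
```

The null-safe `<=>` operator is used because SALE_SECTION is nullable; a plain `=` would never match a NULL section.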

1 Comment

Sorry, this query takes more time than my original query. I tested it both before and after adding the covering index.
