I have seen several questions on SO and improved my SQL query based on them, but it still takes anywhere from 3 to 12 seconds to execute, so the best time I can get is 3 seconds. The query looks like this:

SELECT ANALYSIS.DEPARTMENT_ID
    ,SCORE.ID
    ,SCORE.KPI_ SCORE.R_SCORE
    ,SCORE.FACTOR_SCORE
    ,SCORE.FACTOR_SCORE
    ,SCORE.FACTOR_SCORE
    ,SCORE.CREATED_DATE
    ,SCORE.UPDATED_DATE
FROM SCORE_INDICATOR SCORE
    ,AG_SENTIMENT ANALYSIS
WHERE SCORE.TAG_ID = ANALYSIS.ID
    AND ANALYSIS.ORGANIZATION_ID = 1
    AND ANALYSIS.DEPARTMENT_ID IN (1,2,3,4,5)
    AND DATE (ANALYSIS.REVIEW_DATE) BETWEEN DATE ('2016-05-02') AND DATE ('2017-05-02')
ORDER BY ANALYSIS.DEPARTMENT_ID

The first table, SCORE_INDICATOR, has 19345116 rows and the latter has 19057025 rows in total. I added an index on ORGANIZATION_ID, one on DEPARTMENT_ID, and another on the combination of ORGANIZATION_ID and DEPARTMENT_ID. Is there any other way to improve it, or is this the best I can achieve with this amount of data?

  • Do you have indexes for other columns used in the condition? Commented Jun 15, 2017 at 9:45
  • You don't need to define indexes for PK/FK as they are automatically indexed by MySQL. Commented Jun 15, 2017 at 9:46
  • Yes, some have indexes. Commented Jun 15, 2017 at 9:46
  • How about REVIEW_DATE? Commented Jun 15, 2017 at 9:46
  • Just to understand, why did you put a combined index on ORGANIZATION_ID and DEPARTMENT_ID? Commented Jun 15, 2017 at 9:47

2 Answers

Here is a checklist:

1) Make sure the logs table (ANALYSIS) uses the MyISAM engine (it's fast for OLAP queries).

2) Make sure that you've indexed the ANALYSIS.REVIEW_DATE field.

3) Make sure that ANALYSIS.REVIEW_DATE is of type DATE (not CHAR or VARCHAR).

4) Change the query (rearrange the query plan):

SELECT 
    ANALYSIS.DEPARTMENT_ID
    ,SCORE.ID
    ,SCORE.KPI_ SCORE.R_SCORE
    ,SCORE.FACTOR_SCORE
    ,SCORE.FACTOR_SCORE
    ,SCORE.FACTOR_SCORE
    ,SCORE.CREATED_DATE
    ,SCORE.UPDATED_DATE
FROM SCORE_INDICATOR SCORE
    ,AG_SENTIMENT ANALYSIS
WHERE 
    SCORE.TAG_ID = ANALYSIS.ID 
  AND
    ANALYSIS.REVIEW_DATE >= '2016-05-02' AND ANALYSIS.REVIEW_DATE < '2017-05-03'
  AND
    ANALYSIS.ORGANIZATION_ID = 1
  AND 
    ANALYSIS.DEPARTMENT_ID IN (1,2,3,4,5)
ORDER BY ANALYSIS.DEPARTMENT_ID;
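
Items 1 to 3 of the checklist can be checked with a few statements before touching the query. This is only a sketch: it assumes the table is AG_SENTIMENT as in the question, and the index name idx_review_date is made up for illustration.

```sql
-- Check which storage engine the table uses (see the Engine column).
SHOW TABLE STATUS LIKE 'AG_SENTIMENT';

-- Check the declared type of REVIEW_DATE (see the Type column).
SHOW COLUMNS FROM AG_SENTIMENT LIKE 'REVIEW_DATE';

-- List the existing indexes before adding a new one.
SHOW INDEX FROM AG_SENTIMENT;

-- Add an index on the date column if none exists yet.
-- idx_review_date is a hypothetical name, not taken from the question.
CREATE INDEX idx_review_date ON AG_SENTIMENT (REVIEW_DATE);
```

On a table with around 19 million rows, building the index may take a while, so it is probably worth trying on a copy first.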

2 Comments

Yes, it helped, but alas my manager didn't want to make those changes.
@vihangshah I don't think doing 2, 3 and 4 is a big change. Adding an index will make searching faster. Ensuring that the field is a DATE is also not a big change, and it's more desirable. Rearranging the query plan is not a change either: it lets the DB engine work effectively.

I have changed the order and style to JOIN syntax. The Score table seems to be the child of the Analysis table, which carries the primary criteria: all your criteria are based on qualifying Analysis records.

Now, the date filter. Wrapping a column in a DATE() function call does not help the optimizer, since an index on that column generally cannot be used for the comparison. So, to cover all possible date/time components, I have changed BETWEEN to >= the first date and LESS THAN one day beyond the end. In your example, comparing up to DATE('2017-05-02') is the same as LESS THAN '2017-05-03', which includes 2017-05-02 up to 23:59:59, and the index on the date can be applied better.

Now for the index. A compound index based on the fields used for the criteria, the join and the ORDER BY might help:

AG_SENTIMENT table... index ON (Organization_ID, Department_ID, Review_Date, ID)
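
Written out as a statement, the index could look roughly like this. The index name idx_org_dept_date_id is only a placeholder I picked, and the column names are taken from the query in the question.

```sql
-- Compound index on the filter columns, plus ID so the join key
-- can be read from the index itself.
-- idx_org_dept_date_id is a hypothetical name.
CREATE INDEX idx_org_dept_date_id
    ON AG_SENTIMENT (ORGANIZATION_ID, DEPARTMENT_ID, REVIEW_DATE, ID);
```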

SELECT 
        ANALYSIS.DEPARTMENT_ID,
        SCORE.ID,
        SCORE.KPI_ SCORE.R_SCORE,
        SCORE.FACTOR_SCORE,
        SCORE.FACTOR_SCORE,
        SCORE.FACTOR_SCORE,
        SCORE.CREATED_DATE,
        SCORE.UPDATED_DATE
    FROM 
        AG_SENTIMENT ANALYSIS
            JOIN SCORE_INDICATOR SCORE
                ON ANALYSIS.ID = SCORE.TAG_ID
    where 
            ANALYSIS.ORGANIZATION_ID = 1
        AND ANALYSIS.DEPARTMENT_ID IN (1,2,3,4,5)
        AND ANALYSIS.REVIEW_DATE >= '2016-05-02'
        AND ANALYSIS.REVIEW_DATE < '2017-05-03'
    ORDER BY 
        ANALYSIS.DEPARTMENT_ID
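
To see whether the optimizer really picks up an index for the Analysis side, you can put EXPLAIN in front of the query. A sketch, with the SELECT list shortened to a few columns for readability:

```sql
-- Show the execution plan instead of running the query.
EXPLAIN
SELECT
        ANALYSIS.DEPARTMENT_ID,
        SCORE.ID,
        SCORE.FACTOR_SCORE
    FROM
        AG_SENTIMENT ANALYSIS
            JOIN SCORE_INDICATOR SCORE
                ON ANALYSIS.ID = SCORE.TAG_ID
    WHERE
            ANALYSIS.ORGANIZATION_ID = 1
        AND ANALYSIS.DEPARTMENT_ID IN (1,2,3,4,5)
        AND ANALYSIS.REVIEW_DATE >= '2016-05-02'
        AND ANALYSIS.REVIEW_DATE < '2017-05-03'
    ORDER BY
        ANALYSIS.DEPARTMENT_ID;
```

In the output, the key column shows which index was chosen for each table, and "Using index" in the Extra column for the ANALYSIS row suggests the index is covering that table's part of the query. It is also worth checking that the SCORE row uses an index on TAG_ID for the join.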

7 Comments

One thing I didn't understand: why do I need to add ID to the index when it is already the PK?
@vihangshah, by including it, the index becomes a COVERING INDEX (you can look that up). Basically, it includes all the fields needed from the analysis table within the index, so the engine does not have to go back to the raw data pages, which could have time impacts if left out. You can try it both ways: one with, one without.
Hmm, that helped me reduce my time by 1 second.
@vihangshah, ok, one second, but isn't that a 33% improvement from 3 seconds :)
Hmm, yes. But in the worst case it is 10 seconds, so in that case it is a 20% improvement.