0

I have inherited some code, the original author is not contactable and I would be extremely grateful for any assistance as my own MySQL knowledge is not great.

I have the following query that is taking around 4 seconds to execute, there is only around 20,000 rows of data in all the tables combined so I suspect the query could be made more efficient, perhaps by splitting it into more than one query, here it is:

    SELECT SQL_CALC_FOUND_ROWS ci.id AS id, ci.customer AS customer, ci.installer AS installer, ci.install_date AS install_date, ci.registration AS registration, ci.wf_obj AS wf_obj, ci.link_serial AS link_serial, ci.sim_serial AS sim_serial, sc.call_status AS call_status 
    FROM ap_servicedesk.corporate_installs AS ci 
    LEFT JOIN service_calls AS sc ON ci.wf_obj = sc.wf_obj
    WHERE ci.acc_id = 3 
    GROUP BY ci.id
    ORDER BY  link_serial
                asc
    LIMIT 40, 20

Can anyone spot any way to make this more efficient, thanks.

(Some values are set as variables but running the above query in PHPMyAdmin takes ~4secs)

The id column is the primary index.

More Info as requested:

corporate_installs table:

Field       Type    Null    Key Default Extra

id              int(11) NO  PRI NULL    auto_increment
customer        varchar(800)    NO      NULL    
acc_id      varchar(11) NO      NULL    
installer       varchar(50) NO      NULL    
install_date    varchar(50) NO      NULL    
address_name    varchar(30) NO      NULL    
address_street  varchar(40) NO      NULL    
address_city    varchar(30) NO      NULL    
address_region  varchar(30) NO      NULL    
address_post_code   varchar(10) NO      NULL    
latitude            varchar(15) NO      NULL    
longitude           varchar(15) NO      NULL    
registration    varchar(50) NO      NULL    
driver_name         varchar(50) NO      NULL    
vehicle_type    varchar(50) NO      NULL    
make            varchar(50) NO      NULL    
model           varchar(50) NO      NULL    
vin                 varchar(50) NO      NULL    
wf_obj          varchar(50) NO      NULL    
link_serial         varchar(50) NO      NULL    
sim_serial          varchar(50) NO      NULL    
tti_inv_no          varchar(50) NO      NULL    
pro_serial          varchar(50) NO      NULL    
eco_serial          varchar(50) NO      NULL    
eco_bluetooth   varchar(50) NO      NULL    
warranty_expiry varchar(50) NO      NULL    
project_no          varchar(50) NO      NULL    
status          varchar(15) NO      NULL    

service_calls table:

Field           Type           Null Key Default Extra
id                  int(11)     NO      PRI NULL    auto_increment
acc_id          int(15)         NO      NULL    
ciid            int(11)         NO      NULL    
installer_job_no    varchar(50) NO      NULL    
installer_inv_no    varchar(50) NO      NULL    
engineer            varchar(50) NO      NULL    
request_date    varchar(50) NO      NULL    
completion_date varchar(50) NO      NULL    
call_status         varchar(50) NO      NULL    
registration    varchar(50) NO      NULL    
wf_obj          varchar(50) NO      NULL    
driver_name         varchar(50) NO      NULL    
driver_phone    varchar(50) NO      NULL    
team_leader_name    varchar(50) NO      NULL    
team_leader_phone   varchar(50) NO      NULL    
servicing_address   varchar(150)    NO      NULL    
region          varchar(50) NO      NULL    
post_code           varchar(50) NO      NULL    
latitude            varchar(50) NO      NULL    
longitude           varchar(50) NO      NULL    
incident_no         varchar(50) NO      NULL    
service_type    varchar(20) NO      NULL    
fault_description   varchar(50) NO      NULL    
requested_action    varchar(50) NO      NULL    
requested_replacemt varchar(100)    NO      NULL    
fault_detected  varchar(50) NO      NULL    
action_taken    varchar(50) NO      NULL    
parts_used          varchar(50) NO      NULL    
new_link_serial varchar(50) NO      NULL    
new_sim_serial  varchar(50) NO      NULL    

(Apologies for the formatting, I did the best I could)

Let me know if you need more info thanks.

Further info (I did the query again with EXPLAIN):

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  ci  ALL acc_id  NULL    NULL    NULL    7227    Using where; Using temporary; Using filesort
1   SIMPLE  sc  ALL NULL    NULL    NULL    NULL    410 
2
  • Can you also provide the table structure as your issue may be more so there rather than the query Commented Feb 16, 2012 at 11:27
  • Perhaps better on code review or database administrators Commented Feb 16, 2012 at 11:29

3 Answers 3

2

Add indices on the two wf_obj columns, the link_serial column (you may also need an index on the acc_id, too).

Then try this version:

SELECT ...
FROM 
      ( SELECT * 
        FROM ap_servicedesk.corporate_installs
        WHERE acc_id = 3 
        ORDER BY link_serial ASC
        LIMIT 60
      ) AS ci 
  LEFT JOIN service_calls AS sc 
    ON sc.PK =                          --- the PRIMARY KEY of the table
    ( SELECT PK
      FROM service_calls AS scm
      WHERE ci.wf_obj = scm.wf_obj
      ORDER BY scm.    --- whatever suits you
      LIMIT 1
    )
ORDER BY ci.link_serial ASC
LIMIT 20  OFFSET 40

The ORDER BY scm.SomeColumn is needed not for performance but to get consistent results. Your query as it is, is joining a row from the first table to all related rows of the second table. But the final GROUP BY aggregates all these rows (of the second table), so your SELECT ... sc.call_status picks a more or less random call_status from one of these rows.

Sign up to request clarification or add additional context in comments.

5 Comments

Thanks, I'm getting an error: #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SELECT PK FROM service_calls AS scm WHERE ci.wf_obj = scm.wf_obj ' at line 11
Don't write PK. Put the Primary Key column there instead.
Many thanks, seems much quicker now, let me get it working properly, will report back.
It appears that your advice regards indices has solved the problem, query now only 0.05s still using my old query, yours is marginally quicker still. Many thanks again, to everyone who contributed.
If you keep your original query, keep in mind that call_status may give a somewhat random result, due to the GROUP BY.
2

The first place I'd look on this would have to be indexes.

There is a group on ci.id which is the PK which is fine, however you are ordering by link_ser (source table unspecified) and you are selecting based on ci.acc_id.

If you add an extra key on the table corp_installs for the field acc_id then that alone should help increase performance as it will be usable for the WHERE clause.

Looking further you have ci.wf_obj = sc.wf_obj within the join. Joining on a VARCHAR will be SLOW, and you are not actually using this as part of the selection criteria and so a SUBQUERY may be your friend, consider the following

SELECT
  serviceCallData.*,
  sc.call_status AS call_status

FROM (
  SELECT 
    SQL_CALC_FOUND_ROWS AS found_rows,
    ci.id AS id, 
    ci.customer AS customer, 
    ci.installer AS installer, 
    ci.install_date AS install_date, 
    ci.registration AS registration, 
    ci.wf_obj AS wf_obj, 
    ci.link_serial AS link_serial, 
    ci.sim_serial AS sim_serial

  FROM ap_servicedesk.corporate_installs AS ci 
  WHERE ci.acc_id = 3 
  GROUP BY ci.id
  ORDER BY ci.link_serial ASC
  LIMIT 40, 20
) AS serviceCallData
LEFT JOIN serice_calls AS sc ON serviceCallData.wf_obj = sc.wf_obj

In addition to this, change that (acc_id) key to be (acc_id, link_serial) as then it will be usable in the sort. Also add a key on (wf_obj) into serice_calls.

This will select the 20 rows from the corpoprate_installs table and then only join them onto the service_calls table using the inefficient VARCHAR join

I hope this is of help

2 Comments

Thanks, it's 2 tenths of a second quicker now but I had hoped for a bit better. Any more advice is appreciated.
Having looked at your schema, the biggest alarm bell is your LEFT JOIN on ci.wf_obj = sc.wf_obj, where both are VARCHAR(50). This will be slow especially on larger datasets. Another small change you could make is to add a key for wf_obj into the service_calls table, and change that earlier key to be acc_id, wf_obj however this would probably not get a great improvement. Will update answer shortly
1

I think the SQL_CALC_FOUND_ROWS option used with a join and a group by could be degrading the performance (look here for some tests, info on SQL_CALC_FOUND_ROWS here). It seems in facts that indexes are not used in that case.

Try replacing your query with two separate queries, the one with the LIMIT followed by a COUNT().

1 Comment

Thanks, funnily enough I'd read that article before posting on here, with the amount of data I'm working with I doubt that splitting the query up would do much but I will bear this in mind once we start getting more data. Appreciate the advice.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.