This question is more or less the same as this one: MySQL select rows that do not have matching column in other table; however, the solution there is not practical for large data sets.

This table has ~120,000 rows.

CREATE TABLE `tblTimers` (
  `TimerID` int(11) NOT NULL,
  `TaskID` int(11) NOT NULL,
  `UserID` int(11) NOT NULL,
  `StartDateTime` datetime NOT NULL,
  `dtStopTime` datetime NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

ALTER TABLE `tblTimers`
  ADD PRIMARY KEY (`TimerID`);
ALTER TABLE `tblTimers`
  MODIFY `TimerID` int(11) NOT NULL AUTO_INCREMENT;

This table has ~70,000 rows.

CREATE TABLE `tblWorkDays` (
  `WorkDayID` int(11) NOT NULL,
  `TaskID` int(11) NOT NULL,
  `UserID` int(11) NOT NULL,
  `WorkDayDate` date NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

ALTER TABLE `tblWorkDays`
  ADD PRIMARY KEY (`WorkDayID`);

ALTER TABLE `tblWorkDays`
  MODIFY `WorkDayID` int(11) NOT NULL AUTO_INCREMENT;

tblWorkDays should have one row per TaskID per UserID per WorkDayDate, but due to a bug, a few work days are missing even though timers exist for those days. I am trying to create a report that shows any timer that does not have a work day associated with it.

SELECT A.TimerID FROM tblTimers A
LEFT JOIN tblWorkDays B ON A.TaskID = B.TaskID AND A.UserID = B.UserID AND DATE(A.StartDateTime) = B.WorkDayDate
WHERE B.WorkDayID IS NULL

Doing this causes the server to time out. Is there a way to do this more efficiently?

  • As well as SHOW CREATE TABLE statements for all relevant tables, questions about query performance always require the EXPLAIN output for the given query. Commented Jun 19, 2020 at 19:14
  • @Strawberry Added CREATE TABLE statements. Commented Jun 19, 2020 at 19:27
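
For reference, the plan that the first comment asks for can be obtained by prefixing the question's query with EXPLAIN. This is only a sketch of how to gather that information; the actual output depends on the server version and the data, so none is shown here.

```sql
-- Show how MySQL plans to execute the anti-join from the question.
-- A "type" of ALL together with NULL in the "key" column for a table
-- indicates that the table is being read without using an index.
EXPLAIN
SELECT A.TimerID
FROM tblTimers A
LEFT JOIN tblWorkDays B
       ON A.TaskID = B.TaskID
      AND A.UserID = B.UserID
      AND DATE(A.StartDateTime) = B.WorkDayDate
WHERE B.WorkDayID IS NULL;
```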

1 Answer

You don't have any indexes on the columns you're joining on, so it has to do full scans of both tables. Try adding the following:

ALTER TABLE tblTimers ADD INDEX (TaskID, UserID);
ALTER TABLE tblWorkDays ADD INDEX (TaskID, UserID);
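
The join also compares DATE(A.StartDateTime) with B.WorkDayDate. Because the function is applied only on the tblTimers side, an index on tblWorkDays that includes WorkDayDate as a third column may let each lookup narrow to the exact day. The sketch below is an assumption to verify with EXPLAIN, not part of the original answer; the index name idx_task_user_date is hypothetical. It also shows the same report written with NOT EXISTS, which should return the same rows as the LEFT JOIN ... IS NULL form.

```sql
-- Hypothetical index name; the third column is an addition to the
-- two-column index suggested above, not part of the original answer.
ALTER TABLE tblWorkDays
  ADD INDEX idx_task_user_date (TaskID, UserID, WorkDayDate);

-- The same anti-join written with NOT EXISTS: select every timer for
-- which no work day exists with the same task, user, and date.
SELECT A.TimerID
FROM tblTimers A
WHERE NOT EXISTS (
  SELECT 1
  FROM tblWorkDays B
  WHERE B.TaskID = A.TaskID
    AND B.UserID = A.UserID
    AND B.WorkDayDate = DATE(A.StartDateTime)
);
```

Whether either variant is faster than the two-column indexes alone depends on the data distribution, so it is worth comparing the EXPLAIN output before and after.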

4 Comments

Query took 0.0031 seconds... well that is a big difference...
In general, whenever you're joining large tables, make sure you have indexes on the columns used in the join.
Thanks. Out of curiosity, are there any significant performance disadvantages to keeping indexes that you no longer use, rarely use, or that cover datasets too small to need them?
There's overhead, since they take up disk space and have to be updated when you modify the table data.
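
One way to gauge that overhead is to list the indexes on a table and compare the space used by data and by indexes. This is a minimal sketch using SHOW INDEX and information_schema.TABLES; the index name in the final statement is an assumption and should be replaced with the Key_name that SHOW INDEX reports.

```sql
-- List the indexes currently defined on a table, including their names.
SHOW INDEX FROM tblTimers;

-- Compare the bytes used by row data and by indexes for both tables
-- in the current database.
SELECT TABLE_NAME, DATA_LENGTH, INDEX_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('tblTimers', 'tblWorkDays');

-- Drop an index only once it is confirmed to be unused.
-- `TaskID` is an assumed name: substitute the Key_name from SHOW INDEX.
ALTER TABLE tblTimers DROP INDEX `TaskID`;
```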
