I'm trying to adapt the solutions here (SQL Delete Rows Based on Another Table) to my needs. E.g.,

DELETE
FROM complete_set
WHERE slice_name IN (SELECT slice_name FROM changes
                     GROUP BY slice_name HAVING COUNT(slice_name) > 1);

Table definitions:

  • Table1 ... Name: changes, Fields: Id, slice_name, slice_value, Rows: Approx. 100 Thousand.
  • Table2 ... Name: complete_set, Fields: Id, slice_name, slice_value, Rows: Approx. 3 million.

While running the query's components individually is extremely fast ...

E.g.,

SELECT slice_name 
FROM changes 
GROUP BY slice_name 
HAVING COUNT(slice_name) > 1;

(off-the-cuff about a second), and

DELETE FROM complete_set 
WHERE slice_name = 'ABC'

(also about a second, or so)

The above solution (w/ subquery) takes too long to execute to be useful. Is there an optimization I can apply here?

Thanks for the assist.

  • What version of MySQL? (Commented Jun 13, 2018 at 20:34)

2 Answers

One possible explanation for the slow delete is that it takes some time for MySQL to look up each slice_name in the complete_set table against the values in the subquery. We can try speeding this up as follows. First, create a new temporary table to replace the subquery, which will serve as a materialized view:

CREATE TEMPORARY TABLE changes_view
(PRIMARY KEY pkey (slice_name))
SELECT slice_name
FROM changes
GROUP BY slice_name
HAVING COUNT(slice_name) > 1;

Now phrase your delete using a join:

DELETE t1
FROM complete_set t1
INNER JOIN changes_view t2
    ON t1.slice_name = t2.slice_name;

The (intended) trick here is that the delete join should run fast because MySQL can quickly look up a slice_name value from the complete_set table in the materialized view table, since the latter has a primary key index on slice_name.
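Before running the delete, it may be worth checking how MySQL plans to execute the join. The sketch below assumes that complete_set does not already have an index on slice_name; the index name idx_slice_name is hypothetical, and the first statement can be skipped if such an index exists. The EXPLAIN uses a SELECT form of the same join, which matches the rows the delete would remove.

```sql
-- Assumption: complete_set has no index on slice_name yet.
-- idx_slice_name is a hypothetical name chosen for illustration.
CREATE INDEX idx_slice_name ON complete_set (slice_name);

-- Inspect the plan for the join without deleting anything.
EXPLAIN
SELECT t1.Id
FROM complete_set t1
INNER JOIN changes_view t2
    ON t1.slice_name = t2.slice_name;
```

If the plan shows an index lookup on one side of the join rather than a full scan of both tables, the delete join should behave similarly.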


1 Comment

Thx Tim. I'm going to try and implement this @ tonight's update.

If the table is very large, the query above can take a long time, because MySQL may run the inner query once for every row of the outer table during the deletion.
The deletion may be much quicker if an individual DELETE statement is written for each slice_name and the statements are then executed in a batch or sequentially.
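One way to produce those individual statements, as a rough sketch rather than a tested recipe, is to let MySQL generate them from the changes table. The column alias delete_stmt is made up for illustration; QUOTE() escapes each slice_name so it is safe inside a string literal.

```sql
-- Emits one DELETE statement per slice_name that appears more than once in changes.
-- delete_stmt is a hypothetical alias; nothing is deleted by this query itself.
SELECT CONCAT('DELETE FROM complete_set WHERE slice_name = ',
              QUOTE(slice_name), ';') AS delete_stmt
FROM changes
GROUP BY slice_name
HAVING COUNT(slice_name) > 1;
```

The output can be saved to a file and run as a script. Whether this beats a single delete join depends on the MySQL version and on the indexes available, so it is worth timing both on a copy of the data.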
