As an assignment I need to clean up a movie database.

Some users have been deleted, and I have to remove the ratings from users who are no longer in the database.

I wrote this query:

DELETE rating FROM rating LEFT JOIN (SELECT id FROM user) as A ON A.id = rating.userId WHERE A.id IS NULL;

I've created indexes on rating.userId and user.id.

With 6,000 users and 1,000,000 ratings, this takes an insanely long time. Can anyone suggest how I can run this query, or an equivalent one, with better performance?

  • Isn't A.id = rating.userId WHERE A.id IS NULL equivalent to rating.userId IS NULL? Commented Nov 25, 2013 at 14:33
  • @njzk2 Not necessarily. I suspect this code is meant to delete rows from rating where the user no longer exists. If the database had referential integrity configured, OP probably wouldn't have to write this. Commented Nov 25, 2013 at 14:35
  • Can you quantify "Insanely long time"? Commented Nov 25, 2013 at 14:37
  • "Insanely long time" means not completing within 30 minutes. It completed in 23 seconds with @juergen-d's answer :) Commented Nov 25, 2013 at 14:47
  • Shouldn't you use something like ON DELETE CASCADE? Commented Nov 25, 2013 at 16:36
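
Following up on the ON DELETE CASCADE suggestion in the comments: a foreign key would keep orphaned ratings from appearing in the first place. This is a minimal sketch, assuming both tables use InnoDB, that rating.userId and user.id have matching column types, and that the existing orphans have already been deleted (otherwise adding the constraint fails). The constraint name fk_rating_user is made up for illustration.

```sql
-- Hypothetical constraint name; assumes InnoDB and that no orphaned ratings remain.
ALTER TABLE rating
  ADD CONSTRAINT fk_rating_user
  FOREIGN KEY (userId) REFERENCES user (id)
  ON DELETE CASCADE;
```

With a constraint like this in place, deleting a row from user also deletes that user's ratings.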

2 Answers

7

Remove the unnecessary subselect:

DELETE rating 
FROM rating 
LEFT JOIN user ON user.id = rating.userId 
WHERE user.id IS NULL
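
A likely reason for the difference (an assumption, since the question doesn't state the MySQL version): older MySQL versions materialize the derived table (SELECT id FROM user) into a temporary table without indexes, so the join cannot use the index on user.id. Comparing the plans of the equivalent SELECT statements should show whether that is happening:

```sql
-- Plan for the original form: the derived table A may be materialized without an index.
EXPLAIN
SELECT rating.userId
FROM rating
LEFT JOIN (SELECT id FROM user) AS A ON A.id = rating.userId
WHERE A.id IS NULL;

-- Plan for the direct join: user should be accessed through its index on id.
EXPLAIN
SELECT rating.userId
FROM rating
LEFT JOIN user ON user.id = rating.userId
WHERE user.id IS NULL;
```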

0

Run this SELECT query first and confirm that it returns the rows you want to delete:

SELECT * FROM rating WHERE UserId NOT IN (SELECT Id FROM user);

If they match, you can run the corresponding DELETE:

DELETE FROM rating WHERE UserId NOT IN (SELECT Id FROM user);
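
One caveat with NOT IN: if the subquery can return a NULL, NOT IN matches no rows at all. That shouldn't happen if user.Id is a primary key, but a NOT EXISTS variant avoids the issue regardless. A sketch using the same table and column names as above:

```sql
-- Deletes ratings whose UserId has no matching row in user.
DELETE FROM rating
WHERE NOT EXISTS (
  SELECT 1 FROM user WHERE user.Id = rating.UserId
);
```

Unlike the NOT IN form, this also deletes ratings whose UserId is NULL, so run the matching SELECT first if such rows might exist.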
