224

I'm trying to delete orphan entries in a mysql table.

I have 2 tables like this:

Table files:

id ....
1 ....
2 ....
7 ....
9 ....

table blob:

fileid ....
1 ....
2 ....
3 ....
4 ....
4 ....
4 ....
9 ....

The fileid and id columns can be used to join the tables together.

I want to delete all rows in table blob where fileid cannot be found in the table files.id.

So using the example above that would delete rows: 3 & 4(s) in the blob table.

1
  • 1
    Skip to the second answer if you are using nulls. Commented Apr 12, 2015 at 17:44

4 Answers 4

427

Using LEFT JOIN/IS NULL:

DELETE b FROM BLOB b 
  LEFT JOIN FILES f ON f.id = b.fileid 
      WHERE f.id IS NULL

Using NOT EXISTS:

DELETE FROM BLOB 
 WHERE NOT EXISTS(SELECT NULL
                    FROM FILES f
                   WHERE f.id = fileid)

Using NOT IN:

DELETE FROM BLOB
 WHERE fileid NOT IN (SELECT f.id 
                        FROM FILES f)

Warning

Whenever possible, perform DELETEs within a transaction (assuming supported - IE: Not on MyISAM) so you can use rollback to revert changes in case of problems.

Sign up to request clarification or add additional context in comments.

15 Comments

which is, in general, the fastest of the above?
For some reason, deleting using the LEFT JOIN didn't work on MS SQL Server Mgmt Studio (not sure why; it just complained about the LEFT JOIN). Anyone knows why is that? It worked using NOT EXISTS though :)
FYI, here's a useful discussion of the relative efficiency of these three methods: explainextended.com/2009/09/18/…
@Pacerier - "wrong" is a bit strong. To make sure people understand, the answers do work if fileidis non-nullable. Also, the third solution (NOT IN) only requires that f.id by non-nullable. Presumably that is a primary key, so it would be.
For people attempting this w/SQLite: see this answer
|
33
DELETE FROM blob 
WHERE fileid NOT IN 
       (SELECT id 
        FROM files 
        WHERE id is NOT NULL/*This line is unlikely to be needed 
                               but using NOT IN...*/
      )

8 Comments

What is "/*This line is unlikely to be needed but using NOT IN...*/" supposed to mean?
@Pacerier - NOT IN (NULL) returns an empty result set so NULLs need to be excluded. But an id column probably won't be nullable anyway hence "unlikely to be needed"
@bunkerdive Then use three part object names that include the database name.
@Pacerier - The important sentence is "This line is unlikely to be needed". id traditionally is a non-nullable primary key.
|
24
DELETE FROM blob
WHERE NOT EXISTS (
    SELECT *
    FROM files
    WHERE id=blob.id
)

1 Comment

I think there is a files.id and blob.fileid. I'm guessing your query will result in an error.
2
DELETE FROM <table> 
WHERE <row column you want to delete by > not in 
(select <column you want to compare other column> from <other table>)

This deletes the row when the first column doesn't appear on the second table

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.