17

I have the table below and need to delete the rows that have duplicate "refID" values, while keeping at least one row for each refID, i.e. I need to remove rows 4 and 5. Please help me with this.

+----+-------+--------+
| ID | refID |  data  |
+----+-------+--------+
|  1 |  1023 | aaaaaa |
|  2 |  1024 | bbbbbb |
|  3 |  1025 | cccccc |
|  4 |  1023 | ffffff |
|  5 |  1023 | gggggg |
|  6 |  1022 | rrrrrr |
+----+-------+--------+
5
  • Use min() or max() function Commented Feb 10, 2015 at 12:48
  • Refer to following question stackoverflow.com/questions/18932/… Commented Feb 10, 2015 at 12:50
  • Do you mean you want to select rows but exclude rows 4 and 5, or do you really want to delete them from your table? Commented Feb 10, 2015 at 12:52
  • 1
    This question may be a duplicate of something, but it is tagged MySQL and the referenced question used SQL Server syntax. Commented Feb 10, 2015 at 12:59
  • @jarlh I need to delete them Commented Feb 10, 2015 at 14:25

4 Answers 4

37

This is similar to Gordon Linoff's query, but without the subquery:

DELETE t1 FROM table t1
  JOIN table t2
  ON t2.refID = t1.refID
  AND t2.ID < t1.ID

This uses an inner join to only delete rows where there is another row with the same refID but lower ID.

The benefit of avoiding a subquery is being able to utilize an index for the search. This query should perform well with a multi-column index on refID + ID.
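Before running the DELETE, it can help to preview which rows the join would match. The following is a minimal sketch, not part of the original answer: it uses Python's built-in sqlite3 with an in-memory database purely as a runnable stand-in for MySQL, and the table name `demo` is hypothetical. The SELECT uses the same join condition as the DELETE and is plain SQL, so it should behave the same way in MySQL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the table name `demo` is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (ID INTEGER PRIMARY KEY, refID INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO demo (ID, refID, data) VALUES (?, ?, ?)",
    [(1, 1023, "aaaaaa"), (2, 1024, "bbbbbb"), (3, 1025, "cccccc"),
     (4, 1023, "ffffff"), (5, 1023, "gggggg"), (6, 1022, "rrrrrr")],
)

# Same join condition as the DELETE: a row qualifies when another row
# shares its refID and has a lower ID. DISTINCT collapses repeated matches.
rows_to_delete = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT t1.ID
        FROM demo t1
        JOIN demo t2
          ON t2.refID = t1.refID
         AND t2.ID < t1.ID
        ORDER BY t1.ID
        """
    )
]
print(rows_to_delete)
```

With the sample data from the question, the preview lists IDs 4 and 5, which are the rows the asker wants removed.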


3 Comments

To remove the rows but keep the ones with the highest ID instead of the lowest, just swap the condition around, i.e. AND t2.ID > t1.ID
Note that it doesn't work with temporary tables, because MySQL cannot refer to a temporary table more than once in the same query.
Be sure both refID and ID are indexed, otherwise it could take a lot of time.
5

I would do:

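-- MySQL rejects a subquery that reads the table being deleted from (error 1093),
-- so each subquery is wrapped in a derived table.
delete from t where
ID not in (select min_id from (select min(ID) as min_id from t group by refID having count(*) > 1) as keep_ids)
and refID in (select refID from (select refID from t group by refID having count(*) > 1) as dup_refs)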

The criteria are that refID is among the duplicated values and ID differs from the minimum ID of those duplicates. This works better if refID is indexed.

Otherwise, provided you can run a query repeatedly, issue the following one until it no longer deletes anything:

delete from t
where
ID in (select max_id from (select max(ID) as max_id from t group by refID having count(*) > 1) as dup_max)
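The "run it until nothing is deleted" idea can be sketched as a loop that checks the affected-row count after each pass. This is only an illustration under assumptions: it uses Python's built-in sqlite3 with an in-memory database as a stand-in for MySQL, and the table name `demo` is hypothetical. SQLite accepts the subquery on the same table directly, whereas MySQL may reject it unless the subquery is wrapped in a derived table.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the table name `demo` is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (ID INTEGER PRIMARY KEY, refID INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO demo (ID, refID, data) VALUES (?, ?, ?)",
    [(1, 1023, "aaaaaa"), (2, 1024, "bbbbbb"), (3, 1025, "cccccc"),
     (4, 1023, "ffffff"), (5, 1023, "gggggg"), (6, 1022, "rrrrrr")],
)

# Each pass removes the highest-ID row of every refID that still has duplicates.
# The loop stops on the first pass that deletes nothing.
passes = 0
while True:
    cur = conn.execute(
        "DELETE FROM demo WHERE ID IN ("
        "  SELECT MAX(ID) FROM demo GROUP BY refID HAVING COUNT(*) > 1)"
    )
    passes += 1
    if cur.rowcount == 0:
        break
conn.commit()

remaining = [row[0] for row in conn.execute("SELECT ID FROM demo ORDER BY ID")]
print(passes, remaining)
```

With the sample data, refID 1023 has three rows, so two passes delete a row each and a third pass confirms there is nothing left to remove. The lowest ID of each refID survives.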


5

Another variant, in some cases a bit faster than Marcus's and NJ73's answers:

DELETE ourTable 
FROM ourTable JOIN 
 (SELECT ID,targetField 
  FROM ourTable 
  GROUP BY targetField HAVING COUNT(*) > 1) t2 
ON ourTable.targetField = t2.targetField AND ourTable.ID != t2.ID;

Hope that helps someone. On big tables Marcus's answer stalls.

2 Comments

Note: which row's ID this keeps (from the duplicates) is not specified in MySQL, because SELECT ID,targetField .. could return any of the IDs from the duplicate rows. This works if either a) you don't care which of the duplicate rows is kept or b) you want to keep the one which MySQL's current implementation keeps [probably the first one encountered, but be sure to test with your query and MySQL version].
Also, this query would delete one of the duplicates. If there are 3 records with the same target field value, then you'd have to run this again to remove the second duplicate. You'll basically have to run this query over and over till all duplicates are removed.
4

In MySQL, you can do this with a join in delete:

delete t
    from table t left join
         (select min(id) as id
          from table t
          group by refId
         ) tokeep
         on t.id = tokeep.id
    where tokeep.id is null;

For each RefId, the subquery calculates the minimum of the id column (presumed to be unique over the whole table). It uses a left join for the match, so anything that doesn't match has a NULL value for tokeep.id. These are the ones that are deleted.
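The left-join logic described above can be checked with a SELECT before deleting anything. The following is a minimal sketch, not the answer author's code: it uses Python's built-in sqlite3 with an in-memory database as a runnable stand-in for MySQL, and the table name `demo` is hypothetical. The query keeps the same subquery and join and only swaps the DELETE for a SELECT.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the table name `demo` is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (ID INTEGER PRIMARY KEY, refID INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO demo (ID, refID, data) VALUES (?, ?, ?)",
    [(1, 1023, "aaaaaa"), (2, 1024, "bbbbbb"), (3, 1025, "cccccc"),
     (4, 1023, "ffffff"), (5, 1023, "gggggg"), (6, 1022, "rrrrrr")],
)

# `tokeep` holds the minimum ID per refID. Rows with no match in `tokeep`
# come back from the LEFT JOIN with a NULL tokeep.id; those are the duplicates.
unmatched = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.ID
        FROM demo t
        LEFT JOIN (SELECT MIN(ID) AS id FROM demo GROUP BY refID) tokeep
          ON t.ID = tokeep.id
        WHERE tokeep.id IS NULL
        ORDER BY t.ID
        """
    )
]
print(unmatched)
```

With the sample data from the question, the unmatched rows are IDs 4 and 5, so the lowest ID for each refID is the one that is kept.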
