
I have this script running to check for duplicates in my table:

select s.id, t.* 
from [stuff] s
join (
   select name, city, count(*) as qty
   from [stuff]
   group by name, city
   having count(*) > 1
) t on s.name = t.name and s.city = t.city

This works fine and returns the IDs of the duplicate rows:

myresult = cur.fetchall()
print(myresult)

Example output:

[(84,), (85,), (339,), (340,), (351,), (352,), (416,), (417,), (511,), (512,), (532,), (533,), 
(815,), (816,), (978,), (979,), (1075,), (1076,), (1385,), (1386,), (1512,)]

Now I want to delete records 84, 339, 351, 416, etc. What would be the most convenient way to do so?

  • DELETE FROM PRODUCTS WHERE ... You don't need Python at all. By the way, in MySQL 8 and later it's easier and faster to use ROW_NUMBER(). Commented Feb 7, 2022 at 12:42
  • Delete from products where ID = 84, 339, 351 etc. But how do I select one of the two records? Can I fetch this in my current SQL statement somehow? I'm not an SQL expert, so it is a bit of trial and error for me. Commented Feb 7, 2022 at 12:53
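
The ROW_NUMBER() idea from the first comment could look roughly like the sketch below. It is only an illustration under assumptions: an in-memory SQLite database (3.25 or newer, for window functions) stands in for MySQL 8, and the sample rows are made up. The rows are numbered from the highest id down so that the lower id of each pair is deleted, which matches the ids listed in the question. With a MySQL driver the placeholder would typically be %s instead of ?.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the sample rows are invented.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE stuff (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.executemany(
    "INSERT INTO stuff (id, name, city) VALUES (?, ?, ?)",
    [
        (84, "Ann", "Oslo"), (85, "Ann", "Oslo"),
        (339, "Bob", "Rome"), (340, "Bob", "Rome"),
        (500, "Cid", "Lima"),  # no duplicate
    ],
)

# Number the rows inside each (name, city) group, highest id first.
# Row number 1 is the copy to keep; anything above 1 is surplus.
cur.execute("""
    SELECT id FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY name, city ORDER BY id DESC) AS rn
        FROM stuff
    ) AS numbered
    WHERE rn > 1
    ORDER BY id
""")
surplus_ids = [row[0] for row in cur.fetchall()]

# Delete the surplus rows by id, passing the ids as parameters.
cur.executemany("DELETE FROM stuff WHERE id = ?", [(i,) for i in surplus_ids])
conn.commit()

remaining_ids = [row[0] for row in cur.execute("SELECT id FROM stuff ORDER BY id")]
print(surplus_ids, remaining_ids)
```

Changing ORDER BY id DESC to ORDER BY id would keep the lowest id of each group instead.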

2 Answers


MySQL provides you with the DELETE JOIN statement that allows you to remove duplicate rows quickly.

The following statement deletes duplicate rows and keeps the highest id:

DELETE t1 FROM table_name t1
INNER JOIN table_name t2 
WHERE 
    t1.id < t2.id AND 
    t1.unique_col = t2.unique_col;

In case you want to delete duplicate rows and keep the lowest id, you can use the following statement:

DELETE t1 FROM table_name t1
INNER JOIN table_name t2 
WHERE
    t1.id > t2.id AND 
    t1.unique_col = t2.unique_col;
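
To try the keep-the-lowest-id rule from Python before touching real data, a small sketch like the following may help. It rests on assumptions: SQLite does not support MySQL's DELETE ... JOIN syntax, so the same condition is written as a correlated EXISTS subquery, and the stuff table with its name and city columns is filled with invented rows.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE stuff (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.executemany(
    "INSERT INTO stuff (id, name, city) VALUES (?, ?, ?)",
    [
        (84, "Ann", "Oslo"), (85, "Ann", "Oslo"),
        (339, "Bob", "Rome"), (340, "Bob", "Rome"), (341, "Bob", "Rome"),
        (500, "Cid", "Lima"),  # no duplicate
    ],
)

# Same rule as the keep-lowest-id statement: a row is removed when another
# row with the same name and city but a lower id exists.
cur.execute("""
    DELETE FROM stuff
    WHERE EXISTS (
        SELECT 1
        FROM stuff AS t2
        WHERE t2.name = stuff.name
          AND t2.city = stuff.city
          AND t2.id < stuff.id
    )
""")
deleted = cur.rowcount  # number of rows removed by the DELETE
conn.commit()

kept_ids = [row[0] for row in cur.execute("SELECT id FROM stuff ORDER BY id")]
print(deleted, kept_ids)
```

Flipping the comparison to t2.id > stuff.id keeps the highest id of each group instead.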


4 Comments

So for me that would be: DELETE t1 FROM stuff t1 INNER JOIN stuff t2 WHERE t1.id > t2.id AND t1.name = t2.name AND t1.city = t2.city
Yes, give it a try and let me know if it does not work.
OK, I will try with a SELECT first ;-) Just to be sure: SELECT t1.* FROM stuff t1 INNER JOIN stuff t2 WHERE t1.id > t2.id AND t1.name = t2.name AND t1.city = t2.city
Glad to know that. Please give an upvote.

You can remove duplicate rows in MySQL this way:

      DELETE FROM CUSTOMERS
      WHERE customer_id NOT IN
      (
          SELECT
          customer_id
          FROM
          (
              SELECT MIN(customer_id) as customer_id
              FROM CUSTOMERS
              GROUP BY CONCAT(first_name, last_name, phone)
          ) AS duplicate_customer_ids
      );
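
A minimal sketch of the NOT IN / MIN(customer_id) pattern, run from Python, is shown here. Several things are assumptions: an in-memory SQLite database stands in for MySQL, the customers rows are invented, and the grouping uses the three columns directly rather than CONCAT. The column grouping is a swapped-in variation, chosen because concatenation can make different rows look equal ('ab' + 'c' and 'a' + 'bc') and because MySQL's CONCAT returns NULL when any argument is NULL. The extra derived-table wrapper is omitted because SQLite does not need it.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, phone TEXT)"
)
cur.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "Lee", "555-0100"),
        (2, "Ann", "Lee", "555-0100"),
        (3, "Bob", "Ray", "555-0101"),  # no duplicate
        (4, "Ann", "Lee", "555-0100"),
    ],
)

# Keep the lowest customer_id of every (first_name, last_name, phone) group
# and delete every row whose id is not one of those.
cur.execute("""
    DELETE FROM customers
    WHERE customer_id NOT IN (
        SELECT MIN(customer_id)
        FROM customers
        GROUP BY first_name, last_name, phone
    )
""")
conn.commit()

kept = [row[0] for row in cur.execute(
    "SELECT customer_id FROM customers ORDER BY customer_id")]
print(kept)
```

In this stand-in, the row with no duplicate is kept as well, since MIN over a one-row group is that row's own id.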

   

2 Comments

You may use the MAX or MIN function, depending on your requirement.
Another thing: the example above covers the case where all of your records are duplicated. If some records are unique, you need to add another query with a UNION so that those unique records are not deleted. Something like: WHERE customer_id NOT IN .. UNION SELECT customer_id FROM ( SELECT MIN(customer_id) as customer_id FROM CUSTOMERS GROUP BY CONCAT(first_name, last_name, phone) having count(*) = 1 ) AS duplicate_customer_ids )
