1

I have the following database and want to delete the red ones because they are doubouled. So I have to check every row if another row is matching by pid, price, price_old, link and shop.

But how can I check that and how can I delete it then?

enter image description here

Maybe an easier way would be to generate a id from the values inside each row. So if the values inside a row would be equal also the id would be equal and who have only one value to compare with the other id's. Is that a better way? - If yes, how can I do that?

Greetings!

4 Answers 4

1

Do the fact you have no way for get thi distinct row you could add uniqie id using

ALTER TABLE my_table
 ADD id int NOT NULL AUTO_INCREMENT

Once done you could use not in where the id are not the min grouped by the value you need for define the duplication

delete from my_table 
where id  NOT in ( select min(id) from my_table
                  group by shop, link 
                  )
Sign up to request clarification or add additional context in comments.

1 Comment

I do not have an ID yet... How could I let me generate an ID from the values - pid, price, price_old, link and shop?
0

The simplest way is to run a distinct query:

select distinct pid, price, price_old, link, shop
from t;

You can create a new table using into. That is the simplest way. Because all columns are the same, MySQL doesn't offer a simple method to delete duplicate rows (while leaving one of them).

However, it is possible that your current results are generated by a query. If so, you can just add select distinct to the query. However, it would be better to fix the query so it doesn't generate duplicates. If this is the case, then ask another question with sample data, desired results (as text, not an image), and the query you are currently using.

2 Comments

No it doesn't come from an query. I have an cronjob who is running every 24hrs. But my problem is that this Cronjob would add the new data and some values will be still the same as in the cronjob before but some values will be different as in the cronjob before. So my only way is to delete all duplicated rows while leaving one of them.
@Jan . . . You should fix the cron job so it doesn't insert duplicate values. In addition, each row should (at a minimum) have a unique id (auto-incremented) and a creation date/time.
0

Test this first on a test table:

DELETE t1 
FROM t t1, t t2 
WHERE t1.id > t2.id AND t1.price = t2.price 
    AND t1.link = t2.link AND t1.shop = t2.shop 
    AND t1.price_old = t2.price_old;

Basically you are removing the one with the highest ID if those parameters are equal

Comments

0
select * from 
(select pid, price, price_old, link ,
row_number() over(partition by pid, price, price_old, link, shop order by pid) as rank
from my_table) temp
where temp.rank = 1

This Query will group by all the columns first and rank them. Duplicate rows will have rank > 1. It does not matter we take first or second row as both are copy of each other. We just take rows with rank 1. Rows that are not duplicate will also be having rank 1 and hence won't be neglected.

One more way to this is by using union.

select * from my_table UNION select * from my_table

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.