
We want to delete duplicate rows from a table in our MySQL database. We have tried many queries, but unfortunately none has succeeded so far. We found the following query in several posts, but it did not work either:

DELETE t1 FROM Raw_Validated_backup AS t1 INNER JOIN Raw_Validated_backup AS t2 
    ON t1.time_start=t2.time_start 
    AND t1.time_end=t2.time_end 
    AND t1.first_temp_lpn=t2.first_temp_lpn 
    AND t1.first_WL=t2.first_WL 
    AND t1.first_temp_lpn_validated=t2.first_temp_lpn_validated 
    AND t1.second_temp_lpn=t2.second_temp_lpn 
    AND t1.second_WL=t2.second_WL 
    AND t1.second_temp_lpn_validated=t2.second_temp_lpn_validated 
    AND t1.third_temp_lpn=t2.third_temp_lpn 
    AND t1.third_WL=t2.third_WL 
    AND t1.third_temp_lpn_validated=t2.third_temp_lpn_validated 
    AND t1.first_temp_rising=t2.first_temp_rising 
    AND t1.first_WR=t2.first_WR 
    AND t1.first_temp_rising_validated=t2.first_temp_rising_validated 
    AND t1.second_temp_rising=t2.second_temp_rising 
    AND t1.second_WR=t2.second_WR 
    AND t1.second_temp_rising_validated=t2.second_temp_rising_validated 
    AND t1.third_temp_rising=t2.third_temp_rising 
    AND t1.third_WR=t2.third_WR 
    AND t1.third_temp_rising_validated=t2.third_temp_rising_validated 
    AND t1.id<t2.id;

Message we receive after running the query: no errors, 0 rows affected, taking 40.4 s.

  • Incidentally, if operationally possible, it's often far quicker to create a new table, retaining just the rows you want to keep, and then dropping/archiving the old table and renaming the new one. Commented Jan 6, 2020 at 11:49
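The approach suggested in this comment could look roughly like the sketch below. It is only an illustration under assumptions: the table `dedup_copy_demo`, its three data columns, and the names `dedup_copy_demo_new` and `dedup_copy_demo_old` are hypothetical stand-ins for the real table, which has many more columns. Note that `CREATE TABLE ... LIKE` copies column and index definitions but not foreign keys, so those would need to be recreated separately if the real table has any.

```sql
-- Hypothetical demo table standing in for the real one.
CREATE TABLE dedup_copy_demo (
  id         INT PRIMARY KEY,
  time_start DATETIME,
  time_end   DATETIME,
  first_WL   INT
);

INSERT INTO dedup_copy_demo (id, time_start, time_end, first_WL) VALUES
  (1, '2020-01-01 00:00:00', '2020-01-01 00:10:00', 5),
  (2, '2020-01-01 00:00:00', '2020-01-01 00:10:00', 5),  -- duplicate of id 1
  (3, '2020-01-01 00:10:00', '2020-01-01 00:20:00', 7);

-- 1. Create an empty table with the same columns and indexes.
CREATE TABLE dedup_copy_demo_new LIKE dedup_copy_demo;

-- 2. Copy one row per group of duplicates (here: the row with the highest id).
INSERT INTO dedup_copy_demo_new
SELECT d.*
FROM dedup_copy_demo AS d
WHERE d.id IN (
  SELECT MAX(id)
  FROM dedup_copy_demo
  GROUP BY time_start, time_end, first_WL
);

-- 3. Swap the tables in a single statement; the original data
--    remains available as dedup_copy_demo_old until it is dropped.
RENAME TABLE dedup_copy_demo     TO dedup_copy_demo_old,
             dedup_copy_demo_new TO dedup_copy_demo;
```

Keeping the old table around under a different name makes it possible to compare row counts before dropping it.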

1 Answer


This query:

select max(id) id
from Raw_Validated_backup
group by <list of all the columns except id>

returns the highest id from each group of identical rows, which are the ids of the rows that you want to keep.
So delete the rest:

delete from Raw_Validated_backup
where id not in (
  select t.id from (
    select max(id) id
    from Raw_Validated_backup
    group by <list of all the columns except id>
  ) t
)

See the demo.
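To make the `NOT IN` query above concrete, here is a small self-contained sketch. The table `dedup_notin_demo` and its two data columns are hypothetical; the real `group by` list would contain every column except `id`. The extra derived table `t` is there because MySQL rejects a `DELETE` whose subquery reads directly from the table being deleted from (error 1093); wrapping the aggregate in a derived table works around that. The sample data includes `NULL` values to show that `GROUP BY` places rows with `NULL` in the same column into one group.

```sql
-- Hypothetical demo table with two pairs of duplicates, one containing NULLs.
CREATE TABLE dedup_notin_demo (
  id         INT PRIMARY KEY,
  time_start DATETIME,
  first_WL   INT
);

INSERT INTO dedup_notin_demo (id, time_start, first_WL) VALUES
  (1, '2020-01-01 00:00:00', 5),
  (2, '2020-01-01 00:00:00', 5),     -- duplicate of id 1
  (3, '2020-01-01 00:10:00', NULL),
  (4, '2020-01-01 00:10:00', NULL),  -- duplicate of id 3 (NULLs group together)
  (5, '2020-01-01 00:20:00', 7);

-- Keep the highest id of every group and delete everything else.
DELETE FROM dedup_notin_demo
WHERE id NOT IN (
  SELECT t.id FROM (
    SELECT MAX(id) AS id
    FROM dedup_notin_demo
    GROUP BY time_start, first_WL
  ) AS t
);

-- Rows with ids 2, 4 and 5 remain.
SELECT id FROM dedup_notin_demo ORDER BY id;
```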
Another option with a self join:

delete v1 
from Raw_Validated_backup v1 inner join Raw_Validated_backup v2
on v1.time_start = v2.time_start and v1.time_end = v2.time_end and .......
and v1.id < v2.id;

See a simplified demo.
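One possible reason a self join like this reports 0 rows affected, as in the question, is `NULL` values: `a = b` evaluates to `NULL` rather than true when either side is `NULL`, so duplicate rows that contain a `NULL` in any compared column never match each other. The question does not say whether the table contains `NULL`s, so this is an assumption. If it applies, MySQL's NULL-safe equality operator `<=>` can be used in place of `=`. A minimal sketch on a hypothetical table `dedup_join_demo`:

```sql
-- Hypothetical demo table; ids 3 and 4 are duplicates that contain a NULL.
CREATE TABLE dedup_join_demo (
  id         INT PRIMARY KEY,
  time_start DATETIME,
  first_WL   INT
);

INSERT INTO dedup_join_demo (id, time_start, first_WL) VALUES
  (1, '2020-01-01 00:00:00', 5),
  (2, '2020-01-01 00:00:00', 5),
  (3, '2020-01-01 00:10:00', NULL),
  (4, '2020-01-01 00:10:00', NULL),
  (5, '2020-01-01 00:20:00', 7);

-- <=> treats two NULLs as equal, so ids 1 and 3 are both deleted.
-- With plain = only id 1 would be deleted, because NULL = NULL is not true.
DELETE v1
FROM dedup_join_demo AS v1
INNER JOIN dedup_join_demo AS v2
   ON v1.time_start <=> v2.time_start
  AND v1.first_WL   <=> v2.first_WL
  AND v1.id < v2.id;

-- Rows with ids 2, 4 and 5 remain.
SELECT id FROM dedup_join_demo ORDER BY id;
```

This would also be consistent with the `group by` version succeeding where the `=` join did not, since `GROUP BY` puts `NULL`s into the same group.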


4 Comments

Use USING (field list) instead of ON; it is shorter and clearer.
Just tested and worked as wished. Thank you very much!
@Akina I agree that USING would be handy in this case, but then the last condition, v1.id < v2.id, should be moved to a WHERE clause.
"the last condition v1.id < v2.id should be moved to a WHERE clause" Of course.
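The variant discussed in these comments might look like the following sketch, again on a hypothetical table (`dedup_using_demo`) with only two data columns. `USING (...)` lists the columns that must be equal in both tables, and the `id` comparison moves to `WHERE` because it is not an equality on a shared column. `USING` compares with ordinary `=`, so rows containing `NULL` in a listed column would not be matched by this form.

```sql
-- Hypothetical demo table without NULLs.
CREATE TABLE dedup_using_demo (
  id         INT PRIMARY KEY,
  time_start DATETIME,
  first_WL   INT
);

INSERT INTO dedup_using_demo (id, time_start, first_WL) VALUES
  (1, '2020-01-01 00:00:00', 5),
  (2, '2020-01-01 00:00:00', 5),  -- duplicate of id 1
  (3, '2020-01-01 00:10:00', 7);

-- Same self join as before, with the equality columns listed once.
DELETE v1
FROM dedup_using_demo AS v1
INNER JOIN dedup_using_demo AS v2 USING (time_start, first_WL)
WHERE v1.id < v2.id;

-- Rows with ids 2 and 3 remain.
SELECT id FROM dedup_using_demo ORDER BY id;
```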
