0

I have following data:

Id: 1       Name:   apple       ForeignKey: 10
Id: 2       Name:   apple       ForeignKey: 10
Id: 3       Name:   apple       ForeignKey: 15
Id: 4       Name:   peach       ForeignKey: 11
Id: 5       Name:   peach       ForeignKey: 12

Rows with same Name and ForeignKey are duplicates in my case. Now I want to remove all the duplicates except one instance from the table.

In other words; I want to remove all rows but one, where Name and ForeignKey are equal.

In case with upper data only the row with Id 2 OR Id 1 should be removed.

With

select count(Name), Name, ForeignKey group by Name, ForeignKey having count(Name)>1

I am able to find items where more than 1 row with same Name and Foreign key exist, but how to get the IDs of those rows? And how to get the IDs of those rows except the first/last occurrence of that row with same Name and ForeignKey?

4
  • 1
    Which DBMS are you using? Postgres? Oracle? Commented May 13, 2014 at 14:10
  • 2
    Once you mopped the floor, remember to fix the leak - if you don't want duplicates in your database, apply a unique constraint against (Name,ForeignKey) (once you've removed the duplicates once) Commented May 13, 2014 at 14:18
  • @Damien_The_Unbeliever fixing the leak was the first I have done ;-) Commented May 13, 2014 at 14:25
  • possible duplicate of How do I delete duplicate data from SQL table Commented May 13, 2014 at 15:26

3 Answers 3

1

The answer if database specific, but here is how you can do it joining the table to itself:

delete t1
from yourtable t1
    join yourtable t2 on t1.id > t2.id
        and t1.name = t2.name 
        and t1.foreignkey = t2.foreignkey
Sign up to request clarification or add additional context in comments.

1 Comment

WOW, that was fast. Thank you again.
0

You can also do it with a CTE & window function, deleting the duplicate rows by counting the number of rows that are the same, and then deleting all but one.

SQL Fiddle demo

;WITH myvals
AS (
    SELECT [id]
        ,[name]
        ,[foreignkey]
        ,ROW_NUMBER() OVER (
            PARTITION BY [name]
            ,[foreignkey] ORDER BY [id]
                ,[name]
                ,[foreignkey]
            ) AS inst_count
    FROM yourtable
    )
DELETE
FROM myvals
WHERE inst_count > 1;

Comments

0

delete x from ( select *, rn=row_number() over (partition by name,foreignkey order by name) from yourtable ) x where rn > 1

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.