How to remove duplicate rows in postgresql? [closed]

Question

Closed. This question needs to be more focused. It is not currently accepting answers.

Closed 7 years ago.

I would like to remove duplicate entries in PostgreSQL.

There is no unique constraint, but I would like to treat a row as a duplicate when all of its columns match another row.

So we have a table containing the following rows:

id | name  | age | started_date | score
---|-------|-----|--------------|------
1  | tom   | 15  | 01/06/2022   | 5
2  | tom   | 15  | 01/06/2022   | 5
3  | henry | 10  | 01/06/2022   | 4
4  | john  | 11  | 01/06/2022   | 6
...

How can I identify the duplicate rows, considering all columns together, and remove them in PostgreSQL?

What have you tried? What's your schema? What's the sample data?
– David Brossard, Commented Jun 10, 2018 at 6:28
What do you mean by "remove"? In a query or in the data itself?
– Gordon Linoff, Commented Jun 10, 2018 at 11:33

Mureinik · Accepted Answer · 2018-06-10 06:51:27Z

7

Every row in a PostgreSQL table has a ctid system column that identifies the row's physical location. You can use it to tell apart rows that otherwise hold the same values:

-- Create the table
CREATE TABLE my_table (num1 NUMERIC, num2 NUMERIC);

-- Create duplicate data
INSERT INTO my_table VALUES (1, 2);
INSERT INTO my_table VALUES (1, 2);

-- Remove duplicates
DELETE FROM my_table
WHERE ctid IN (SELECT ctid
               FROM   (SELECT ctid,
                              ROW_NUMBER() OVER (
                                PARTITION BY num1, num2) AS rn
                       FROM   my_table) t
               WHERE  rn > 1);
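
The same pattern could be applied to the table from the question. This is only a sketch under assumptions: the table name `people` is hypothetical, and `id` is left out of the `PARTITION BY` list because the two "tom" rows in the sample differ in `id`, so they would never count as duplicates if it were included. The `ORDER BY id` is an added choice so that the row with the lowest `id` in each group is the one that survives.

```sql
-- Hypothetical table name; column names are taken from the question's sample.

-- Preview first: rows with rn > 1 are the extra copies that would be deleted.
SELECT ctid, id, name, age, started_date, score,
       ROW_NUMBER() OVER (
         PARTITION BY name, age, started_date, score
         ORDER BY id) AS rn
FROM   people;

-- Delete every copy except the first one in each group of identical values.
DELETE FROM people
WHERE ctid IN (SELECT ctid
               FROM   (SELECT ctid,
                              ROW_NUMBER() OVER (
                                PARTITION BY name, age, started_date, score
                                ORDER BY id) AS rn
                       FROM   people) t
               WHERE  rn > 1);
```

Running the `SELECT` before the `DELETE` is a cheap way to check that the partition columns match what you mean by "duplicate".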

Gaurav · Accepted Answer · 2018-06-10 06:33:28Z

0

Say your table has two columns. You can identify the duplicates using the query below. After that:

1) Insert this result into a temp table

2) Delete the duplicated rows from the main table

3) Insert the data from the temp table back into the main table

4) Drop the temp table.

select col1, col2, count(*) as cnt
from table1
group by col1, col2
having count(*) > 1
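
The four steps could be sketched roughly as follows. This is one possible reading of the answer, not necessarily what its author had in mind: the temp table name `table1_dups` is hypothetical, the whole thing is wrapped in a transaction as an added precaution, and the filter is written as `HAVING COUNT(*) > 1` because PostgreSQL does not accept an output-column alias such as `cnt` in a `HAVING` clause.

```sql
BEGIN;

-- 1) Keep one copy of every duplicated (col1, col2) combination.
--    table1_dups is a hypothetical name.
CREATE TEMP TABLE table1_dups AS
SELECT col1, col2
FROM   table1
GROUP  BY col1, col2
HAVING COUNT(*) > 1;

-- 2) Delete all copies of those rows from the main table.
--    Note: "=" never matches NULLs; if the columns are nullable,
--    IS NOT DISTINCT FROM would be needed instead.
DELETE FROM table1 t
USING  table1_dups d
WHERE  t.col1 = d.col1
AND    t.col2 = d.col2;

-- 3) Put a single copy of each back.
INSERT INTO table1 (col1, col2)
SELECT col1, col2
FROM   table1_dups;

-- 4) Clean up.
DROP TABLE table1_dups;

COMMIT;
```

Rows that were never duplicated are not touched at any point, since step 2 only removes rows that match an entry in the temp table.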

2 Comments

yome Over a year ago

Is there a way to delete found duplicate data directly from the table?

Gaurav Over a year ago

The SQL mentioned in the answer will do the same.
