Delete Duplicate Data on PostgreSQL

Question

How can I delete duplicate rows from a table that contains data like the sample below? For each attribute id, I want to keep only the row with the latest updated_at.

The data looks like this:

attribute id | created at          | product_id
1            | 2020-04-28 15:31:11 | 112235
4            | 2020-04-28 15:30:25 | 112235
1            | 2020-04-29 15:30:25 | 112236
4            | 2020-04-29 15:30:25 | 112236

Looks a bit like this: stackoverflow.com/questions/1313120/…
– sequoia, Commented May 3, 2020 at 11:32
The approach from that link works, but why does it also delete rows that have a different product_id?
– Deniel Lin, Commented May 3, 2020 at 11:44
Unrelated to your problem, but Postgres 9.3 is no longer supported; you should plan an upgrade as soon as possible.
– user330315, Commented May 4, 2020 at 6:54

user330315 · Accepted Answer · 2020-05-04 06:35:03Z

2

You can use an EXISTS condition.

delete from the_table t1
where exists (select *
              from the_table t2
              where t2.created_at > t1.created_at
                and t2.attribute_id = t1.attribute_id);

This deletes every row for which another row with the same attribute_id and a greater created_at exists, so only the row with the highest created_at is kept for each attribute_id. Note that rows with identical created_at values never cause each other to be deleted: if several rows tie for the highest created_at of an attribute_id, all of them are kept.
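If tied created_at values are possible and exactly one row per attribute_id should survive, one option is to add a tiebreaker to the comparison. The sketch below is not part of the original answer; it assumes the table has no other unique column and therefore falls back on PostgreSQL's system column ctid (the physical row location) to break ties. The scratch table and the extra fifth row (product_id 112237) are hypothetical and exist only to produce a tie.

```sql
-- Hypothetical scratch table mirroring the sample data from the question.
create temporary table the_table (
    attribute_id integer,
    created_at   timestamp,
    product_id   integer
);

insert into the_table (attribute_id, created_at, product_id) values
    (1, '2020-04-28 15:31:11', 112235),
    (4, '2020-04-28 15:30:25', 112235),
    (1, '2020-04-29 15:30:25', 112236),
    (4, '2020-04-29 15:30:25', 112236),
    (4, '2020-04-29 15:30:25', 112237);  -- added for illustration: ties on created_at

-- Delete a row if a "later" row exists for the same attribute_id.
-- "Later" means a greater created_at, or the same created_at and a
-- greater ctid (assumption: ctid is used only as an arbitrary tiebreaker).
delete from the_table t1
where exists (select *
              from the_table t2
              where t2.attribute_id = t1.attribute_id
                and (t2.created_at > t1.created_at
                     or (t2.created_at = t1.created_at
                         and t2.ctid > t1.ctid)));

-- Exactly one row per attribute_id should remain.
select * from the_table order by attribute_id;
```

Which of the tied rows survives is arbitrary with this tiebreaker. If the table has a primary key or another unique column, comparing on that column instead of ctid makes the choice predictable.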
