How can I delete duplicate rows from a table that holds data like the sample below? For each attribute id, I want to keep only the row with the latest updated_at.

For example:

attribute id | created at          | product_id
1            | 2020-04-28 15:31:11 | 112235
4            | 2020-04-28 15:30:25 | 112235
1            | 2020-04-29 15:30:25 | 112236
4            | 2020-04-29 15:30:25 | 112236
  • postgresqltutorial.com/… Commented May 3, 2020 at 9:16
  • Looks a bit like this: stackoverflow.com/questions/1313120/… Commented May 3, 2020 at 11:32
  • The approach from this link works, but why are rows with a different product_id deleted as well? Commented May 3, 2020 at 11:44
  • Unrelated to your problem, but: Postgres 9.3 is no longer supported, so you should plan an upgrade as soon as possible. Commented May 4, 2020 at 6:54

1 Answer

You can use an EXISTS condition.

delete from the_table t1
where exists (select *
              from the_table t2
              where t2.created_at > t1.created_at
                and t2.attribute_id = t1.attribute_id);

This deletes every row for which another row with the same attribute_id and a greater created_at value exists, so only the row with the highest created_at is kept for each attribute_id. Note that if two rows share the highest created_at value for an attribute_id, neither of them is deleted.
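To check this behaviour without touching a real database, here is a minimal sketch that runs the same EXISTS logic against the sample rows from the question. It rests on assumptions: an in-memory SQLite database stands in for PostgreSQL, timestamps are stored as ISO-formatted text (which sorts the same way as the dates), and the helper name keep_latest is hypothetical. The outer table is referenced by its full name instead of the alias t1 only to keep the statement portable to SQLite.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL. The statement mirrors the
# EXISTS delete above, with the outer table referenced by name, not alias.
DELETE_OLDER = """
DELETE FROM the_table
WHERE EXISTS (SELECT 1
              FROM the_table AS t2
              WHERE t2.attribute_id = the_table.attribute_id
                AND t2.created_at > the_table.created_at)
"""


def keep_latest(rows):
    """Load rows into a scratch table, run the delete, return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE the_table ("
        "attribute_id INTEGER, created_at TEXT, product_id INTEGER)"
    )
    conn.executemany("INSERT INTO the_table VALUES (?, ?, ?)", rows)
    conn.execute(DELETE_OLDER)
    remaining = conn.execute(
        "SELECT attribute_id, created_at, product_id "
        "FROM the_table ORDER BY attribute_id, product_id"
    ).fetchall()
    conn.close()
    return remaining


# Sample rows from the question.
sample = [
    (1, "2020-04-28 15:31:11", 112235),
    (4, "2020-04-28 15:30:25", 112235),
    (1, "2020-04-29 15:30:25", 112236),
    (4, "2020-04-29 15:30:25", 112236),
]
print(keep_latest(sample))

# Two rows tied on the highest created_at: neither is deleted.
tied = [
    (7, "2020-05-01 00:00:00", 1),
    (7, "2020-05-01 00:00:00", 2),
]
print(keep_latest(tied))
```

With the sample data, only the 2020-04-29 row survives for each attribute_id, while the tied pair is left untouched. If ties must be resolved as well, the comparison needs an extra tie-breaking column that is unique per row.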
