0

I searched on the internet but could not find any explanation for my question below. The following code is not working when I try to remove duplicate rows from a test table. The test table has no unique or primary keys on it, and is defined as test(id number). Can you explain me why it is not working?

delete from test a 
where a.rowid in (select b.rowid 
                  from test b 
                  where b.id = a.id 
                  and b.rowid != a.rowid);
5
  • this might help you clear out about rowid. stackoverflow.com/questions/2701782/rowid-oracle-any-use-for-it Commented Jul 8, 2014 at 12:46
  • possible duplicate of How can I remove duplicate rows? Commented Jul 8, 2014 at 12:48
  • 1
    What does "not working" mean? Commented Jul 8, 2014 at 12:58
  • Charlesliam I think I know what a rowid is, and Daniel no this is not a duplicate question, I am just searching for an explanation. And yet, I could not find a logical explanation why the above code is not working. If you provide me with a nice logical explanation I will be grateful. Thanks in advance. Commented Jul 8, 2014 at 13:03
  • 'Not working' means it does not remove the duplicate rows. Commented Jul 8, 2014 at 13:04

2 Answers 2

2

The normal way this query would work is more like:

delete from test a 
where a.rowid > (select min(b.rowid)
                 from test b 
                 where b.id = a.id
                );

Your query is removing no rows because of the correlation. You are asking for a.rowid being in a list where it is specifically excluded by b.rowid != a.rowid. Hence, no rows can match this condition -- of being in a list they are specifically excluded from.

Sign up to request clarification or add additional context in comments.

4 Comments

No, inner subquery just gives the rowid's that are different from the ones in the outer delete. And when you run the query in question, it just deletes 0 rows.
As far as I know, the inner subquery should return the result for each row of the table in the outer delete statement, and only after that the delete statement should execute. For instance, for the first row of the table in the outer delete, the delete statement should send the rowid of the first row to the inner subquery and then wait for the result that should return. And then the delete statement executes, in which case the subquery should return the rowid's other than the one that is passed to subquery (ie. b.rowid != a.rowid). Am I wrong? What is the error in this statement
When I run the subquery with replacing actual values of a.rowid and a.id from the queried table (ie. select b.rid from test2 b where b.id = 1 and b.rid != 'XXX'), the subquery returns exactly as expected (ie, it gives the rowid's other than the one that is passed to subquery). Yet using it in the delete statement as a subquery does not work. What is the reason for that?
@Gordon...Ok now I see why this sql is not working thank you Gordon.. I thought that the left hand side of the "in" is just for searching the rowid's in the list that is returned by the select statement. In which case the "in" statement cannot be defined as a logical expression.
1

This should help:

delete from test where rowid in
(
    select rowid from
      (select rowid,rank() over (partition by id order by rowid) rank 
       from test)temp_test 
    where temp_test.rank >1
);

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.