0

This is my query, I plan to run it in batches of perhaps 5000 hence the rownum < 5000

delete my_table
where  rownum < 5000 
and    type = 'Happy'
and    id not in 
       ( select max_id
         from   ( select max(log_id) max_id
                  ,      object_id
                  ,      type
                  from   my_table
                  where  type = 'Happy'
                  group
                  by     id
                  ,      type
                )
        )

I want to delete happy records but keeping the maximum log id, per object id

I hope that makes sense.

Should I be using some sort of join to improve performance?

2
  • I don't think there is much more left to do. You can use an index to help the sub query perform better (add all fields in the select to an index). Commented Jan 7, 2014 at 16:14
  • In the aggregating query you name object_id and type in the select clause, and id and type n the group by, which is not correct syntax. In any case, you could remove them from the select clause and just have the max(log_id) instead. The "select max_id ..." query level is redundant and can be removed. Commented Jan 7, 2014 at 16:25

3 Answers 3

3

I think this might run faster as a correlated subquery:

Delete
    from my_table
    where type = 'Happy' and
          exists (select 1
                  from my_table t2
                  where t2.object_id = my_table.object_id and
                        t2.type = my_table.type and
                        t2.id > my_table.id
                 );

Then, an index on my_table(object_id, type, id) might also help this query.

Sign up to request clarification or add additional context in comments.

4 Comments

And create a list partition on column type and use /*+ PARALLEL */ hint
@hashbrown are you serious? parallel hint on the table? Does it have any impact on rownum based delete?
@realspirituals .. yes, only if the table is partitioned.. see docs.oracle.com/cd/B10501_01/server.920/a96520/tuningpe.htm
@hashbrown Parallelism does not require partitioning anymore.
0

Since you only care to delete ANY 5000 log entries for type = 'Happy', as long as its not the most recent for any object_id, then you can do something like this:

delete
from my_table
where log_id in (
    select log_id from (
        select log_id, 
            row_number() over (partition by object_id order by log_id desc) rnk
        from my_table
        where typ = 'Happy' 
        and rownum <= 5000
    )
    where rnk > 1
)

This is different from what you have because in your approach, you still need to calculate the max(id) per object across the entire table, which isn't necessary (and log tables can get very large). You just need to make sure you're not deleting the "newest" row (per object) of the 5000 batch rows. Personally, I prefer to setup log tables using partitions, but not everyone has this option.

Hope that helps.

Comments

0

You could simplify the query to:

delete my_table
where  rownum < 5000 
and    type = 'Happy'
and    id not in (select   max(log_id) max_id
                  from     my_table
                  where    type = 'Happy'
                  group by object_id, type)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.