
I need to delete rows from a table based on a key, and there may be around 5 million matching rows. There are two ways to go about it: one using Hibernate, the other a direct SQL query.

Hibernate:

List<Employee> empList =
        getHibernateTemplate()
        .findByNamedParam("from Employee emp where emp.Id = :empId", "empId", employeeId);

// Every matching Employee is loaded first, then deleted one entity at a time.
for (Employee empTran : empList) {
    getHibernateTemplate().delete("Employee", empTran);
}

SQL:

delete from Employee where Id = employeeId;

Which of the two will be faster? And can the Hibernate version be tuned further?

(Please ignore syntax errors if any)

  • I guess you're deleting by primary key? If yes, then why are you using lists? Commented Feb 23, 2012 at 13:08
  • Be careful here, since the SQL statement will NOT remove entries from the first and second level cache! You can end up with data inconsistency. Commented Feb 23, 2012 at 14:04

1 Answer


The second one will definitely be faster, and by a significant margin: the first approach loads every matching row and then executes a separate delete for each one, whereas the second sends a single statement and lets your DBMS handle the rest.

You can also execute the second one through Hibernate, using HQL.

Edit: By the way, if you run your "DELETE FROM" query once for every ID, it will be almost as slow as the first approach, except that you won't be iterating over whole Employee records, which is better :) Use SQL's IN() operator instead:

"delete from Employee where Id IN(3,5,8,...);"
"delete from Employee where Id IN(SELECT Id FROM table...);"

Try to minimize the number of SQL queries you execute. If you are still not satisfied with the performance after that, improve the query itself rather than the surrounding program code.
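
To illustrate the IN() advice, here is a minimal sketch in plain JDBC of how a list of IDs could be deleted with a few statements instead of one statement per ID. The class name `BulkDelete`, its helper methods, and the chunk size are hypothetical and not part of the question. The chunking is an assumption on my part, since some databases limit how many values an IN list may contain.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: deletes Employee rows by Id using IN (...) in chunks.
final class BulkDelete {

    // Builds "delete from Employee where Id in (?, ?, ...)" with `count` placeholders.
    static String buildDeleteSql(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be >= 1");
        }
        return "delete from Employee where Id in ("
                + String.join(", ", Collections.nCopies(count, "?")) + ")";
    }

    // Splits ids into consecutive chunks of at most chunkSize elements.
    static List<List<Long>> partition(List<Long> ids, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        List<List<Long>> chunks = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, ids.size());
            chunks.add(new ArrayList<>(ids.subList(i, end)));
        }
        return chunks;
    }

    // Executes one DELETE per chunk and returns the total row count reported by the driver.
    static int deleteByIds(Connection conn, List<Long> ids, int chunkSize) throws SQLException {
        int deleted = 0;
        for (List<Long> chunk : partition(ids, chunkSize)) {
            try (PreparedStatement ps = conn.prepareStatement(buildDeleteSql(chunk.size()))) {
                for (int i = 0; i < chunk.size(); i++) {
                    ps.setLong(i + 1, chunk.get(i)); // JDBC parameter indexes start at 1
                }
                deleted += ps.executeUpdate();
            }
        }
        return deleted;
    }
}
```

With five IDs and a chunk size of 2, this would send three DELETE statements rather than five. If you only ever delete by a single key value, as in the question, none of this is needed and the single statement is enough.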


5 Comments

Thanks. Do you think using HQL to execute the second one would be better than using plain SQL?
Executing that statement via Hibernate or via JDBC makes no tangible difference. Hibernate may be 1-2 milliseconds slower because it parses the query itself before passing it to JDBC, so there is one more layer and a few more function calls. But it's worth it, and the overhead is constant: whether you are deleting 1 record or 1 million records, from a table of 10 records or 1 billion, that overhead doesn't change; only the time your DBMS needs to perform the deletion does. Also, you should make your ID the primary key; that's very important.
Thanks, it helped me a lot. I will implement it in HQL. I can't make it the primary key, though: you wouldn't have a million records with the same primary key! I actually have a different primary key. Anyway, a billion thanks.
I've added some extra information to make my answer more accurate, please check it out.
I will not be using more than 1 id in my query. Anyway thanks for the update.
