
All,

I have to redesign an existing logging system used in a web application. The existing system reads records from an Excel sheet, validates each one, inserts an error message into the database as soon as each error is found, and displays the results for all records at the end. So,

If I have 2 records in the Excel sheet, R1 and R2, and both fail with 3 validation errors each, an insert query is fired 6 times (once per validation message), and the user sees all 6 messages at the end of the validation process.

This method worked for smaller sets of entries, but for 20,000 records it has obviously become a bottleneck.

As per my initial redesign approach, these are the options I would like suggestions on from everyone at SO:

1> Create a custom logger class holding all the information required for logging. For each record in error, store the record ID as the key and the logger object as the value in a HashMap. When all the records have been processed, perform the database inserts for everything in the HashMap in one shot.
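Option 1 could be sketched roughly as below. This is only an illustration of the accumulate-then-flush idea, not the actual system: the class and method names (`ErrorCollector`, `addError`, `flushToDatabase`) are made up for the example, and the real flush would execute one batched INSERT instead of just counting rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of option 1: accumulate validation errors in memory, flush once at the end.
class ErrorCollector {
    // record ID -> all validation errors found for that record
    private final Map<Integer, List<String>> errorsByRecord = new HashMap<>();

    void addError(int recordId, String message) {
        errorsByRecord.computeIfAbsent(recordId, k -> new ArrayList<>()).add(message);
    }

    // In the real system this would build one multi-row INSERT (or a JDBC batch);
    // here it just returns how many rows that single insert would contain.
    int flushToDatabase() {
        int rows = 0;
        for (List<String> errors : errorsByRecord.values()) {
            rows += errors.size();
        }
        errorsByRecord.clear();
        return rows;
    }
}
```

The trade-off to keep in mind is that everything sits in memory until the end, so for very large files the HashMap itself can grow large.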

2> Fire SQL inserts periodically, i.e. for X records in total, process Y <= X records at a time, perform one insert operation for that chunk, and then continue processing the remaining records.
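Option 2 could look something like the sketch below, which flushes every Y accumulated errors. Again this is only illustrative: the actual batched INSERT is stubbed out, and the names (`PeriodicFlusher`, `batchSize`, `flushCount`) are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of option 2: flush accumulated errors every batchSize entries
// instead of per-error or all-at-once.
class PeriodicFlusher {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    int flushCount = 0;  // how many INSERT round trips were fired

    PeriodicFlusher(int batchSize) {
        this.batchSize = batchSize;
    }

    void log(String errorMessage) {
        pending.add(errorMessage);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    void flush() {
        if (pending.isEmpty()) return;
        // Real system: execute one batched INSERT for all pending rows here.
        pending.clear();
        flushCount++;
    }
}
```

For 20,000 records with, say, Y = 1,000, this would mean roughly 20 insert round trips instead of one per error message, while keeping memory usage bounded by Y. Remember to call flush() once more after the loop to write the final partial batch.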

We really do not have set criteria at this point, other than definitely improving performance.

Can everyone please provide feedback on what would be an efficient logging system design, and whether there are better approaches than the ones I mentioned above?

  • Do you need to store the errors in the database? If it is not a requirement, how about dumping the errors to a file? Commented Oct 27, 2011 at 7:11
  • That is a good approach. Do you think my above approaches would make a significant impact on performance, i.e. firing the SQL query once rather than N times? Commented Oct 27, 2011 at 13:57
  • Are you sure the inserts are causing the bottleneck? Commented Oct 27, 2011 at 18:49

2 Answers


I would guess your problems are due to the fact that you are doing row-based operations rather than set-based ones.

A set-based operation would be the quickest way to load the data. If that is not possible, I would go with inserting X records at a time, as it is more scalable; inserting them all at once would require ever-increasing amounts of memory (but would probably be quicker).

There is a good discussion of this on Ask Tom: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1583402705463




Instead of memorizing every error in a HashMap, you could try (provided the JDBC driver and DBMS support it) to batch all those insert statements together and fire them at the end. Somewhat like this:

// Prepare the insert once, then queue one batch entry per error message
PreparedStatement ps = connection.prepareStatement("INSERT INTO table (...) values (?, ?, ...)");
for(...) {
   ps.setString(1, ...);
   ...
   ps.addBatch();   // queue this row instead of executing it immediately
}
int[] results = ps.executeBatch();   // one round trip for all queued rows
