
I have a big list of over 20,000 items to be fetched from a DB and processed daily in a simple console-based Java app.

What is the best way to do that? Should I fetch the list in small sets and process them, or should I fetch the complete list into an array and process it? Keeping it in an array means a huge memory requirement.

Note: There is only one column to process.

Processing means I have to pass the string in that column somewhere else as a SOAP request. The 20,000 items are strings of length 15.

3 Answers


It depends. 20,000 is not really a big number. If you are only processing 20,000 short strings or numbers, the memory requirement isn't that large. But if it's 20,000 images, that's a bit larger.

There's always a tradeoff. Multiple chunks of data mean multiple trips to the database, but a single trip means more memory. Which is more important to you? Also, can your data be chunked, or do you need, for example, record 1 to be available in order to process record 1000?

These are all things to consider. Hopefully they help you decide which design is best for you.
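If the data can be chunked as described above, the batching itself is straightforward. A minimal sketch (the class and method names are illustrative, not from the question): split the full set of items into fixed-size batches so each batch can be fetched and processed independently.

```java
import java.util.ArrayList;
import java.util.List;

public class Batcher {
    /** Splits a list into consecutive batches of at most batchSize items. */
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            // subList returns a view; copy it if the source list will change later
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }
}
```

Each batch could then be sent off as a group of SOAP requests; the batch size becomes the knob for trading memory against round trips.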


4 Comments

20,000 strings of length 15 each. Multiple trips are not an issue. The data can be chunked, as all items are independent.
At 16 bits (2 bytes) * 15 chars per string * 20,000 strings, that's only about 600 KB.
@AkhilKNambiar the data size in your case is not big enough to sweat about. Just put it in an appropriate data structure, e.g. an ArrayList. I would rather avoid multiple trips in your case.
@JeffStorey - it's a bit more than that, since each String comprises two Java objects, complete with headers and private fields. Still, it should fit into a default-sized heap with no problems.
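The back-of-envelope estimate from the comments can be checked in a couple of lines (this counts only the UTF-16 character payload, not the per-object headers the last comment mentions):

```java
public class MemEstimate {
    public static void main(String[] args) {
        int strings = 20_000;
        int chars = 15;
        // 2 bytes per char in a String's backing array (pre-compact-strings layout)
        long charBytes = (long) strings * chars * 2;
        System.out.println(charBytes / 1024 + " KB"); // well under a default heap
    }
}
```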

Correct me if I am wrong, but I would fetch it little by little, and also provide a rollback operation for it.



If the job can be done at the database level, I would do it using SQL scripts. Should this be impossible, I recommend loading small pieces of your data with two columns: the ID column and the column that needs to be processed.

This will give you better performance during processing, and if anything crashes you will not lose all the processed data. In a crash scenario, however, you need to know which datasets have been processed and which have not; this can be done using a third column or by saving the last processed ID each round.
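The "save the last processed ID" approach above can be sketched as keyset-style paging: fetch the next batch of rows with IDs greater than the last one handled, process them, and record the new resume point. In this sketch the DB fetch is abstracted behind a function so the resume logic stands alone; all names (`Row`, `fetchAfter`, the sink standing in for the SOAP call) are illustrative, not from the question.

```java
import java.util.List;
import java.util.function.BiFunction;

public class ResumableRunner {
    /** One fetched row: its ID and the single column to process. */
    public record Row(long id, String payload) {}

    /**
     * Processes rows in batches and returns the last ID handled.
     * fetchAfter(lastId, n) stands in for a query like
     * "SELECT id, payload FROM t WHERE id > ? ORDER BY id LIMIT n".
     */
    public static long run(BiFunction<Long, Integer, List<Row>> fetchAfter,
                           int batchSize, long lastDoneId, List<String> sink) {
        List<Row> batch;
        while (!(batch = fetchAfter.apply(lastDoneId, batchSize)).isEmpty()) {
            for (Row r : batch) {
                sink.add(r.payload());   // stand-in for sending the SOAP request
                lastDoneId = r.id();     // persist this value to survive a crash
            }
        }
        return lastDoneId;
    }
}
```

After a crash, restarting with the persisted `lastDoneId` picks up exactly where processing stopped, which is the crash-safety property the answer is describing.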
