
I am working on an application which reads in a huge amount of data from a database into a Map<String,Map<String,Map<String,String>>>, processes it, and writes the processed reports to a spreadsheet using an in-house xml writer. The whole run can take about 12 hours.

I'm finding I'm getting

Exception in thread "CursorController-Thread-0" java.lang.OutOfMemoryError: Java heap space
    at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:45)
    at java.lang.StringBuilder.<init>(StringBuilder.java:68)

when I attempt to write this jumbo file. For this reason, I think it would be best to write each Map<String,Map<String,String>> (notice that's one layer deeper) as it finishes processing.

My question is, how can I make sure that the Map<String,Map<String,String>> is not retained in memory after I write it, since the outer Map<String,Map<String,Map<String,String>>> will still contain it?
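One way to sketch this, with hypothetical names (`reports` and `writeReport` stand in for the real map and the in-house XML writer), is to iterate over the outer map and drop each inner map via the iterator as soon as it has been written:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class IncrementalWriter {
    public static void main(String[] args) {
        Map<String, Map<String, Map<String, String>>> reports = new HashMap<>();
        reports.put("reportA", new HashMap<>());
        reports.put("reportB", new HashMap<>());

        // Write each inner map, then remove it from the outer map immediately,
        // so it becomes eligible for garbage collection.
        Iterator<Map.Entry<String, Map<String, Map<String, String>>>> it =
                reports.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Map<String, Map<String, String>>> entry = it.next();
            writeReport(entry.getKey(), entry.getValue());
            it.remove(); // safe removal during iteration
        }

        System.out.println(reports.isEmpty()); // prints true
    }

    // Stand-in for the in-house XML writer.
    static void writeReport(String name, Map<String, Map<String, String>> data) {
    }
}
```

Using `Iterator.remove()` (rather than `Map.remove()` inside the loop) avoids a `ConcurrentModificationException` while iterating.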

5 Answers


Once you're done with the Map<String,Map<String,String>> mapped to by the key "key" you simply do

hugeMap.remove("key");

This will "null" out the entry in the hugeMap and make the Map<String,Map<String,String>> eligible for garbage collection (i.e., it can no longer contribute to a heap-space out-of-memory error).
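A minimal sketch of this answer (the name `hugeMap` comes from the answer; the write step is a placeholder):

```java
import java.util.HashMap;
import java.util.Map;

public class RemoveAfterWrite {
    public static void main(String[] args) {
        Map<String, Map<String, Map<String, String>>> hugeMap = new HashMap<>();
        hugeMap.put("key", new HashMap<>());

        // ... write the report for "key" to the spreadsheet here ...

        // Drop the outer reference; the inner map is now GC-eligible
        // (assuming nothing else still references it).
        hugeMap.remove("key");

        System.out.println(hugeMap.containsKey("key")); // prints false
    }
}
```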


1 Comment

Just what I was looking for! Thanks for such a quick answer.

I would choose a different solution for this kind of problem. 12 hours for processing the source data is heavy.

Have you considered any scalable solutions? For e.g. Hadoop?

4 Comments

Another solution might be SpringBatch.
I'd love to use a solution like that but this is a report a single mid level manager wants. The entire department only has ~20 computers and they're taken up by engineering work. You are 100% right though that a more scalable solution than a single workstation processing tens of millions of rows is best! Just not possible for me.
You don't really need that much computer. The MapReduce approach could help also.
I'll look into that more and consider shifting long-term development in that direction. Thanks for the tip. Unfortunately, a lot of the processing time comes from an unindexed 18-million-row table!

Use the map.remove(key) method on your outer Map<String,Map<String,Map<String,String>>>. You can call System.gc(); from time to time to force garbage collection.
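A sketch of what this answer suggests (names are illustrative; note, as the comment below points out, that `System.gc()` is only a hint to the JVM, not a guarantee):

```java
import java.util.HashMap;
import java.util.Map;

public class RemoveAndHintGc {
    public static void main(String[] args) {
        Map<String, Map<String, Map<String, String>>> outer = new HashMap<>();
        outer.put("batch1", new HashMap<>());

        outer.remove("batch1"); // drop the reference to the processed batch
        System.gc();            // a hint only; the JVM may ignore it

        System.out.println(outer.size()); // prints 0
    }
}
```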

2 Comments

System.gc won't force garbage collection. It merely hints to the JVM that it may be a good idea to run GC.

You can keep the written Map<String,Map<String,String>> in the outer map if you want to preserve your structure, but you should probably clear its contents so that it is empty. Also, make sure that while you are processing and writing it, you don't keep references to its members (mappings) anywhere before clearing the contents. See the following post for picking the approach that suits your needs best: Using .clear() or letting the GC take care of it
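A sketch of the clear-instead-of-remove approach this answer describes (names are hypothetical): the outer map keeps its key, but the inner map's contents are released for garbage collection.

```java
import java.util.HashMap;
import java.util.Map;

public class ClearInsteadOfRemove {
    public static void main(String[] args) {
        Map<String, Map<String, String>> inner = new HashMap<>();
        inner.put("row1", new HashMap<String, String>().toString() != null
                ? new HashMap<String, String>().toString() : "");

        Map<String, Map<String, Map<String, String>>> outer = new HashMap<>();
        Map<String, Map<String, String>> section = new HashMap<>();
        section.put("row1", new HashMap<>());
        outer.put("section", section);

        // ... write the "section" report here ...

        // Keep the structure, but free the contents of the inner map.
        section.clear();

        System.out.println(outer.containsKey("section"));   // prints true
        System.out.println(outer.get("section").isEmpty()); // prints true
    }
}
```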

Comments


You can't.

The garbage collector runs whenever it likes and frees whatever it likes.

That said, it is worth trying this: after you delete all references to the data you no longer need, call System.gc().

Anyway, you have written that the out-of-memory error occurs while writing the data. Maybe you have a memory leak there.

Comments
