
I have a simple Object:

public class ActVO
{
   private Long     mFromId;
   private Long     mToId;
   private int      mType;
}

It is persisted to an Oracle DB, which currently holds around 1 million rows. I want to read all of these rows into memory and cache them in a HashMap, using mFromId as the key (roughly like the loading loop sketched below).

My problem is that I get an OutOfMemoryError after reading about 400 thousand rows (the start-up heap is already set to 1G). I tried both cern.colt.map.OpenLongObjectHashMap and Sun's HashMap, and both hit the same problem.

Can anybody suggest a third-party Map API, or another approach, that avoids this problem?
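
For reference, the loading code presumably looks something like the sketch below. The table and column names (ACT, FROM_ID, TO_ID, TYPE) and the getters/setters are assumptions for illustration, since the class above only shows its fields; the point is that every put boxes the long key into a Long and allocates a Map.Entry on top of each ActVO.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.HashMap;
import java.util.Map;

public class ActCacheLoader {

  // Table and column names below are assumptions for illustration only.
  private static final String SQL = "SELECT FROM_ID, TO_ID, TYPE FROM ACT";

  public static Map<Long, ActVO> loadAll(Connection conn) throws Exception {
    Map<Long, ActVO> cache = new HashMap<Long, ActVO>();
    PreparedStatement ps = conn.prepareStatement(SQL);
    ResultSet rs = ps.executeQuery();
    while (rs.next()) {
      ActVO vo = new ActVO();
      vo.setFromId(rs.getLong(1));   // hypothetical setters; the class above shows only fields
      vo.setToId(rs.getLong(2));
      vo.setType(rs.getInt(3));
      cache.put(vo.getFromId(), vo); // boxes the key into a Long and adds a Map.Entry per row
    }
    rs.close();
    ps.close();
    return cache;
  }
}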

  • Why the need to cache 1mm rows of data? Commented Mar 12, 2012 at 15:45
  • Increase your permGen space? @tbone asks a good question; this won't scale at all, so maybe consider just querying what you need (a minimal sketch of that approach follows these comments). Commented Mar 12, 2012 at 15:46
  • My program needs to cache 1mm rows. I have already increased it to 512M, but I really cannot add more. Commented Mar 12, 2012 at 15:46
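
If the whole table does not actually need to live in memory, the "just query what you need" suggestion above could look roughly like this sketch, using the same assumed table/column names and hypothetical setters as the loading sketch above:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ActDao {

  // Looks up a single row by mFromId on demand instead of caching the whole table.
  public static ActVO findByFromId(Connection conn, long fromId) throws Exception {
    PreparedStatement ps =
        conn.prepareStatement("SELECT TO_ID, TYPE FROM ACT WHERE FROM_ID = ?");
    try {
      ps.setLong(1, fromId);
      ResultSet rs = ps.executeQuery();
      if (!rs.next()) {
        return null;              // no row for this id
      }
      ActVO vo = new ActVO();
      vo.setFromId(fromId);       // hypothetical setters, as in the sketch above
      vo.setToId(rs.getLong(1));
      vo.setType(rs.getInt(2));
      return vo;
    } finally {
      ps.close();                 // closing the statement also closes its ResultSet
    }
  }
}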

2 Answers


It will be impossible to keep that many objects in memory when they cannot fit in the available heap. There are two solutions. The first is to use a cache that spills objects which do not fit in memory to a local file, something like Ehcache (a minimal sketch of that approach follows the example below). The second is to stop using objects and switch to a two-dimensional array:

long[][] cache = new long[1000*1000][];
long[] row = new long[2];

row would hold mToId and mType. Rows would be inserted into cache using mFromId as index.

Here's an example:

// Wrapped in a class and main method only to make the snippet self-contained.
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class TwoDimCacheDemo {

  static class ActVO {
    private long mFromId;
    private long mToId;
    private int  mType;
  }

  public static void main(String[] args) {
    Random r = new Random();

    int capacity = 1000 * 1000;

    // Simulate the database result set.
    List<ActVO> resultSet = new ArrayList<ActVO>();
    for (int i = 0; i < capacity; i++) {
      ActVO element = new ActVO();
      element.mFromId = i;
      element.mToId = r.nextLong();
      // let's say there are no more than 10 types
      element.mType = r.nextInt(10);
      resultSet.add(element);

      if (i == 57) {
        System.out.printf("       db result 57: mToId=%d, mType=%d\n", element.mToId, element.mType);
      }
    }

    long[][] cache = new long[capacity][];

    // iterating through the database result set
    for (ActVO element : resultSet) {
      long[] row = new long[2];
      row[0] = element.mToId;
      row[1] = element.mType;
      cache[(int) element.mFromId] = row;
    }

    System.out.printf("57th row from cache: mToId=%d, mType=%d\n", cache[57][0], cache[57][1]);
  }
}

3 Comments

But even with that many objects, since a single object includes only a couple of longs it should use very little memory. I used an ArrayList instead; tested with up to 2mm rows it still had no problem.
Yeah, but the solution with a two-dimensional array allows fast look-up, using the index as a key.
I don't understand the two-dimensional array. Here cache is an array; how can I use mFromId as the index?

My suggestions would be:

  • Don't use Long when you mean long. It can be 5x larger in memory.
  • I would suggest you use TLongObjectHashMap, which will make storing the key more efficient (a minimal sketch follows below).

This will use about 64 bytes per entry. If you need a more compact form, you can do that, but with increasing complexity. You won't do better than 20 bytes per entry.
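
TLongObjectHashMap is not part of the JDK; it comes from the GNU Trove library. A minimal sketch against the Trove 3.x API follows (older releases used the gnu.trove package, and as the comments below note you may need an old version for JDK 1.4); the nested value class here is a trimmed-down stand-in for ActVO with primitive fields:

import gnu.trove.map.hash.TLongObjectHashMap;

public class TroveCacheSketch {

  // Value object with primitive fields, per the first suggestion (long, not Long).
  static class ActVO {
    long mToId;
    int  mType;
  }

  public static void main(String[] args) {
    // Keys are stored as primitive longs, so there is no Long boxing and no Map.Entry per key.
    TLongObjectHashMap<ActVO> cache = new TLongObjectHashMap<ActVO>(1000 * 1000);

    ActVO vo = new ActVO();
    vo.mToId = 12345L;
    vo.mType = 3;
    cache.put(57L, vo);             // mFromId as the primitive key

    ActVO hit = cache.get(57L);     // returns null on a miss
    System.out.println("mToId=" + hit.mToId + ", mType=" + hit.mType);
  }
}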

4 Comments

Thanks for your reply. What is TLongObjectHashMap? Is it provided by the JDK? I am using JDK 1.4. But even with that many objects, since a single object includes only a couple of longs it should use very little memory. I think this issue might be caused by the index the HashMap keeps over its keys.
I suggest you google TLongObjectHashMap; it will answer your questions. You should be able to find a version old enough to support Java 1.4, but you might have to go back quite a few years. Using long takes 8 bytes; using Long can take 24-32 bytes, they are not the same at all. HashMap doesn't have an indexOf method, and its get() doesn't use any memory. But you are right that HashMap could be consuming more memory than your objects; that is why I estimated 64 bytes per entry on a 32-bit JVM (possibly 80 bytes on a 64-bit JVM).
Thanks, but I used an ArrayList instead; tested with up to 2mm rows it still had no problem. I guess that HashMap needs an index over all its keys to check whether a key has already been used, and that consumes a lot of memory.
It needs an additional Map.Entry for every entry. Also its key has to be a Long instead of a long. TLongObjectHashMap has neither of these issues.
