
I am implementing a tree-like structure using the Map interface like the following declaration:

Map<String, Map<String, Map<Integer, Double>>>

Currently I am using the HashMap implementation. After loading a huge amount of data, I am seeing the program consume 4GB of RAM. On persisting the whole entity using the Serializable interface, the resulting file's size is just 1GB.

What is the most memory-efficient Map implementation that I could use here?

  • Is using maps the right solution? Shouldn't you use a List<FirstLevelNode>, with FirstLevelNode holding a List<SecondLevelNode>, and SecondLevelNode holding a List<ThirdLevelNode>? Commented Nov 25, 2012 at 14:33
  • Won't using a list affect retrieval performance? I am fine with a longer load time, but retrieval time is what I am trying to save here. Commented Nov 25, 2012 at 15:01
  • Maybe. We don't know what you're doing with your tree. Commented Nov 25, 2012 at 15:05
  • It is strange to call this structure a tree. It is indeed tree-shaped, assuming that none of the values in the map are coupled with more than one key. Otherwise, you'd have a graph. In order to give you the best answer, you need to describe the access pattern for this structure. Do you usually have two strings and an integer in hand for which you want to find the corresponding double value? Or do you need to grab subtrees (say, given just the first string) and pass those around as well? Restated: Is this really a mapping from a composite key (a tuple of two strings and an integer) to a double? Commented Nov 25, 2012 at 15:06
  • All I want is to map a (String,String,Integer) -> Float. As there is a large volume of such data, it's very important to achieve the most efficient method here. Commented Nov 25, 2012 at 16:02

2 Answers


If you want to map a (String,String,Integer) to a Float, then the best thing to do is to use a Map<MyKey, Float>, where MyKey would be defined like this:

import java.util.Objects;

// Immutable composite key: (String, String, Integer)
public final class MyKey {
    private final String a;
    private final String b;
    private final Integer c;

    public MyKey(String a, String b, Integer c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    // getters, if needed

    @Override
    public int hashCode() {
        return Objects.hash(a, b, c);
    }

    @Override
    public boolean equals(Object o) {
        if (o == this) {
            return true;
        }
        if (!(o instanceof MyKey)) {
            return false;
        }
        // Compare against the cast reference; the fields are not visible on Object
        MyKey other = (MyKey) o;
        return Objects.equals(a, other.a)
               && Objects.equals(b, other.b)
               && Objects.equals(c, other.c);
    }
}
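For illustration, here is a minimal sketch of how such a key might back a single flat map in place of the three nested ones. FlatTable and its nested Key are hypothetical names, not part of the question, and the key here holds a primitive int rather than an Integer, which is my own assumption about what the data allows.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical wrapper around one flat map: (String, String, int) -> Float
public final class FlatTable {

    // Same idea as MyKey, but with a primitive int field (an assumption)
    private static final class Key {
        private final String a;
        private final String b;
        private final int c;

        Key(String a, String b, int c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }

        @Override
        public int hashCode() {
            return Objects.hash(a, b, c);
        }

        @Override
        public boolean equals(Object o) {
            if (o == this) {
                return true;
            }
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return c == other.c
                   && Objects.equals(a, other.a)
                   && Objects.equals(b, other.b);
        }
    }

    private final Map<Key, Float> values = new HashMap<>();

    public void put(String a, String b, int c, float value) {
        values.put(new Key(a, b, c), value);
    }

    // Returns the stored value, or null if the key is absent
    public Float get(String a, String b, int c) {
        return values.get(new Key(a, b, c));
    }

    public int size() {
        return values.size();
    }
}
```

Each lookup allocates a short-lived Key, so whether this is acceptable depends on how hot the retrieval path is; that would need measuring.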

6 Comments

That is a correct way to do it, but it does not address the OP's question as to which way is most efficient in terms of memory consumed by the structure. Here we add the overhead of four object headers per key. There's another design that would use just two headers per key: a dummy type wrapped around a byte array.
+1 for this solution over @seh's. The real issue is avoiding the overhead incurred by the three nested maps; fancier approaches require significantly more work for minimal benefit.
Agreed. But it would already be a lot more efficient than maps of maps of maps. I would go with a clear and simple solution first, and see if it needs additional optimization only after.
Well, sure, but the OP didn't ask whether he should care about such things. He said that he does, and I take him at his word that he'd like to learn more about storage overhead through our answers (and, in this case, our probing questions too).
@seh - I can't see how this is a solution. Here you are mapping (String,String,Integer) to a unique hash code of integer type. This might not be possible in my case, as there is a huge volume of such data and 2^32 integers won't be able to represent them all. My volume of data would be much higher than that.
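To make the byte-array idea from the first comment concrete, here is a rough sketch of what such a key could look like. PackedKey and its encoding (a 4-byte length prefix for the first string, UTF-8 bytes, then the int) are my own assumptions, not something the commenter specified. It also speaks to the concern about hash codes: hashCode() does not have to be unique, because HashMap falls back on equals() to tell colliding keys apart (for example, "Aa" and "BB" share a hash code, yet both can live in the same map).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical key: one wrapper object plus one byte[] (two object headers)
public final class PackedKey {
    private final byte[] bytes;

    private PackedKey(byte[] bytes) {
        this.bytes = bytes;
    }

    // Assumed layout: [length of a][bytes of a][bytes of b][c]
    public static PackedKey of(String a, String b, int c) {
        byte[] ab = a.getBytes(StandardCharsets.UTF_8);
        byte[] bb = b.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + ab.length + bb.length + 4);
        buf.putInt(ab.length).put(ab).put(bb).putInt(c);
        return new PackedKey(buf.array());
    }

    @Override
    public int hashCode() {
        // May collide for different keys; equals() below settles it
        return Arrays.hashCode(bytes);
    }

    @Override
    public boolean equals(Object o) {
        if (o == this) {
            return true;
        }
        if (!(o instanceof PackedKey)) {
            return false;
        }
        return Arrays.equals(bytes, ((PackedKey) o).bytes);
    }
}
```

The length prefix keeps ("xy", "") and ("x", "y") from encoding to the same bytes. The trade-off is that the original strings have to be decoded again if they are ever needed back.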

You have two kinds of maps here. One has String keys and Map values. For that I'd probably use Google Guava's ImmutableMap, if immutability is OK for you. It will probably not save you a lot of memory, but it might save you some, and perform a bit better than a normal HashMap.

For the other Map with Integer keys and Double values, you should use a specialized Map implementation which stores primitives instead of objects. Take a look, for instance, at Trove4j's TIntDoubleHashMap. This will save you a lot of memory, as the keys and values are stored as primitives instead of boxed objects.
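To show why a primitive map is smaller, here is a stripped-down sketch of the general technique: open addressing over parallel int[] and double[] arrays, with no entry objects and no boxing. This is not Trove's implementation, and IntDoubleMap is a hypothetical class written only for illustration; it has no removal support and does not guard against very large capacities.

```java
// Hypothetical minimal int -> double map using linear probing
public final class IntDoubleMap {
    private int[] keys = new int[16];
    private double[] values = new double[16];
    private boolean[] used = new boolean[16];
    private int size;

    // Spread the key bits so that sequential keys do not cluster
    private static int mix(int key) {
        int h = key * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    // Returns the slot holding the key, or the first free slot on its probe path
    private int findSlot(int key) {
        int mask = keys.length - 1;
        int i = mix(key) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    public void put(int key, double value) {
        // Keep the table at most half full so probing always finds a free slot
        if ((size + 1) * 2 > keys.length) {
            resize();
        }
        int i = findSlot(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    // Returns the stored value, or the caller-supplied default if absent
    public double get(int key, double missing) {
        int i = findSlot(key);
        return used[i] ? values[i] : missing;
    }

    public int size() {
        return size;
    }

    // Doubles the arrays and reinserts every existing entry
    private void resize() {
        int[] oldKeys = keys;
        double[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new double[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldUsed[j]) {
                put(oldKeys[j], oldValues[j]);
            }
        }
    }
}
```

In practice I would reach for a maintained library rather than something like this; the sketch is only meant to show where the savings come from.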
