7

I have multiple files which contain key=value string pairs. The keys are the same across the files, but the values differ. Each file can have 1000 or more such pairs.

I want to store each file in a separate hashmap, i.e. Map<KeyString, ValueString>, so if there are five files, there will be five hashmaps.

To avoid duplicating the keys across the hashmaps, is it possible to have each map reference the same key objects? Note that once the keys are added to a map, they will not be deleted.

I considered making the first file the 'base', as in the flyweight pattern: this base would be the intrinsic set of keys/values. The remaining files would be the extrinsic sets of values, but I don't know how to relate those values back to the base (intrinsic) keys without duplicating the keys.

I am open to a simpler/better approach.

1
  • Thank you for the suggestions. I decided to go with String pooling, whether with intern() or manual pooling (or none at all if Java is already interning by default). Thanks again. Commented Aug 10, 2017 at 19:12

4 Answers

2

I can think of a simpler approach. Instead of Map<String, String>, consider Map<String, List<String>>, or directly Multimap<String, String> from Guava.

If every key appears in every file and always has a value, you could store the value from the first file at index 0, the value from the second file at index 1, and so on.
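The index-per-file idea could be sketched roughly as follows. The class name `MultiFileValues`, the method `merge`, and the sample keys are made up for illustration, and the files are assumed to be parsed into maps already. `Map.of` and `List.of` need Java 9 or later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MultiFileValues {
    /**
     * Merges already-parsed files into one map. The list stored under each key
     * holds that key's value from file 0, then file 1, and so on.
     * Assumption: every file contains every key; a missing key would shift
     * the indexes of the later files for that key.
     */
    static Map<String, List<String>> merge(List<Map<String, String>> files) {
        Map<String, List<String>> merged = new HashMap<>();
        for (Map<String, String> file : files) {
            for (Map.Entry<String, String> e : file.entrySet()) {
                // Only one key instance is kept per distinct key text.
                merged.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> first = Map.of("host", "alpha", "port", "80");
        Map<String, String> second = Map.of("host", "beta", "port", "8080");
        Map<String, List<String>> merged = merge(List.of(first, second));
        System.out.println(merged.get("host")); // value from file 0, then file 1
    }
}
```

With this layout each key is stored once, at the cost of having to remember which list index belongs to which file.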

If that wouldn't work, I recommend a Collection<Map<String, String>>, so you're able to iterate through your Maps. Then, when you want to add a value to one of the Maps, go through all the keySets and, if one of them contains that key, call put with the key object returned from that keySet.

Another solution would be to keep a HashSet of the keys that have already been put. This would be more efficient.
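One caveat with a plain HashSet is that the Set interface has no method that hands back the instance it already stores. A minimal sketch of the same idea, assuming a Map<String, String> that maps each key to itself is acceptable instead, might look like this (`KeyPool` and `canonical` are hypothetical names):

```java
import java.util.HashMap;
import java.util.Map;

final class KeyPool {
    // Maps each key to itself so the stored instance can be retrieved later.
    private final Map<String, String> pool = new HashMap<>();

    /** Returns the first instance seen for this key text, storing it if it is new. */
    String canonical(String key) {
        String existing = pool.putIfAbsent(key, key);
        return existing != null ? existing : key;
    }

    public static void main(String[] args) {
        KeyPool pool = new KeyPool();
        // new String(...) forces two distinct objects with equal contents.
        String a = pool.canonical(new String("timeout"));
        String b = pool.canonical(new String("timeout"));
        System.out.println(a == b); // true: both refer to the first instance
    }
}
```

Each file's map would then call `map.put(pool.canonical(key), value)`, so all maps end up referencing the same key objects.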


1

After reading in the keys, you can use String.intern(). When called, what it does is either:

  • add the String to the internal pool if it didn't exist already;
  • return the equivalent String from the pool if it already existed.

String#intern Javadoc
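As a rough illustration of the behaviour described above, the sketch below interns the key part of each line. `InternDemo` and `parseKey` are hypothetical names, and every line is assumed to contain an `=`.

```java
final class InternDemo {
    /**
     * Extracts the key from a "key=value" line and interns it.
     * Assumption: the line contains '='; otherwise substring would throw.
     */
    static String parseKey(String line) {
        int eq = line.indexOf('=');
        // substring creates a fresh String; intern() swaps it for the pooled one.
        return line.substring(0, eq).intern();
    }

    public static void main(String[] args) {
        String k1 = parseKey("timeout=30"); // e.g. a line from the first file
        String k2 = parseKey("timeout=45"); // the same key from the second file
        System.out.println(k1 == k2); // true: both are the pooled instance
    }
}
```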

5 Comments

Nice catch! Didn't know about it!
@kewne plz don't... this will be a nightmare to debug in case something goes wrong. generally intern is highly discouraged
@Eugene I agree that intern should be used in special cases but this seems to be it. Why isn't it appropriate here?
@kewne it's not here, it's potentially everywhere; besides if you really want a single instance - use an enum (like I said in an answer). But if you still want that - declaring as private static final literal would place it in the pool without the need to intern.
@Eugene Using an Enum/constant requires coding for each of the 1000 plus keys. It's not clear from the question that the OP actually wants to do matching on the keys, just that he wants to use flyweights. Using a constant puts the constant in the pool; even if the strings read from the files are equivalent, they are not in the pool themselves. To get the same behavior, you'd have to implement logic equivalent to intern. It's still not clear to me why intern is inappropriate here and what problems it could cause.
1

First of all, I don't see the problem with storing multiple instances of your String keys. 5 HashMaps * 1000 keys is a very small number, and you shouldn't have memory issues.

That said, if you still want to avoid duplicating the Strings, you can create the first HashMap, and then use the exact same key instances for the other HashMaps.

For example, suppose map1 is the first HashMap and it is already populated with the contents of the first file.

You can write something like this to populate the 2nd HashMap:

for (String key : map1.keySet()) {
    map2.put (key, someValue);
} 

Of course you will have to find for each key of the first map the corresponding value of the second map. If the keys are not stored in the same order in the input files, this may require some preliminary sorting step.
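One possible way to do that lookup, sketched under the assumption that each later file is first parsed into a temporary map that is discarded afterwards (this replaces the sorting step with a hash lookup). `SharedKeys` and `withSharedKeys` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

final class SharedKeys {
    /**
     * Builds a map for a later file that reuses the key instances of the base map.
     * `parsed` is a temporary map read from that file; once it is garbage
     * collected, its duplicate key Strings go with it.
     */
    static Map<String, String> withSharedKeys(Map<String, String> base,
                                              Map<String, String> parsed) {
        Map<String, String> result = new HashMap<>();
        for (String key : base.keySet()) {
            String value = parsed.get(key);
            if (value != null) {
                // `key` is the very object held by the base map.
                result.put(key, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> map1 = new HashMap<>();
        map1.put("timeout", "30");
        Map<String, String> parsedFile2 = new HashMap<>();
        parsedFile2.put(new String("timeout"), "45");
        Map<String, String> map2 = withSharedKeys(map1, parsedFile2);
        System.out.println(map2.get("timeout"));
    }
}
```

The temporary map still duplicates the keys briefly while a file is being loaded, so this trades a short-lived allocation for not having to sort the input.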

1 Comment

or an Enum as the Key... in case he really wants that
0

Perhaps you could hold a static Map<> to map your keys to unique Integers and use those Integers for the keys to your map?

Something like:

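import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class KeySharedMap<K, V> {
    // The next id to hand out. An atomic gives a thread-safe auto-increment.
    static final AtomicInteger next = new AtomicInteger(0);
    // Static mapping of keys to unique Integers, shared by every instance.
    static final ConcurrentMap<Object, Integer> keys = new ConcurrentHashMap<>();
    // This instance's values, indexed by the Integer assigned in `keys`.
    final Map<Integer, V> map = new HashMap<>();

    public V get(Object key) {
        // A key that was never put has no id, so it cannot be in this map.
        Integer id = keys.get(key);
        return id == null ? null : map.get(id);
    }

    public V put(Object key, V value) {
        // Assign a unique Integer the first time a key is seen, then reuse it.
        Integer id = keys.computeIfAbsent(key, x -> next.getAndIncrement());
        // Store the value under that id in this instance's map.
        return map.put(id, value);
    }
}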

Yes, I realise that K is not used here but I suspect it would be necessary if you wish to implement Map<K,V>.
