4

What is the best bet when a scala Map is created and used in a single thread? (Like the java analogous best bet on StringBuffer compared with StringBuilder for constructing strings).

The kind of constraints to choose the Map are:

  • the map is created once with multiple additions/updates on possible already existing keys
  • nothing is deleted from map - so this kind of operation might not be needed
  • map could have thousands of values (not so big)
  • the usage is from the same thread (so there is no concern about accessing/updating the map in parallel)
  • the signature of map might be Map[String,T]. In case there is a reason a Map[Int/Long,T] could be used.

I investigated

  • collection.immutable.Map (that initially works optimized for few keys)
  • collection.immutable.HashMap
  • collection.mutable.OpenHashMap
  • collection.mutable.HashMap

The test showed that there is no clear winner with 50000 keys. I found some

Nevertheless the question is what is the safest bet in general on such circumstances and why?

1
  • 2
    Unless you have some reasonably specific workload in mind or you write for some environment with specific constraints (low memory, only one processor) it sounds like your research has already turned up the fact that in the general case any of the Map implementations will work well enough that they won't be a bottleneck. How would you define "best"; smallest eC, least memory for N keys, smallest amount of garbage generated for inserts, fewest context switches when performing operations A -> B -> C, some combination of the above or something else entirely? Commented Mar 15, 2015 at 19:56

1 Answer 1

9

If your map is not extremely tiny and not huge, and your key is a String, then collection.mutable.AnyRefMap is a good bet. collection.mutable.LongMap is even faster if you can have Long keys. The reason they exist is precisely to be fast for common use cases.

If most maps are incredibly tiny (0-4 elements), then LinkedHashMap tends to be best because it avoids the overhead of a hash table. (Immutable maps are also not bad when 4 or fewer elements.)

If maps are really huge (100s of millions of key/value pairs) then the standard collection.mutable.HashMap is the way to go because performance degrades slightly more gracefully as you run out of space for separate keys.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.