What structures to use to create constant access time immutable table in Scala?

Question

Does Scala have any sort of classes that can deal with this? I was considering using an Array[Array[_]], since it would fit my needs quite well.

What other options do I have? Using a Map[(Int,Int),_] will be too slow, I believe, since it has linear access time.

Well, HashMap (Map[ ] ) has constant time if you have a good hash. — anrizal - Anwar Rizal
– anrizal - Anwar Rizal, Commented Aug 5, 2011 at 11:12
The default Map has constant time access, because it is based on a hash table. Do not presume on a collection performance -- either test, or look up the documentation available on it. — Daniel C. Sobral
– Daniel C. Sobral, Commented Aug 5, 2011 at 23:37

Kipton Barros · Accepted Answer · 2011-08-05 17:30:44Z

4

Access in Map implementations (besides ListMap) is certainly not linear (very small maps, with less than 4 elements, do indeed linear search, but that is a very fast implementation for so small a map, that is O(N) with a very small N, and a very small k factor too. But when the table grows, a different algorithm is used, asymptotically much faster, typically O(log(N)) or O(1).

Yet, if what you need is a table/matrix, in the sense that it contains data for keys (i,j) satisfying 0 <= i < m, 0 <= j < n, (starting at 1 would be fine too) a general map may not be the best choice. All the more if the table is full, that is all (i,j) have values. Then something Array based, either Array[Array[_]] or even a single array could be much better. If you want something that specialized, maybe you will not want to implement the full Map interface.

edited Aug 5, 2011 at 17:30

Kipton Barros

21.1k4 gold badges70 silver badges81 bronze badges

answered Aug 5, 2011 at 11:15

Didier Dupont

29.6k7 gold badges73 silver badges90 bronze badges

Sign up to request clarification or add additional context in comments.

5 Comments

Nicolas Over a year ago

No, access to a Map is not "certainly not linear".

Kipton Barros Over a year ago

Hi didierd, I like your answer and expanded on it a little. One comment: mathematically, it doesn't make sense to say O(N) for small N, because the big-O asymptotic notation is only meaningful when N is large, and we're always ignoring constant pre-factors. In a sense, all performance characteristics of small data structures can get swallowed in these prefactors.

Didier Dupont Over a year ago

@Nicholas. Maybe I did not say it right. What I meant is that is much better than linear. I'll make a change.

Didier Dupont Over a year ago

@Kipton. Well you a linear search algorithm, which is O(N), and apply it only to small tables. But indeed, the point then is not that it is O(N) it is that it is kxN and fast.

Kipton Barros Over a year ago

@didierd I think we agree, but to clarify my point, Big-O is a precise mathematical notation that only pertains to the limit N goes to infinity. en.wikipedia.org/wiki/Big_O_notation

Brian Agnew · Accepted Answer · 2011-08-05 11:59:03Z

4

HashMap almost do it. This Scala collection performance page can be very useful.

edited Aug 5, 2011 at 11:59

Brian Agnew

273k38 gold badges342 silver badges443 bronze badges

answered Aug 5, 2011 at 11:05

Nicolas

24.8k5 gold badges62 silver badges67 bronze badges

Comments

Kipton Barros · Accepted Answer · 2011-08-05 15:46:10Z

For utmost performance you will probably want to avoid the garbage collector (GC) as much as possible. This means not creating tuple objects (i, j) and not boxing primitives. If your table will be fully populated (not sparse), then your idea of an Array of Arrays is reasonable. For best performance, however, I'd go with didierd's idea and pack everything into a single array, with accessors. Here's a possible implementation:

// specialize A to avoid boxing primitives such as Int, Double, ...
class Table[@specialized A: Manifest](numRows: Int, numCols: Int) {
  val data = new Array[A](numRows*numCols)
  def checkIndices(i: Int, j: Int) {
    require (i >= 0 && i < numRows && j >= 0 && j < numCols)
  }
  def apply(i: Int, j: Int): A = {
    checkIndices(i, j)
    data(i*numCols + j)
  } 
  def update(i: Int, j: Int, x: A) {
    checkIndices(i, j)
    data(i*numCols + j) = x
  } 
}

You can use it naturally like this (note the nice syntax you get from the special apply and update methods):

scala> val table = new Table[Double](2, 2)
table: Table[Double] = Table$mcD$sp@4d22366e

scala> table(0, 0) = 3.0

scala> table(1, 1) = table(0, 0) + 2.0

scala> table(2, 1)
java.lang.IllegalArgumentException: requirement failed
...

On the other hand, if the table is sparse, then it might better to go with a Map to save memory for all the empty entries. Note that although Map has fast update and apply methods (nearly constant) they're still quite a bit slower than array accesses (mostly due to pressure on the GC; Map is not specialized, both the keys and values must be heap allocated).

Collectives™ on Stack Overflow

What structures to use to create constant access time immutable table in Scala?

3 Answers 3

5 Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

5 Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related