
I have a dynamic programming algorithm and I find my Haskell implementation very satisfying, as it allows me to define the master array recursively, like so:

import Data.Array

fill :: Int -> Int -> Entry
fill 0 0 = Entry Origin 0.0
fill 0 j = ...
fill i 0 = ...
fill i j = f m i j

m :: Array (Int, Int) Entry
m = listArray dims [fill i j | (i, j) <- range dims]

where f is a moderately complex function referring back to the main array m. The Entry type is just a Double with a small annotation.

The data itself is quite large, and m ends up having on the order of 100M entries. The algorithm is still blazing fast, but it uses ~25GB RAM in the process. From reading around I understand that this is because it keeps around the full computation of the array entries instead of the final value, but if I switched to unboxed arrays, I could not define them recursively like in the example above.

Is there a way to have a cake (low memory footprint) and eat it (recursive definition)?
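For reference, the lazy-array trick in the question, filled in for a toy problem so it runs as-is. The recurrence (edit distance) is purely illustrative, but `fill`, `m`, and `dims` mirror the question's names. Boxed `Array` is lazy in its elements, so `fill` may freely read other entries of `m`; demanding `m ! (n, k)` forces exactly the entries it depends on:

```haskell
import Data.Array

-- Toy instance of the question's pattern: edit distance, with the
-- table m defined recursively in terms of itself.
editDistance :: String -> String -> Int
editDistance xs ys = m ! (n, k)
  where
    n = length xs
    k = length ys
    dims = ((0, 0), (n, k))
    -- Index the inputs as arrays for O(1) character lookup.
    xa = listArray (1, n) xs
    ya = listArray (1, k) ys

    fill :: Int -> Int -> Int
    fill 0 j = j
    fill i 0 = i
    fill i j
      | xa ! i == ya ! j = m ! (i - 1, j - 1)
      | otherwise        = 1 + minimum [ m ! (i - 1, j)
                                       , m ! (i, j - 1)
                                       , m ! (i - 1, j - 1) ]

    -- The master array, defined in terms of fill, which reads m back.
    m :: Array (Int, Int) Int
    m = listArray dims [fill i j | (i, j) <- range dims]
```

The price of this elegance is exactly the problem described above: every cell of `m` is a pointer to a boxed value (or an unevaluated thunk), not a flat machine word.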

  • According to your numbers, it consumes roughly 250 B per entry. Perhaps you could try improving that by making your Entry type stricter, e.g. by adding strictness annotations and/or unpacking. You can't avoid the boxing of Entry itself and use an unboxed array, but you probably can remove the boxing from Entry's components. How is that type defined? Commented Dec 9, 2018 at 15:38
  • Are you definitely using the entire matrix for your computation, or will there be some unevaluated entries? A lot? Commented Dec 9, 2018 at 23:57
  • Would it be okay with you if you used 25 GB temporarily -- e.g. during a precomputation phase at program startup or perhaps even before -- as long as the algorithm itself uses less during its run? Commented Dec 10, 2018 at 13:02
  • Thanks everyone for your replies. @chi data Entry = Entry Source Double, data Source = Origin | Left | Right | Top | Bottom. So super simple. I believe that unboxing the Double part could help, because that's the recursively defined part. @luqui I'd estimate that maybe half of the entries are actually used, and memory profiling supports that. @DanielWagner I'm curious what you have in mind. In principle, 25 GB at any stage is a bit heavy, since I'd like to eventually run this on smaller machines, but I'm definitely listening. Commented Dec 10, 2018 at 18:56
  • I'd definitely try data Entry = Entry {-# UNPACK #-} !Source {-# UNPACK #-} !Double for such a basic type. Don't forget to turn on optimizations with -O (and to avoid GHCi when measuring performance). Commented Dec 10, 2018 at 19:14
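Concretely, the stricter type suggested in the comments would look like the sketch below (the constructors are the asker's). One caveat worth knowing: GHC only honours {-# UNPACK #-} on strict fields whose type has a single constructor, so the pragma flattens the Double into the Entry closure but is silently ignored on the five-constructor Source:

```haskell
-- The asker's types, tightened per the comments.  The bang on Source
-- avoids a thunk; {-# UNPACK #-} on the Double stores it inline as a
-- raw Double# instead of a pointer to a boxed Double.  An UNPACK on
-- the Source field would be a no-op: GHC can only unpack
-- single-constructor types.
data Source = Origin | Left | Right | Top | Bottom

data Entry = Entry !Source {-# UNPACK #-} !Double

-- Hypothetical accessor, just to exercise the type.
entryVal :: Entry -> Double
entryVal (Entry _ x) = x
```

This shrinks each Entry to a header, a Source pointer, and an inline Double, but every Entry is still one heap object, so it cannot close the whole gap on its own.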

1 Answer


In the end, I got reasonable results with an unboxed mutable array, like so:

import Data.Array.ST as UM
import Control.Monad (forM_)

mbounds = ((0,0), (..., ...))
m = UM.runSTUArray $ do
    arr <- UM.newArray_ mbounds
    let set = UM.writeArray arr
    let get = UM.readArray arr
    let fill (0, 0) = do set (0, 0) 0.0
        fill (i, 0) = do top <- get (i-1, 0)
                         set (i, 0) (f top)
        fill (0, j) = do ...
        fill (i, j) = do ...
    forM_ (UM.range mbounds) fill
    return arr

It still hogs more RAM than I would like (~4 GB), but that's a huge improvement, and I suspect that once I re-run the profiler I'll find ways to cut it down further.
