I'm running into a stack space overflow when running this code (the changes I've already tried are commented out):
{-# LANGUAGE BangPatterns #-}
import System.IO (hFlush, stdout)
import System.Environment (getArgs)
-- import Data.List (foldl')
import qualified Data.Map as Map
-- import qualified Data.Map.Strict as Map
-- import qualified Data.ByteString.Char8 as B
data Trie = Trie { isWord :: Bool, children :: Map.Map Char Trie }
initial :: Trie
initial = Trie False Map.empty
insertWord :: String -> Trie -> Trie
insertWord [] trie = trie { isWord = True }
insertWord (c:cs) trie = trie { children = Map.insert c child $ children trie }
  where
    child = maybe (insertWord cs initial) (insertWord cs)
                  (Map.lookup c (children trie))
-- insertWord :: String -> Trie -> Trie
-- insertWord [] trie = trie { isWord = True }
-- insertWord (!c:(!cs)) trie = trie { children = Map.insert c child $ children trie }
--   where
--     child = let a = maybe (insertWord cs initial) (insertWord cs)
--                           (Map.lookup c (children trie))
--             in seq a a
fromWords :: [String] -> Trie
fromWords = foldr insertWord initial
-- fromWords :: [String] -> Trie
-- fromWords = foldl' (flip insertWord) initial
toWords :: Trie -> [String]
toWords = concatMap results . Map.toList . children
  where
    results (c, t) = (if isWord t then ([c]:) else id)
                     . map (\str -> c:str) $ toWords t
completions :: String -> Trie -> [String]
completions [] trie = toWords trie
completions (c:cs) trie = maybe [] (map (c:) . completions cs)
                                (Map.lookup c $ children trie)
main :: IO ()
main = do
  [prefix] <- getArgs
  dict <- readFile "/usr/share/dict/words"
  mapM_ putStrLn (completions prefix (fromWords $ lines dict))
  -- dict <- B.readFile "/usr/share/dict/words"
  -- mapM_ putStrLn (completions prefix (fromWords $ map (B.unpack) $ B.lines dict))
Output:
$ ./trie abba
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
The output from "+RTS -h": https://i.sstatic.net/5BpU1.png
I can get the code to work if I specify "+RTS -K1G". I'd really appreciate it if someone could point me in the right direction.
With the foldl' approach, you just need to make sure children is forced when a Trie is; i.e. make the children field in Trie strict:

data Trie = Trie { isWord :: Bool, children :: !(Map.Map Char Trie) }

So, a load of Map.insert's were building up without getting evaluated?

"child = seq a a" is a code smell, since it is equivalent to "child = a": it does not make child any stricter. If child is evaluated, then a is forced in both cases. If child is not evaluated, then a is not forced in either case: the extra seq never even gets a chance to run.
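
For reference, here is a minimal sketch of how the strict-field suggestion might fit together with the foldl' version of fromWords. It is an illustration, not a tested fix for the original program. Two things are assumptions on my part: it imports Data.Map.Strict (which the question had commented out) so that inserted subtries are also forced, and it uses a small in-memory word list in place of /usr/share/dict/words so that it runs anywhere.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map  -- assumption: strict map, so inserted values are forced too

-- The bang on children means that forcing a Trie also forces its Map,
-- so pending Map.insert calls cannot pile up as unevaluated thunks.
data Trie = Trie { isWord :: Bool, children :: !(Map.Map Char Trie) }

initial :: Trie
initial = Trie False Map.empty

insertWord :: String -> Trie -> Trie
insertWord []     trie = trie { isWord = True }
insertWord (c:cs) trie = trie { children = Map.insert c child (children trie) }
  where
    -- Extend the existing subtrie for c, or start a fresh one.
    child = maybe (insertWord cs initial) (insertWord cs)
                  (Map.lookup c (children trie))

-- foldl' forces the accumulated Trie after every word.
fromWords :: [String] -> Trie
fromWords = foldl' (flip insertWord) initial

toWords :: Trie -> [String]
toWords = concatMap results . Map.toList . children
  where
    results (c, t) = (if isWord t then ([c] :) else id)
                   . map (c :) $ toWords t

completions :: String -> Trie -> [String]
completions []     trie = toWords trie
completions (c:cs) trie =
  maybe [] (map (c :) . completions cs) (Map.lookup c (children trie))

main :: IO ()
main = mapM_ putStrLn (completions "ab" (fromWords ["abba", "abbey", "cat"]))
-- prints "abba" and "abbey", one per line
```

The strict field alone is what the answer above proposes. Switching to Data.Map.Strict is an extra precaution, since with the lazy Data.Map the values stored in the map can still be unevaluated insertWord calls.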