Hello fellow coders. I decided to rewrite some of my old scripts in Haskell, partly because I need the practice and partly because I like the language. So here I am trying to filter a huge file (around 1.7 GB): cut the lines of no interest and write the remaining stuff to another file.
I thought Haskell's lazy nature would be ideal for this, but the code keeps running out of memory too soon. The previous versions (C# and Python) used a read line -> write line approach, but I tried a different approach here. Should I just rewrite the code to mirror the previous versions, or am I missing something?
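For comparison, here is roughly what the old read line -> write line loop would look like in Haskell. `keepLine` is a hypothetical placeholder for the real filtering predicate, just so the sketch compiles:

```haskell
import Control.Monad (when)
import System.IO

-- Hypothetical stand-in for the real filter; substitute your own logic.
keepLine :: String -> Bool
keepLine = not . null

-- Read one line, maybe write it, recurse until EOF.
-- Only one line is ever held in memory at a time.
copyFiltered :: Handle -> Handle -> IO ()
copyFiltered inH outH = do
    eof <- hIsEOF inH
    if eof
        then return ()
        else do
            line <- hGetLine inH
            when (keepLine line) (hPutStrLn outH line)
            copyFiltered inH outH
```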
This is the function in charge of filtering the original file:
getLines :: FilePath -> IO [[String]]
getLines path = do
    text <- readFile path
    let linii = lines text
        tokens = map words linii
        filtrate = [x | x <- tokens, length x > 7, isTimeStamp (head x), isDiagFrame x]
    return filtrate
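Since everything in `getLines` after the `readFile` is pure, the same filter can also be written as a plain `String -> String` function. A sketch, reusing the `isTimeStamp` and `isDiagFrame` predicates already defined in this program:

```haskell
-- Same filter as a pure pipeline; isTimeStamp and isDiagFrame are
-- the predicates the original getLines already relies on.
filterTrace :: String -> String
filterTrace = unlines . map unwords . filter keep . map words . lines
  where
    keep x = length x > 7 && isTimeStamp (head x) && isDiagFrame x
```

Composed with the lazy `readFile`/`writeFile`, this form lets output be produced while input is still being consumed.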
This one is in charge of writing one line at a time to the new file (although I tried to use writeFile directly and failed miserably :)):
import Data.List (intersperse)

writeLines :: Handle -> [[String]] -> IO ()
writeLines handle [] = putStrLn "Writing complete..."
writeLines handle (linie:rest) = do
    hPutStrLn handle (concat (intersperse " " linie))
    writeLines handle rest
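As an aside, the whole loop collapses to a `mapM_` over the list, and `unwords` from the Prelude is exactly `concat . intersperse " "`:

```haskell
import System.IO

-- Equivalent loop without any explicit recursion or list bookkeeping.
writeLines :: Handle -> [[String]] -> IO ()
writeLines handle linii = do
    mapM_ (hPutStrLn handle . unwords) linii
    putStrLn "Writing complete..."
```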
And these two are the main function and the one in charge of getting the handle and passing it around:
writeTheFile :: FilePath -> FilePath -> IO ()
writeTheFile inf outf = do
    handle <- openFile outf WriteMode
    linii <- getLines inf
    writeLines handle linii
    hClose handle
    putStrLn "Write Complete"
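One note on the handle: `openFile` without a matching `hClose` can leave the last chunk of output unflushed, and `withFile` (from `System.IO`) closes the handle even if an exception is thrown partway through. A sketch of the same function in that style:

```haskell
import System.IO

-- withFile guarantees the handle is flushed and closed on all paths.
writeTheFile :: FilePath -> FilePath -> IO ()
writeTheFile inf outf =
    withFile outf WriteMode $ \handle -> do
        linii <- getLines inf
        writeLines handle linii
        putStrLn "Write Complete"
```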
import System.Environment (getArgs)

main :: IO ()
main = do
    arg <- getArgs
    if length arg /= 2
        then putStrLn "Use like this: trace_pars [In_File] [Out_File]!"
        else writeTheFile (head arg) (arg !! 1)
Any advice would be greatly appreciated. Thanks in advance!