I have thousands of files (about 50K), and each file has around 10K lines. I read each file, do some processing, and write the lines back to an output file. Reading and processing are fast, but the final step, converting the String Iterator back into a single String and writing it to a file, takes a long time (almost a second per file, which adds up badly across the whole population of around 50K files). I see this as the bottleneck in improving my parsing time.
This is my code.
val processedLines = linesFromGzip(new File(fileName)).map(line => MyFunction(line))
val outFile = Resource.fromFile(outFileName)
outFile.write(processedLines.mkString("\n")) // severe overhead is caused by processedLines.mkString("\n")
(I read on a few other forums/blogs that mkString is much better than other approaches.)
Is there a better alternative to mkString("\n")? Is there a totally different approach that would increase my speed of processing files? (Remember, I have 50K files, each close to 10K lines.)
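For comparison, the kind of alternative I have in mind is writing each line as it comes off the iterator instead of joining everything first. This is only a minimal sketch, not something I have benchmarked: StreamingWrite and writeLines are hypothetical names of my own, it uses plain java.nio with a BufferedWriter rather than Resource, and it assumes the iterator yields Strings and that UTF-8 is the right encoding. Whether it is actually faster for my files is exactly what I am unsure about.

```scala
import java.io.{BufferedWriter, File}
import java.nio.charset.StandardCharsets
import java.nio.file.Files

object StreamingWrite {
  // Hypothetical helper: writes lines one at a time through a buffered
  // writer, so the whole output is never assembled into a single String.
  def writeLines(lines: Iterator[String], out: File): Unit = {
    // Assumption: UTF-8 output; adjust if the files use another encoding.
    val writer: BufferedWriter =
      Files.newBufferedWriter(out.toPath, StandardCharsets.UTF_8)
    try {
      var first = true
      for (line <- lines) {
        // Put the separator between lines only, matching mkString("\n").
        if (!first) writer.write("\n")
        writer.write(line)
        first = false
      }
    } finally {
      writer.close() // flushes the buffer and releases the file handle
    }
  }
}
```

With my code above, the call would look like StreamingWrite.writeLines(linesFromGzip(new File(fileName)).map(line => MyFunction(line)), new File(outFileName)).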