
I'm trying to read a giant logfile (250,000 lines), parse each line into a JSON object, and insert each object into CouchDB for analytics.

I'm trying to do this by creating a buffered stream that will process each chunk separately, but I always run out of memory after about 300 lines. It seems like using buffered streams and util.pump should avoid this, but apparently it doesn't.

(Perhaps there are better tools for this than node.js and CouchDB, but I'm interested in learning how to do this kind of file processing in node.js and think it should be possible.)

CoffeeScript below, JavaScript here: https://gist.github.com/5a89d3590f0a9ca62a23

fs = require 'fs'
util = require('util')
BufferStream = require('bufferstream')

files = [
  "logfile1",
]

files.forEach (file)->
  stream = new BufferStream({encoding:'utf8', size:'flexible'})
  stream.split("\n")
  stream.on("split", (chunk, token)->
    line = chunk.toString()
    # parse line into JSON and insert in database
  )
  util.pump(fs.createReadStream(file, {encoding: 'utf8'}), stream)
  • You should be able to use a stream to feed the file to you. On 'data' events you can pause the stream, then split each chunk on "\n". Keep the last item of the split for the next chunk if it doesn't end in a "\n", and make sure to process the remainder when the 'end' event is fired (see the sketch after this list). Using bufferstream will run you out of memory: you're essentially moving the whole file into an array of buffers in memory. Also, stream.pipe() should be used instead of util.pump(). Commented Jul 6, 2012 at 21:28
  • I would recommend using fs.createReadStream() - nodejs.org/api/fs.html#fs_fs_createreadstream_path_options Commented Jul 6, 2012 at 21:33
  • Maybe this could help: https://github.com/nickewing/line-reader Commented Sep 29, 2012 at 22:09
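
A minimal sketch of the approach from the first comment, in plain JavaScript (the gist linked in the question is JavaScript as well). processLine is a hypothetical stand-in for the parse-and-insert step; only the current partial line is ever held in memory, so usage stays flat regardless of file size:

var fs = require('fs')

// hypothetical placeholder for "parse line into JSON and insert in database"
function processLine (line) {
  if (line.length === 0) return
  var doc = JSON.parse(line)
  // insert doc into CouchDB here
}

var remainder = ''
var stream = fs.createReadStream('logfile1', {encoding: 'utf8'})

stream.on('data', function (chunk) {
  var pieces = (remainder + chunk).split('\n')
  remainder = pieces.pop() // the last piece may be an incomplete line
  pieces.forEach(processLine)
})

stream.on('end', function () {
  processLine(remainder) // flush whatever is left after the final chunk
})

If the CouchDB insert is asynchronous, pause the stream with stream.pause() before the insert and call stream.resume() in its callback, as the comment suggests, so reads can't outrun the database.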

1 Answer


Maybe this helps: Memory leak when using streams in Node.js?

Try using pipe() instead of util.pump() to solve it.
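
A minimal sketch of that change, reusing the bufferstream setup from the question but swapping util.pump() for pipe(), which handles backpressure automatically:

var fs = require('fs')
var BufferStream = require('bufferstream')

var stream = new BufferStream({encoding: 'utf8', size: 'flexible'})
stream.split('\n')
stream.on('split', function (chunk, token) {
  var line = chunk.toString()
  // parse line into JSON and insert in database
})

// pipe() honors backpressure, unlike the deprecated util.pump()
fs.createReadStream('logfile1', {encoding: 'utf8'}).pipe(stream)

Note that the first comment on the question warns bufferstream may still accumulate the file in memory, so the manual line-splitting sketch above may be the more robust fix.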

