I want to create a Node.js module that can parse huge binary files (some larger than 200 GB). Each file is divided into chunks, and each chunk can be larger than 10 GB. I tried reading the file in both flowing and non-flowing (paused) mode, but the problem is that the end of the received buffer is reached in the middle of a chunk, so parsing of that chunk has to be cut off before the next 'data' event occurs. This is what I've tried:
var s = getStream();

s.on('data', function (a) {
    parseChunk(a);
});

function parseChunk(a) {
    /*
    There is a lot of code and many functions here.
    One chunk is larger than the buffer passed to this function,
    so when the end of this buffer is reached, parseChunk
    has to return before the parsing process is finished.
    Also, when the next buffer is passed, it is not the start of
    a new chunk, because the previous chunk was not parsed to the end.
    */
}
Loading a whole chunk into process memory isn't possible because I have only 8 GB of RAM. How can I read data from the stream synchronously, or how can I pause the parseChunk function when the end of the buffer is reached and resume it once new data is available?
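
To make the "pause and wait for new data" idea concrete, here is a minimal sketch of one possible direction: writing the parser as a generator that yields whenever the current buffer runs out and is resumed with the next one. The chunk layout used here (a 4-byte big-endian length followed by the payload) is purely hypothetical and not the real format; a real format with chunks over 4 GiB would need a wider length field. The names `chunkParser`, `createFeeder`, `take`, `takeExactly` and the running checksum are also assumptions made for illustration.

```javascript
'use strict';

// Hypothetical chunk layout (an assumption, not the real file format):
// a 4-byte big-endian payload length, followed by that many payload bytes.
function* chunkParser(onChunk) {
  let buf = Buffer.alloc(0); // buffer currently being consumed
  let pos = 0;               // read position inside `buf`

  // Returns between 1 and `max` bytes. If the current buffer is exhausted,
  // the generator is suspended here until the next buffer is supplied.
  function* take(max) {
    while (pos >= buf.length) {
      buf = yield;
      pos = 0;
    }
    const end = Math.min(buf.length, pos + max);
    const part = buf.subarray(pos, end);
    pos = end;
    return part;
  }

  // Collects exactly `n` bytes. Intended for small fixed-size fields only,
  // since the result is copied into a new buffer.
  function* takeExactly(n) {
    const out = Buffer.alloc(n);
    let filled = 0;
    while (filled < n) {
      const part = yield* take(n - filled);
      part.copy(out, filled);
      filled += part.length;
    }
    return out;
  }

  while (true) {
    const header = yield* takeExactly(4);
    const length = header.readUInt32BE(0);

    // The payload is processed piece by piece and never held in memory
    // as a whole. A running byte sum stands in for the real parsing work.
    let remaining = length;
    let checksum = 0;
    while (remaining > 0) {
      const part = yield* take(remaining);
      for (const byte of part) checksum = (checksum + byte) >>> 0;
      remaining -= part.length;
    }
    onChunk({ length, checksum });
  }
}

// Starts the parser and returns a function that accepts one buffer at a time.
function createFeeder(onChunk) {
  const parser = chunkParser(onChunk);
  parser.next(); // run up to the first point where data is needed
  return function feed(buffer) {
    parser.next(buffer); // resume parsing exactly where it was suspended
  };
}
```

Under these assumptions the stream handler would reduce to something like `s.on('data', createFeeder(handleChunk));`, where `handleChunk` is a hypothetical callback. Each call to `feed` consumes its buffer completely before returning, so the parser state (position inside the chunk, partial results) survives between 'data' events without any synchronous read from the stream.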