
I have a situation where I need to take a stream and chunk it up into Buffers. I plan to write an object transform stream which takes regular input data, and outputs Buffer objects (where the buffers are all the same size). That is, if my chunker transform is configured at 8KB, and 4KB is written to it, it will wait until an additional 4KB is written before outputting an 8KB Buffer instance.
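For concreteness, here is a minimal sketch of the kind of chunker described above. The class name `FixedSizeChunker`, the `pending` field, and the choice to emit a shorter final chunk on flush are all assumptions made for illustration; they are not taken from any existing code.

```javascript
const { Transform } = require('stream');

// Hypothetical fixed-size chunker: accepts arbitrary input and emits
// Buffer objects of exactly `chunkSize` bytes (readable side in object mode).
class FixedSizeChunker extends Transform {
  constructor(chunkSize = 8 * 1024) {
    super({ readableObjectMode: true });
    this.chunkSize = chunkSize;
    this.pending = Buffer.alloc(0); // bytes received but not yet emitted
  }

  _transform(data, encoding, callback) {
    // Simple but not optimal: concatenates on every write.
    let buf = Buffer.concat([this.pending, data]);
    while (buf.length >= this.chunkSize) {
      // subarray() returns a view, so no extra copy is made here.
      this.push(buf.subarray(0, this.chunkSize));
      buf = buf.subarray(this.chunkSize);
    }
    this.pending = buf;
    callback();
  }

  _flush(callback) {
    // Assumption: a short final chunk is emitted rather than dropped.
    if (this.pending.length > 0) this.push(this.pending);
    callback();
  }
}
```

With a chunker configured at 8KB, writing 4KB produces no output; writing another 4KB produces a single 8KB Buffer.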

I can choose the size of the buffer, as long as it is in the ballpark of 8KB to 32KB. Is there an optimal size to pick? The reason I'm curious is that the Node.js documentation speaks of using SlowBuffer to back a Buffer, and allocating a minimum of 8KB:

In order to avoid the overhead of allocating many C++ Buffer objects for small blocks of memory in the lifetime of a server, Node allocates memory in 8Kb (8192 byte) chunks. If a buffer is smaller than this size, then it will be backed by a parent SlowBuffer object. If it is larger than this, then Node will allocate a SlowBuffer slab for it directly.

Does this imply that 8KB is an efficient size, and that if I used 12KB, there would be two 8KB SlowBuffers allocated? Or does it just mean that the smallest efficient size is 8KB? What about simply using multiples of 8KB? Or, does it not matter at all?

1 Answer


Basically, it's saying that if your Buffer is smaller than 8KB, Node will try to fit it into a pre-allocated 8KB chunk of memory. It keeps placing Buffers in that 8KB chunk until one doesn't fit, then allocates a new 8KB chunk. If the Buffer is larger than 8KB, it gets its own memory allocation.

You can actually see what's happening by looking at the node source for buffer here:

if (this.length <= (Buffer.poolSize >>> 1) && this.length > 0) {
  if (this.length > poolSize - poolOffset)
    createPool();
  this.parent = sliceOnto(allocPool,
                          this,
                          poolOffset,
                          poolOffset + this.length);
  poolOffset += this.length;
} else {
  alloc(this, this.length);
}

Looking at that, it will actually only put the Buffer into a pre-allocated chunk if it is less than or equal to 4KB (Buffer.poolSize >>> 1, which is 4096 when Buffer.poolSize = 8 * 1024).
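One way to observe this from user code is to check whether a Buffer is a view onto a larger backing ArrayBuffer. The sketch below uses `Buffer.allocUnsafe`, which is newer than the source quoted above, and the helper `looksPooled` is a hypothetical heuristic rather than an official API. It assumes `Buffer.poolSize` has been left at its default.

```javascript
// Heuristic (an assumption, not an official API): a pooled Buffer is a view
// onto a larger shared ArrayBuffer, so its backing store is bigger than it is.
function looksPooled(buf) {
  return buf.buffer.byteLength > buf.length;
}

// Half the pool size; 4096 when Buffer.poolSize is 8 * 1024.
const threshold = Buffer.poolSize >>> 1;

const small = Buffer.allocUnsafe(1024);      // well below the threshold
const large = Buffer.allocUnsafe(16 * 1024); // well above the pool size

console.log('threshold:', threshold);
console.log('1KB pooled?', looksPooled(small));
console.log('16KB pooled?', looksPooled(large));
```

Sizes right at the threshold are deliberately left out, since whether exactly 4096 bytes is pooled depends on the Node version.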

As for an optimum size to pick in your situation, I think it depends on what you end up using it for. But in general, if you want a chunk of 8KB or less, I'd pick something less than or equal to 4KB that divides evenly into that 8KB pre-allocation (4KB, 2KB, 1KB, etc.). Otherwise, chunk sizes greater than 8KB shouldn't make much of a difference.
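The "evenly fits" reasoning can be written out as simple arithmetic. The helper below, `chunksPerPool`, is hypothetical and only models the pool. It assumes the strict `<` threshold used by newer Node versions, and it gives an upper bound, because other small allocations share the same pool.

```javascript
// Hypothetical helper: how many chunks of `chunkSize` bytes fit in one pool
// slab. Returns 0 when the size would not be pooled at all.
// Assumption: pooling applies only when chunkSize < poolSize / 2.
function chunksPerPool(chunkSize, poolSize = Buffer.poolSize) {
  if (chunkSize <= 0 || chunkSize >= (poolSize >>> 1)) return 0;
  return Math.floor(poolSize / chunkSize);
}

// Bytes left unused at the end of a slab filled only with such chunks.
function wastedPerPool(chunkSize, poolSize = Buffer.poolSize) {
  const n = chunksPerPool(chunkSize, poolSize);
  return n === 0 ? 0 : poolSize - n * chunkSize;
}
```

For example, with an 8KB pool, 2KB chunks fill a slab exactly, while 3000-byte chunks leave part of every slab unused.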


4 Comments

To clarify, Buffer.poolSize = 8 * 1024 so it should be "less than or equal to 8KB", right? not 4KB? Here is the code of concern where I am using buffers: codereview.stackexchange.com/q/57492/12199 After this transform stream, I will be writing these chunks, interleaved with other data, down a network stream.
Yes, Buffer.poolSize is 8KB, but the if statement there is effectively saying if (this.length <= (Buffer.poolSize / 2) && this.length > 0), so it will only add the Buffer to the pre-allocated one if this.length is 4KB or less (and greater than 0).
Permalink to the code: github.com/nodejs/node/blob/….
Now the if-statement is if (size < (Buffer.poolSize >>> 1)), so it would make use of the pool when the length is less than 4 KiB.
