I want to convert an input stream to byte[] and I'm using IOUtils.toByteArray(inputStream). Would it be more efficient to wrap the inputStream in a BufferedInputStream? Does it save memory?
2 Answers
Would it be more efficient to wrap the inputStream in a BufferedInputStream?
Not significantly. IOUtils.toByteArray reads data into a buffer of 4096 bytes. BufferedInputStream uses an 8192-byte buffer by default.
Using BufferedInputStream does fewer IO reads, but you need a very fast data source to notice any difference.
If you read an InputStream one byte at a time (or a few bytes at a time), then using a BufferedInputStream really does improve performance, because it reduces the number of operating system calls by a factor of roughly 8000 (one call per 8192-byte buffer fill instead of one per byte). Operating system calls take a lot of time, comparatively.
Does it save memory?
No. IOUtils.toByteArray will create a new byte[4096] regardless of whether you pass in a buffered or an unbuffered InputStream. A BufferedInputStream costs a bit more memory to create, but nothing significant.
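To make the byte-at-a-time case concrete, here is a minimal sketch that counts how many times the underlying source is asked for data, with and without a BufferedInputStream in between. The class ReadCountDemo and its CountingInputStream helper are hypothetical names invented for this illustration, and an in-memory ByteArrayInputStream stands in for a real file or socket, so it shows call counts rather than actual timings.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadCountDemo {

    /** Hypothetical helper: counts every read request that reaches the wrapped stream. */
    static final class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Drains a stream one byte at a time, the worst case for an unbuffered source. */
    static void drainByteByByte(InputStream in) throws IOException {
        while (in.read() != -1) {
            // discard the byte; only the number of calls matters here
        }
    }

    /** Number of requests hitting the source when it is read directly. */
    static int callsUnbuffered(int size) throws IOException {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        drainByteByByte(source);
        return source.calls;
    }

    /** Number of requests hitting the source when a BufferedInputStream sits in front of it. */
    static int callsBuffered(int size) throws IOException {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        drainByteByByte(new BufferedInputStream(source));
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        int size = 1_000_000;
        System.out.println("unbuffered source calls: " + callsUnbuffered(size));
        System.out.println("buffered source calls:   " + callsBuffered(size));
    }
}
```

With a real file or network stream, each of those source calls would typically map to an operating system call, which is where the time goes.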
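As a rough illustration of why the wrapper makes no difference to memory, the copy loop below approximates what a toByteArray-style method does. This is a sketch, not the actual Commons IO source: the class name ToByteArraySketch is made up, and the 4096-byte buffer size is simply the figure quoted above.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ToByteArraySketch {

    /** Copies the whole stream into memory through a fixed-size transfer buffer. */
    static byte[] toByteArray(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Allocated on every call, whether or not 'in' is already buffered.
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        // The result is as large as the stream's content either way.
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        byte[] plain = toByteArray(new ByteArrayInputStream(data));
        byte[] wrapped = toByteArray(new BufferedInputStream(new ByteArrayInputStream(data)));
        System.out.println(plain.length + " bytes unbuffered, " + wrapped.length + " bytes buffered");
    }
}
```

Because the loop already asks for 4096 bytes per read, wrapping the source only adds the wrapper's own internal buffer on top of the transfer buffer and the final array.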
6 Comments
IOUtils.toByteArray reads the whole file at once. Read it in batches instead: for example, take the first 1000 entities in the file, write them to the database, commit the transaction, then read the next 1000. That way you only ever need to allocate memory for one chunk of entities, whose size you control.
IOUtils.toByteArray will read your file completely at once, so you hold the whole file content in memory. If you instead call readLine() and read 1000 lines, submit them to the database, then proceed to the next 1000 lines in the current file, you have only 1000 lines of the file in memory at any one time, not the whole file.
In terms of final memory consumption it wouldn't help: you still need to move the whole stream into a byte[], the array would be the same size, so memory consumption would be the same.
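The batching idea from these comments could be sketched as follows. Everything here is illustrative: BatchReader and processInBatches are hypothetical names, the handler stands in for "write to the database and commit", and the approach only applies when the data can be processed line by line rather than being needed as a single byte[].

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchReader {

    /**
     * Reads lines from the source and hands them to the handler in batches of
     * at most batchSize lines. Returns the number of batches delivered.
     */
    static int processInBatches(Reader source, int batchSize,
                                Consumer<List<String>> handler) throws IOException {
        int batches = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            List<String> batch = new ArrayList<>(batchSize);
            String line;
            while ((line = reader.readLine()) != null) {
                batch.add(line);
                if (batch.size() == batchSize) {
                    handler.accept(batch);               // e.g. insert rows, then commit
                    batches++;
                    batch = new ArrayList<>(batchSize);  // drop the old batch so it can be collected
                }
            }
            if (!batch.isEmpty()) {
                handler.accept(batch);                   // final, possibly smaller batch
                batches++;
            }
        }
        return batches;
    }

    public static void main(String[] args) throws IOException {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 2500; i++) {
            text.append("row ").append(i).append('\n');
        }
        int count = processInBatches(new StringReader(text.toString()), 1000,
                batch -> System.out.println("processing " + batch.size() + " lines"));
        System.out.println(count + " batches");
    }
}
```

Only one batch of lines is referenced at a time, which is the memory saving the comments describe.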
What BufferedInputStream does is wrap another stream: instead of reading from it directly on every call, it fills an internal buffer in larger chunks and serves your reads from that buffer. (Buffering writes and pushing them to the underlying stream only on close/flush or when the buffer is full is what the output counterpart, BufferedOutputStream, does.) Batching like this makes many small operations faster, but it wouldn't help when the consumer already reads in large blocks, as IOUtils.toByteArray does.