96

What is the difference between a byte array and a byte buffer?
Also, in what situations should one be preferred over the other?

[My use case is a web application being developed in Java.]

3 Answers

107

There are actually a number of ways to work with bytes. And I agree that it's not always easy to pick the best one:

  • the byte[]
  • the java.nio.ByteBuffer
  • the java.io.ByteArrayOutputStream (in combination with other streams)
  • the java.util.BitSet

A byte[] is just a primitive array that holds the raw data. So, it does not have convenient methods for building or manipulating the content.

A ByteBuffer is more like a builder. It wraps a byte[] (or, for direct buffers, memory outside the Java heap) and, unlike a plain array, offers convenient helper methods such as put(byte). It's not that straightforward in terms of usage. (Most tutorials are way too complicated or of poor quality, but this one will get you somewhere. Want to take it one step further? Then read about the many pitfalls.)

You could be tempted to say that a ByteBuffer does for byte[] what a StringBuilder does for String. But the ByteBuffer class has a specific shortcoming. Although it may appear that a ByteBuffer resizes automatically as you add elements, it actually has a fixed capacity: you have to specify the maximum size of the buffer when you instantiate it.
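
As a minimal sketch of the fixed-capacity behaviour described above (the class and method names are made up for illustration; only the java.nio calls are real API): the puts can be chained like builder calls and end with flip(), but a put beyond the capacity throws instead of growing the buffer.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical demo class, not part of any library.
final class FixedCapacityDemo {

    // Writes two bytes and a short, then flips the buffer so it can be read.
    static ByteBuffer fill() {
        ByteBuffer buffer = ByteBuffer.allocate(4); // capacity is fixed at 4 bytes
        buffer.put((byte) 1).put((byte) 2);         // relative puts can be chained
        buffer.putShort((short) 0x0304);            // big-endian by default
        buffer.flip();                              // limit = position, position = 0
        return buffer;
    }

    // Returns true if writing past the capacity throws BufferOverflowException.
    static boolean overflows() {
        ByteBuffer buffer = ByteBuffer.allocate(1);
        buffer.put((byte) 1);
        try {
            buffer.put((byte) 2); // no room left; the buffer does not grow
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buffer = fill();
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);                              // copy the readable bytes out
        System.out.println(Arrays.toString(out));     // [1, 2, 3, 4]
        System.out.println(overflows());              // true
    }
}
```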

That's one of the reasons why I often prefer to use a ByteArrayOutputStream: it resizes automatically, just like an ArrayList does, and its toByteArray() method gives you the resulting byte[]. Sometimes it's practical to wrap it in a DataOutputStream. The advantage is that you get some additional convenience calls (e.g. writeShort(int) if you need to write 2 bytes).
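
A rough sketch of that combination, assuming you simply want to end up with a byte[] (the class name and the values written are arbitrary examples):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class, not part of any library.
final class GrowingBufferDemo {

    // Builds a byte[] without knowing the final size up front.
    static byte[] build() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(); // grows as needed
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(0x7F);    // 1 byte
            out.writeShort(0x0102); // 2 bytes, high byte first
            out.writeInt(42);       // 4 bytes, high byte first
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(build())); // [127, 1, 2, 0, 0, 0, 42]
    }
}
```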

BitSet comes in handy when you want to perform bit-level operations. You can get/set individual bits, and it has logical operator methods like xor(). (The toByteArray() method was only introduced in Java 7.)
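
For illustration, a small sketch of the bit-level calls mentioned above (the class name and the chosen bit positions are just examples). Note that toByteArray() is little-endian: bit 0 is the lowest bit of the first byte.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical demo class, not part of any library.
final class BitSetDemo {

    // XORs two bit sets and returns the result as a byte[].
    static byte[] xorBits() {
        BitSet a = new BitSet();
        a.set(0);
        a.set(3);               // a = bits {0, 3}
        BitSet b = new BitSet();
        b.set(3);
        b.set(9);               // b = bits {3, 9}
        a.xor(b);               // a = bits {0, 9}; bit 3 cancels out
        return a.toByteArray(); // byte 0 holds bits 0-7, byte 1 holds bits 8-15
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(xorBits())); // [1, 2]
    }
}
```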

Of course, depending on your needs, you can combine all of them to build your byte[].


9 Comments

your answer is very helpful
It is categorically wrong to say ByteBuffer is to byte[] what StringBuilder is to String. No, it is not. ByteBuffer to byte[] is more like FancyString to String. It does NOT have the builder capability that StringBuilder offers. As you mentioned, it is fixed-size and there is no way to grow it. The true reasons why we need ByteBuffer are twofold: 1) Direct memory access. This means that when used in direct mode, ByteBuffer can bypass the JVM garbage collection, and the memory used by ByteBuffer is completely outside the JVM memory pool. 2) Interfacing, as byte[] is primitive and cannot be used in an OO way.
@jzl106 I was a bit reluctant to reply to your remark, because there is a maximum length to comments. Your remark touches on a lot of aspects at once, which makes it hard to answer all of them together. It also rephrases some of the things I said, leaving out important details and taking them out of context.
@jzl106 (part 2): No, I never said that ByteBuffer IS a builder by the strict definition of what a builder is. I just said that it is LIKE a builder. There are some similarities to what a builder is and how it can be used, e.g. the way you can chain methods together is typical for a builder. Typically that series ends with a flip(), which resembles how people would chain together functions and then call build().
@jzl106 (part 3): I did NOT say that ByteBuffer is like a StringBuilder. In fact, I said the opposite. But I guess for you it is just not clear why I compared it with StringBuilder at all. Let me explain: before StringBuilder, people would add strings together with the + operator, which resulted in a lot of array copying. Similarly, you can't add 2 byte[] together without copying them. And some people would be TEMPTED to think that a ByteBuffer is the ultimate solution. But it is NOT, for the reasons which I explained.
28

ByteBuffer is part of the new IO package (nio) that was developed for fast throughput of file-based data. For example, Apache is a very fast web server (written in C) because it reads bytes from disk and puts them on the network directly, without shuffling them through various buffers. It does this through memory-mapped files, which early versions of Java did not have. With the advent of nio, it became possible to write a web server in Java that is as fast as Apache. When you want very fast file-to-network throughput, you want to use memory-mapped files and ByteBuffer.

Databases typically use memory-mapped files, but this type of usage is seldom efficient in Java. In C/C++, it's possible to load up a large chunk of memory and cast it to the typed data you want. Due to Java's security model, this isn't generally feasible, because you can only convert to certain native types, and these conversions aren't very efficient. ByteBuffer works best when you are just dealing with bytes as plain byte data -- once you need to convert them to objects, the other Java I/O classes typically perform better and are easier to use.
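
To make the "certain native types" point concrete, here is a small sketch of what a typed read from a ByteBuffer looks like (the class and method names are hypothetical). There is no cast as in C; each value is decoded through a method call such as getInt(), with an explicit byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of any library.
final class TypedReadDemo {

    // Decodes the first four bytes as an int in the given byte order.
    static int readInt(byte[] raw, ByteOrder order) {
        return ByteBuffer.wrap(raw).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] raw = {0x2A, 0, 0, 0};
        System.out.println(readInt(raw, ByteOrder.LITTLE_ENDIAN)); // 42
        System.out.println(readInt(raw, ByteOrder.BIG_ENDIAN));    // 704643072
    }
}
```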

If you're not dealing with memory-mapped files, then you don't really need to bother with ByteBuffer -- you'd normally use arrays of byte. If you're trying to build a web server with the fastest possible throughput of file-based raw byte data, then ByteBuffer (specifically MappedByteBuffer) is your best friend.
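
A minimal sketch of the memory-mapped approach, under some assumptions: the class and method names are made up, and an in-memory channel stands in for the network socket so the example is self-contained. In a real server the target would be something like a SocketChannel.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of any library.
final class MappedFileDemo {

    // Maps the whole file read-only and writes the mapped bytes to the target channel.
    static void send(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            while (mapped.hasRemaining()) {
                target.write(mapped); // a single write may not drain the buffer
            }
        }
    }

    // Writes content to a temp file, sends it through send(), and returns what arrived.
    static String roundTrip(String content) {
        try {
            Path file = Files.createTempFile("mapped-demo", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, content.getBytes(StandardCharsets.UTF_8));
            ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for a socket
            send(file, Channels.newChannel(sink));
            return new String(sink.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello, mapped world")); // hello, mapped world
    }
}
```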

3 Comments

It is not the Java security model that is the limitation. It is the JVM architecture that prevents you from casting bytes to typed data.
The security model also affects the usability of ByteBuffer -- at least in my testing which is a few years old now. Every time you call one of the cast functions in the ByteBuffer class, SecurityManager code gets executed, which slows the whole process down. This is why regular java io functions are generally faster for reading in java basic types. This contrasts with C, where memory mapped files with a cast are much, much faster than using stdio.
Looking at the code, the security manager calls only appear to occur in DirectByteBuffer case. I think it happens because the method is using Unsafe.
3

Those two articles may help you http://nadeausoftware.com/articles/2008/02/java_tip_how_read_files_quickly and http://evanjones.ca/software/java-bytebuffers.html

2 Comments

I cannot reproduce the first link's conclusion that FileChannel is relevantly faster than FileInputStream for reading into byte[]. I suspect that since they use a file of length 100MB, they actually benchmark reading from the operating system's disk cache rather than the hard drive itself. That would explain why their tests imply a bandwidth of 250MB/s, which is pretty damn fast for a disk. In my tests with a 1.5GB file, both methods achieve a throughput of 40MB/s, indicating that the disk is the bottleneck, not the CPU. Of course, mileage with a solid state disk might differ.
You could improve the quality of this answer by letting us know why these links might be helpful. Link-only answers are not ideal.
