3

I am trying to convert a byte array to a String, but the conversion alters the values. That means I cannot restore the byte array from the converted String.

byte[] array = {-64,-88,1,-2};
ByteArrayOutputStream out = new ByteArrayOutputStream();
out.write(array);
String result = out.toString("UTF-8");
byte[] array2 = result.getBytes("UTF-8");
// output of array2: {-17,-65,-67,-17}
1
  • Actually, the result is longer than you wrote: [-17, -65, -67, -17, -65, -67, 1, -17, -65, -67]. Commented Dec 27, 2015 at 11:16

3 Answers

4

It's a charset issue: UTF-8 is a variable-width encoding, and not every byte sequence is valid UTF-8, so arbitrary bytes do not survive the round trip. Try the same with a single-byte charset such as

String result = out.toString("ISO-8859-15");
byte[] array2 = result.getBytes("ISO-8859-15");
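
As a minimal sketch of the same idea, the round trip can be checked end to end. Note the assumptions: this version swaps in ISO-8859-1 (via `StandardCharsets.ISO_8859_1`), which maps each of the 256 byte values to exactly one char, and the class and method names (`Latin1RoundTrip`, `toText`, `toBytes`) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTrip {
    // Decode with a single-byte charset: every byte becomes exactly one char.
    static String toText(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Encode with the same charset to get the original bytes back.
    static byte[] toBytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] array = {-64, -88, 1, -2};
        byte[] array2 = toBytes(toText(array));
        System.out.println(Arrays.equals(array, array2)); // prints "true"
    }
}
```

The resulting String is only a carrier for the bytes; it is not meaningful text unless the data really was ISO-8859-1 to begin with.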

2

You have to use a fixed-width single-byte encoding, like the one Jan suggested. UTF-8 is a variable-width encoding, which means that in certain cases you need more than one byte to encode a single code point. This is one of those cases, since you use negative numbers, i.e. bytes with the high bit set. (See the table on the Wikipedia page about UTF-8.)
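
To make the variable width concrete, here is a small sketch that counts the UTF-8 bytes needed for a few code points. The class and method names (`Utf8Widths`, `utf8Length`) are hypothetical, chosen only for this illustration.

```java
import java.nio.charset.StandardCharsets;

class Utf8Widths {
    // Number of bytes the given text occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // 1 (U+0041)
        System.out.println(utf8Length("\u00e9"));       // 2 (U+00E9, e with acute)
        System.out.println(utf8Length("\u20ac"));       // 3 (U+20AC, euro sign)
        System.out.println(utf8Length("\uD83D\uDE00")); // 4 (U+1F600, one code point, two Java chars)
    }
}
```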

What was interesting to me was that after converting the second array to a string, the strings were identical but the underlying arrays were not. The point is that the given bytes are not valid UTF-8 representations of code points, in which case they get replaced with the code point 65533 (U+FFFD, the replacement character), which in turn needs 3 bytes to be represented. That explains the output:

[-17, -65, -67, -17, -65, -67, 1, -17, -65, -67]

The first two bytes are each replaced by the replacement character, which is encoded as -17, -65, -67. The 1 represents a legit code point, so it "survived" the transformation, and the last byte is again an illegal one.
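
The replacement can be observed directly. The sketch below decodes the question's bytes as UTF-8 and prints the resulting code points and re-encoded bytes, which should match the output quoted above. It also shows one way to detect such input up front, assuming you would rather fail than lose data: a `CharsetDecoder` configured with `CodingErrorAction.REPORT`. The class and method names (`Utf8ReplacementDemo`, `roundTripUtf8`, `isValidUtf8`) are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8ReplacementDemo {
    // Lossy round trip: malformed input is silently replaced with U+FFFD.
    static byte[] roundTripUtf8(byte[] bytes) {
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Strict check: the decoder reports malformed input instead of replacing it.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] array = {-64, -88, 1, -2};
        String decoded = new String(array, StandardCharsets.UTF_8);
        // Print each char as a number; 65533 marks a replaced byte.
        for (int i = 0; i < decoded.length(); i++) {
            System.out.println((int) decoded.charAt(i));
        }
        System.out.println(Arrays.toString(roundTripUtf8(array)));
        System.out.println(isValidUtf8(array)); // prints "false"
    }
}
```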


-1

I believe you can create a string out of a byte array by passing the array into the constructor, like this:

String test = new String(byte_array);

String also has a method, getBytes(), that converts a String to a byte array and returns it.

I hope that helped at least a bit

1 Comment

From the Javadoc: "Constructs a new String by decoding the specified array of bytes using the platform's default charset." Even if that works, it is a bad idea to rely on it, as it may produce different results on other machines.
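
One way to avoid depending on the platform default is to name the charset explicitly on both sides of the conversion. The following is a minimal sketch for real text (not arbitrary binary data), with the class and method names (`ExplicitCharset`, `encode`, `decode`) invented for illustration.

```java
import java.nio.charset.StandardCharsets;

class ExplicitCharset {
    // Encode text with a named charset instead of the platform default.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same named charset, so the result is the same on every machine.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "h\u00e9llo";
        String restored = decode(encode(original));
        System.out.println(original.equals(restored)); // prints "true"
    }
}
```

This direction (text to bytes and back) is lossless with UTF-8; it is only the reverse direction, arbitrary bytes to text and back, that loses data.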
