
I'm passing data between C# and Java, converting it in 4 stages:

  1. to byte array
  2. to string (simply adding each byte as character)
  3. to UTF8 bytes
  4. to base64 string

What I've found is that Java's conversion to UTF-8 gives a different result than C#'s.

I'll skip the base64 conversion in the code below.

Java code:

// The result is [-26, 16, 0, 0]
byte[] bytes = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(4326).array();

StringBuilder sb = new StringBuilder(bytes.length);
for (byte currByte : bytes) {
   sb.append((char) currByte);
}

// The result is [-17, -65, -90, 16, 0, 0]
byte[] utf8Bytes = sb.toString().getBytes("UTF-8");

C# code:

MemoryStream objMemoryStream = new MemoryStream();
BinaryWriter objBinaryWriter = new BinaryWriter(objMemoryStream);
objBinaryWriter.Write(4326);

// The result [230, 16, 0, 0]
byte[] objByte = objMemoryStream.ToArray();
StringBuilder objSB = new StringBuilder();
foreach (byte objCurrByte in objByte)
{
    objSB.Append((char)objCurrByte);
}
string strBytes = objSB.ToString();

objBinaryWriter.Close();
objBinaryWriter.Dispose();

// The result is [195, 166, 16, 0, 0]
var result = UTF8Encoding.UTF8.GetBytes(strBytes);

The two resulting arrays are different even though the input arrays/strings are the same. (Java just displays bytes as signed, but the values are the same.)

I'm not allowed to change the C# code because it is already used by clients.

What is the problem in my Java code, and how can I fix it?

Note: Java manages to read the resulting base64 string from C#, but for the same data it then generates a different string, which C# cannot read properly.

1 Answer

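The problem is that char is unsigned but byte is signed. When you write (char) -26, the negative value is widened with its sign before being narrowed to char, so you are effectively doing (char) (-26 & 0xFFFF), which is 65510. What you intended was (char) (-26 & 0xFF), which is 230.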
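
To see the difference between the two casts concretely, here is a minimal sketch. The class and method names are made up for illustration and are not part of the question's code.

```java
// Hypothetical demo class, not part of the question's code.
class SignExtensionDemo {
    // Plain cast: the byte is first widened to int (-26), then narrowed to char.
    static int castDirectly(byte b) {
        return (char) b;
    }

    // Masked cast: keep only the low 8 bits before narrowing to char.
    static int castWithMask(byte b) {
        return (char) (b & 0xFF);
    }

    public static void main(String[] args) {
        byte b = (byte) 230;                  // stored as -26 in a signed Java byte
        System.out.println(castDirectly(b));  // 65510 (0xFFE6)
        System.out.println(castWithMask(b));  // 230 (0x00E6)
    }
}
```

The first value encodes to three UTF-8 bytes and the second to two, which is why the Java and C# arrays differ in length.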

Try

for (byte currByte : bytes) {
   sb.append((char) (currByte & 0xFF)); // -26 => 230 not 65510
}
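
Putting the mask into the whole pipeline, a sketch along these lines should reproduce the bytes the C# side reports. The class and method names are hypothetical, `java.util.Base64` assumes Java 8 or later, and the C# side is assumed to use the standard Base64 alphabet with padding.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class; names are illustrative, not from the question.
class IntToBase64Sketch {
    // Stages 1-3: int -> little-endian bytes -> one char per byte -> UTF-8 bytes.
    static byte[] toUtf8Bytes(int value) {
        byte[] bytes = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte currByte : bytes) {
            sb.append((char) (currByte & 0xFF)); // mask so -26 becomes 230, not 65510
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Stage 4: UTF-8 bytes -> Base64 string (standard alphabet, with padding).
    static String toBase64(int value) {
        return Base64.getEncoder().encodeToString(toUtf8Bytes(value));
    }

    public static void main(String[] args) {
        // Signed view of the C# result [195, 166, 16, 0, 0]
        System.out.println(Arrays.toString(toUtf8Bytes(4326))); // [-61, -90, 16, 0, 0]
        System.out.println(toBase64(4326));                     // w6YQAAA=
    }
}
```

Passing `StandardCharsets.UTF_8` rather than the string "UTF-8" also avoids the checked UnsupportedEncodingException.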

2 Comments

So simple :) I completely overlooked the signed/unsigned difference. I just added 256 in my head and thought everything was the same. Thanks a lot.
@RazizaO operations on byte, char and short are widened to int first, which is easy to miss when you use a cast.
