
I was trying to see the UTF-8 bytes of 👍 in both Java and JavaScript.

In JavaScript,

new TextEncoder().encode("👍"); returns => [240, 159, 145, 141]

while in Java,

"👍".getBytes("UTF-8") returns => [-16, -97, -111, -115]

I converted both byte arrays to hex strings using methods specific to each language (JS, Java), and both returned F09F918D.

In fact, -16 & 0xFF gives => 240
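
For reference, this is roughly the Java check I ran (a minimal sketch; the hex-formatting loop is just one straightforward way to do it):

    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;

    public class Utf8Bytes {
        public static void main(String[] args) {
            byte[] bytes = "👍".getBytes(StandardCharsets.UTF_8);

            // Java prints the raw signed bytes: [-16, -97, -111, -115]
            System.out.println(Arrays.toString(bytes));

            // Masking with 0xFF reads each byte as unsigned before formatting as hex
            StringBuilder hex = new StringBuilder();
            for (byte b : bytes) {
                hex.append(String.format("%02X", b & 0xFF));
            }
            System.out.println(hex); // F09F918D
        }
    }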

I am curious to know why the two languages choose different ways of representing byte arrays. It took me a while to figure this out.

  • one is signed, one is unsigned. Still the same binary representation. (Commented Oct 13, 2015 at 11:10)

1 Answer


In Java all bytes are signed, so the range of a byte is -128 to 127. In JavaScript, TextEncoder.encode returns a Uint8Array, whose elements are unsigned 8-bit integers, so the same values are printed using the full 0 to 255 range.
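
A quick Java sketch to make this concrete (Byte.toUnsignedInt is a Java 8+ convenience; plain masking with 0xFF does the same thing):

    public class SignedVsUnsigned {
        public static void main(String[] args) {
            byte b = (byte) 0xF0;  // bit pattern 1111_0000, the first UTF-8 byte of 👍

            System.out.println(b);                              // -16 (signed byte, range -128..127)
            System.out.println(b & 0xFF);                       // 240 (same bits, read as unsigned)
            System.out.println(Byte.toUnsignedInt(b));          // 240 (Java 8+ equivalent of b & 0xFF)
            System.out.println(Integer.toHexString(b & 0xFF));  // f0
        }
    }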

Therefore, if you convert both results to their one-byte hexadecimal representation, they are the same: F0 9F 91 8D.

As for why Java's designers decided to leave out unsigned integer types, that is a separate discussion.
