
I was trying to see the UTF-8 bytes of 👍 in both Java and JavaScript.

In JavaScript,

new TextEncoder().encode("👍"); returns => [240, 159, 145, 141]

while in Java,

"👍".getBytes("UTF-8") returns => [-16, -97, -111, -115]

I converted both byte arrays to hex strings using methods specific to each language (JS, Java), and both returned F09F918D.

In fact, -16 & 0xFF gives => 240
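
For reference, this is roughly the Java check I ran (a minimal sketch; the hex-formatting loop is just one straightforward way to do it):

    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;

    public class Utf8Bytes {
        public static void main(String[] args) {
            byte[] bytes = "👍".getBytes(StandardCharsets.UTF_8);

            // Java prints the raw signed bytes: [-16, -97, -111, -115]
            System.out.println(Arrays.toString(bytes));

            // Masking with 0xFF reads each byte as unsigned before formatting as hex
            StringBuilder hex = new StringBuilder();
            for (byte b : bytes) {
                hex.append(String.format("%02X", b & 0xFF));
            }
            System.out.println(hex); // F09F918D
        }
    }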

I am curious to know why the two languages choose different ways of representing byte arrays. It took me a while to figure this out.

  • one is signed, one is unsigned. Still the same binary representation. (Commented Oct 13, 2015 at 11:10)

1 Answer


In Java all bytes are signed, so the range of a byte is -128 to 127. In JavaScript, TextEncoder.encode returns a Uint8Array, whose elements are unsigned 8-bit integers, so the same values are printed using the full 0 to 255 range.
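
A quick Java sketch to make this concrete (Byte.toUnsignedInt is a Java 8+ convenience; plain masking with 0xFF does the same thing):

    public class SignedVsUnsigned {
        public static void main(String[] args) {
            byte b = (byte) 0xF0;  // bit pattern 1111_0000, the first UTF-8 byte of 👍

            System.out.println(b);                              // -16 (signed byte, range -128..127)
            System.out.println(b & 0xFF);                       // 240 (same bits, read as unsigned)
            System.out.println(Byte.toUnsignedInt(b));          // 240 (Java 8+ equivalent of b & 0xFF)
            System.out.println(Integer.toHexString(b & 0xFF));  // f0
        }
    }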

Therefore, if you convert both results to their one-byte hexadecimal representation, they are the same: F0 9F 91 8D.

As for why Java's designers decided to leave out unsigned integer types, that is a separate discussion.
