How to decode a String (not a byte[]) in UTF-8 format into another String in Java?

Question

For some reason, I have to decode a string containing a Chinese character, like this: “\u961c”. This string is the UTF-8 of “阜”.

I know how to decode a byte[] into Unicode characters, but is there an easy way to decode a String into Unicode characters?

By the way, when I call “阜”.getBytes(), I get -100, -104, -23. Does that mean

1001110 10010100 11101001 in binary?

But I think the Unicode code point \u961c should be 1001 0110 0001 1100 in binary, or something like that,

and its UTF-8 form should be 11101001 10011000 10011100 in binary.

阜 (U+961C) is \u961C in UTF-16 but E9 98 9C in UTF-8

– phuclv, Commented Mar 11, 2016 at 3:23
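The byte arithmetic in the question and the comment above can be checked with a small sketch. The class name `FuBytes` and the helper `toBinary` are made up for illustration. Java's `byte` is signed, so 0xE9, 0x98 and 0x9C print as -23, -104 and -100, which are the same three values the question reports.

```java
import java.nio.charset.StandardCharsets;

class FuBytes {
    // Hypothetical helper: formats one byte as 8 binary digits, treating it as unsigned.
    static String toBinary(byte b) {
        String bits = Integer.toBinaryString(b & 0xFF);
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        // The compiler turns this escape into the single char U+961C.
        String s = "\u961c";
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);

        System.out.println(s.length());                        // 1 (one UTF-16 code unit)
        System.out.println(Integer.toHexString(s.charAt(0)));  // 961c
        for (byte b : utf8) {
            // Prints the signed value next to its bit pattern:
            // -23 = 11101001, -104 = 10011000, -100 = 10011100
            System.out.println(b + " = " + toBinary(b));
        }
    }
}
```

So the UTF-8 bit pattern guessed in the question (11101001 10011000 10011100) matches what `getBytes` produces when the bytes are read in the order -23, -104, -100.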

Remy Lebeau · Accepted Answer · 2016-03-11 02:56:37Z


In Java, there is no method that encodes a String object into another String. (Strictly speaking, a String does have an encoding, but it is always UTF-16.)

The only way is to encode to a byte[]. So if you need UTF-8 data, then you need a byte[]. If you have a String that contains unexpected data, then the problem is at some earlier place that incorrectly converted some binary data to a String (i.e. it was using the wrong encoding).

The following call works, but it produces bytes, not a String:

Charset.forName("UTF-8").encode(myString)
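As an illustration of the point about a String that was created with the wrong encoding, here is a minimal sketch. It assumes the UTF-8 bytes were mistakenly decoded as ISO-8859-1 somewhere upstream, which may not be what happened in the asker's case. The class `Redecode` and the method `latin1ToUtf8` are hypothetical names. ISO-8859-1 maps every byte value to exactly one char, so the original bytes can be recovered and decoded again as UTF-8.

```java
import java.nio.charset.StandardCharsets;

class Redecode {
    // Hypothetical helper: recovers text whose UTF-8 bytes were wrongly
    // decoded as ISO-8859-1 at some earlier step.
    static String latin1ToUtf8(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1); // back to the original bytes
        return new String(raw, StandardCharsets.UTF_8);             // decode them correctly
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u961c".getBytes(StandardCharsets.UTF_8);

        // Simulate the upstream mistake: three bytes become three unrelated chars.
        String garbled = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length());                        // 3

        System.out.println(latin1ToUtf8(garbled).equals("\u961c"));  // true
    }
}
```

This only repairs that one specific mistake. If the wrong charset was a lossy one, the original bytes cannot be recovered this way, and the fix belongs at the place where the bytes were first turned into a String.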

edited Mar 11, 2016 at 2:56

Remy Lebeau


answered Mar 8, 2016 at 3:07

Isuru Rangana



1 Comment

Remy Lebeau Over a year ago

Charset.encode() returns a ByteBuffer. To get a byte[] from that, you would have to call Charset.forName("UTF-8").encode(myString).array(). Otherwise, use myString.getBytes("UTF-8") or myString.getBytes(StandardCharsets.UTF_8) instead.
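The two routes mentioned in this comment can be compared with a short sketch. The class `EncodeToBytes` and the helper `toByteArray` are hypothetical names. One caveat, stated as an assumption about typical behavior rather than a guarantee: `ByteBuffer.array()` returns the whole backing array, which may be longer than the encoded content, so this sketch copies only the bytes between the buffer's position and limit.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodeToBytes {
    // Hypothetical helper: copies only the bytes between position and limit,
    // ignoring any unused capacity in the backing array.
    static byte[] toByteArray(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) {
        String s = "\u961c";

        byte[] viaBuffer = toByteArray(StandardCharsets.UTF_8.encode(s));
        byte[] viaGetBytes = s.getBytes(StandardCharsets.UTF_8);

        System.out.println(Arrays.equals(viaBuffer, viaGetBytes));                      // true
        // Decoding the bytes with the same charset gives back the original String.
        System.out.println(new String(viaGetBytes, StandardCharsets.UTF_8).equals(s));  // true
    }
}
```

When only a byte[] is needed, `getBytes(StandardCharsets.UTF_8)` is the shorter option, and it avoids the checked exception that the `getBytes("UTF-8")` overload declares.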
