In Jython, how can I create a unicode string from a UTF-8 byte sequence?

Question

The UTF-8 representation of the Japanese character 'あ' is the three-byte sequence E3 81 82. I have it in a Jython list like this:

>>> [0xE3, 0x81, 0x82]
[227, 129, 130]

Can I convert this list of UTF-8 bytes to a Jython unicode string? I want to print the unicode string and get 'あ', like this:

text = convert_utf8_list_to_unicode([0xE3, 0x81, 0x82])
print text # => あ

Environment

user1524220 · Accepted Answer · 2014-06-25 13:30:48Z


Try this:

a = [0xE3, 0x81, 0x82]
print "".join([chr(c) for c in a]).decode('UTF-8')

This works for me in regular Python 2 (CPython). I have not tested it in Jython, so I don't know whether it behaves differently there.
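The same idea can be wrapped in the function the question asks for. This is only a sketch: it swaps the chr/join step for `bytearray`, which I am assuming is available (CPython 2.6+ and, as far as I know, Jython 2.7; older Jython releases may not have it). The function name comes from the question and is not a built-in.

```python
def convert_utf8_list_to_unicode(byte_values):
    """Decode a list of ints (0-255) holding UTF-8 bytes into a unicode string.

    Assumption: bytearray exists in the running interpreter
    (CPython 2.6+ / Python 3; Jython 2.7 is assumed, not verified).
    """
    # bytearray() packs the integers into a byte buffer;
    # decode() interprets that buffer as UTF-8.
    return bytearray(byte_values).decode('utf-8')


text = convert_utf8_list_to_unicode([0xE3, 0x81, 0x82])
# text is the single code point U+3042 (HIRAGANA LETTER A)
```

Whether printing `text` shows the character correctly depends on the console encoding. Invalid UTF-8 input raises `UnicodeDecodeError`, and a value outside 0-255 raises `ValueError`.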
