How to initialize bytes from a string that holds already-encoded byte values

Question

The usual way to create a bytes variable in Python is with a literal:

b = b'some text i do not care'

For example, the Chinese string "鲁邦三世" encoded to bytes is:

str_ch = "鲁邦三世"
encoded_str_ch = str_ch.encode("utf-8")
print(encoded_str_ch) # b'\xe9\xb2\x81\xe9\x82\xa6\xe4\xb8\x89\xe4\xb8\x96'

Now suppose I have a string:

s = '\xe9\xb2\x81\xe9\x82\xa6\xe4\xb8\x89\xe4\xb8\x96'
# same content as encoded_str_ch, but its type is str

How can I initialize a bytes variable using only the variable s, rather than writing out the encoded string '\xe9...\x96'?

I tried:

bytes(str_ch, encoding = "utf8")

But that is not correct; I still get the same result as with s.

Or is there no way to do this?

Mark Tolonen · Accepted Answer · 2019-05-02 15:34:48Z


So you have a Unicode string, but the code points are really UTF-8 bytes? That generally means the string was decoded with the wrong codec. The following translates code points back to bytes, since latin1 is the first 256 code points and maps 1:1 back to bytes:

b = s.encode('latin1')
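
As a minimal sketch of the round trip, assuming s is exactly the string from the question: encoding with latin1 recovers the original bytes, which then decode as UTF-8 to the intended text. For contrast, encoding s as UTF-8 turns each code point above U+007F into two bytes, so the result is twice as long. The names text and double_encoded are only illustrative.

```python
# The mis-decoded string from the question: each code point is really one UTF-8 byte.
s = '\xe9\xb2\x81\xe9\x82\xa6\xe4\xb8\x89\xe4\xb8\x96'

# latin1 maps code points U+0000..U+00FF to the byte with the same value.
b = s.encode('latin1')
print(b)  # b'\xe9\xb2\x81\xe9\x82\xa6\xe4\xb8\x89\xe4\xb8\x96'

# Decoding those bytes with the correct codec gives the intended text.
text = b.decode('utf-8')
print(text == "鲁邦三世")  # True

# Encoding s as UTF-8 instead re-encodes every code point above U+007F as two bytes.
double_encoded = s.encode('utf-8')
print(len(b), len(double_encoded))  # 12 24
```

Note that s.encode('latin1') raises UnicodeEncodeError if the string contains any code point above U+00FF, so this only works when every character really stands for a single byte.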
