Convert from string containing hexadecimal characters to bytes in Python 3

Question

I have a string that contains printable and unprintable characters, for instance:

'\xe8\x00\x00\x00\x00\x60\xfc\xe8\x89\x00\x00\x00\x60\x89'

What's the most "pythonesque" way to convert this to a bytes object in Python 3, i.e.:

b'\xe8\x00\x00\x00\x00`\xfc\xe8\x89\x00\x00\x00`\x89'

Martijn Pieters · Accepted Answer · 2014-02-24 22:14:16Z


If all your codepoints are within the range U+0000 to U+00FF, you can encode to Latin-1:

inputstring.encode('latin1')

as the first 256 codepoints of Unicode map one-to-one to bytes in the Latin-1 standard.

This is by far the fastest method, but it raises a UnicodeEncodeError if the input string contains any character outside that range.
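To illustrate that limitation, here is a minimal sketch. The helper name `to_bytes` and its error message are hypothetical, not part of any library; it only wraps `str.encode('latin1')` and reports the first offending character.

```python
def to_bytes(text: str) -> bytes:
    """Hypothetical helper: map code points U+0000..U+00FF to single bytes."""
    try:
        return text.encode('latin1')
    except UnicodeEncodeError as exc:
        # exc.start is the index of the first character that could not be encoded
        bad = text[exc.start]
        raise ValueError(
            f"character {bad!r} (U+{ord(bad):04X}) is outside U+0000..U+00FF"
        ) from exc


print(to_bytes('\xe8\x00\x60'))   # every code point fits in one byte

try:
    to_bytes('\xe8\u20ac')        # U+20AC (euro sign) is above U+00FF
except ValueError as err:
    print(err)
```

If silently dropping or replacing such characters is acceptable, `str.encode` also accepts an `errors` argument (for example `'ignore'` or `'replace'`), but that no longer gives back the original bytes.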

In short, if you have a Unicode string whose codepoints are really byte values that should never have been decoded, encode it to Latin-1 to get the original bytes back.

Demo:

>>> '\xe8\x00\x00\x00\x00\x60\xfc\xe8\x89\x00\x00\x00\x60\x89'.encode('latin1')
b'\xe8\x00\x00\x00\x00`\xfc\xe8\x89\x00\x00\x00`\x89'
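The round trip can also be checked directly. The following is a small sketch, with variable names chosen only for illustration, showing that decoding bytes as Latin-1 and encoding the result again is lossless for every possible byte value:

```python
# Bytes that were (mistakenly) decoded into a str via Latin-1
original = bytes([0xe8, 0x00, 0x60, 0xfc, 0x89])
text = original.decode('latin1')

# Each character's code point equals the byte it came from
assert [ord(ch) for ch in text] == list(original)

# Encoding with the same codec recovers the bytes exactly
restored = text.encode('latin1')
assert restored == original

# The mapping holds for all 256 byte values
all_bytes = bytes(range(256))
assert all_bytes.decode('latin1').encode('latin1') == all_bytes

print(restored)
```

This only recovers the data if the string was produced by a Latin-1 style one-to-one mapping in the first place; text decoded with some other codec needs that codec to reverse it.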
