6

In Python, unicode strings have an encode method that converts Unicode to a byte string, and byte strings have a decode method that does the reverse.

But I'm confused: what is the encode method on byte strings for?

2
  • Take a look at this presentation 'Unicode in Python, Completely Demystified' farmdev.com/talks/unicode Commented Mar 3, 2011 at 6:47
  • I've seen that. It doesn't explain my question. Commented Mar 3, 2011 at 12:42

2 Answers

10

It's useful for non-text codecs.

>>> 'Hello, world!'.encode('hex')
'48656c6c6f2c20776f726c6421'
>>> 'Hello, world!'.encode('base64')
'SGVsbG8sIHdvcmxkIQ==\n'
>>> 'Hello, world!'.encode('zlib')
'x\x9c\xf3H\xcd\xc9\xc9\xd7Q(\xcf/\xcaIQ\x04\x00 ^\x04\x8a'
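The session above is Python 2. As a rough sketch of the same idea in Python 3, assuming the standard `codecs` module and its bytes-to-bytes codecs `hex_codec`, `base64_codec` and `zlib_codec` (the variable names are only illustrative), the transforms are reached through `codecs.encode` rather than a method on the string:

```python
import codecs

data = b"Hello, world!"

# Bytes-to-bytes codecs: no character decoding is involved at any point.
hex_text = codecs.encode(data, "hex_codec")
b64_text = codecs.encode(data, "base64_codec")
packed = codecs.encode(data, "zlib_codec")

print(hex_text)   # b'48656c6c6f2c20776f726c6421'
print(b64_text)   # b'SGVsbG8sIHdvcmxkIQ==\n'

# Each transform is reversible with codecs.decode and the same codec name.
print(codecs.decode(packed, "zlib_codec") == data)
```

Because these codecs map bytes to bytes, the input never has to be valid text in any character encoding.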

2 Comments

Wow, it even works if the encoded string is incompatible with the default encoding! That must mean it doesn't always decode the string to unicode first...
OK, so it looks like it only decodes to unicode first if we encode to one of the character encodings. Strange.
5

For character encodings, it first decodes the byte string to Unicode using the default encoding, then encodes the result back to a byte string.

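>>> import sys
>>> sys.getdefaultencoding()
'ascii'
>>> reload(sys)  # site.py deletes setdefaultencoding at startup; reload restores it
<module 'sys' (built-in)>
>>> sys.setdefaultencoding('latin-1')
>>> '\xc4'.encode('utf-8')
'\xc3\x84'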

Here, '\xc4' is Latin-1 for Ä and '\xc3\x84' is UTF-8 for Ä.
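To make the hidden step visible, here is a minimal sketch that spells out the decode-then-encode round trip explicitly. It is written for Python 3, where bytes have no encode method, and `py2_style_encode` is a hypothetical helper invented for illustration, not part of any library:

```python
def py2_style_encode(raw, target, default="ascii"):
    """Hypothetical helper: mimic Python 2's str.encode for character codecs."""
    text = raw.decode(default)   # the implicit step: bytes -> Unicode
    return text.encode(target)   # the explicit step: Unicode -> bytes


# With Latin-1 as the default, the byte 0xC4 is read as the character Ä.
print(py2_style_encode(b"\xc4", "utf-8", default="latin-1"))  # b'\xc3\x84'

# With the usual 'ascii' default, the implicit decode is what fails.
try:
    py2_style_encode(b"\xc4", "utf-8")
except UnicodeDecodeError as exc:
    print("implicit decode failed:", exc.reason)
```

This also suggests why calling encode on a non-ASCII byte string in Python 2 raises a UnicodeDecodeError rather than an encode error: the failure happens in the decode half.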
