
In Python 3, stdin and stdout are TextIOWrappers that have an encoding and hence spit out normal strings (not bytes).

I can change the encoding that is being used with an environment variable PYTHONIOENCODING. Is there also a way to change this in my script itself?

3 Answers

6

Actually, TextIOWrapper does deal in bytes underneath: it takes a Unicode string and writes it to the underlying binary stream as a byte string in a particular encoding. To change sys.stdout to use a particular encoding in a script, here's an example:

Python 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print('\u5000')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\dev\python32\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u5000' in position 0: character maps to <undefined>
>>> import io
>>> import sys
>>> sys.stdout = io.TextIOWrapper(sys.stdout.buffer,encoding='utf8')
>>> print('\u5000')
倀

(my terminal isn't UTF-8)

sys.stdout.buffer accesses the raw byte stream. You can also use the following to write to stdout in a particular encoding:

sys.stdout.buffer.write('\u5000'.encode('utf8'))
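
To see which bytes the wrapper actually produces, here is a minimal sketch that swaps sys.stdout.buffer for an in-memory io.BytesIO. The helper name encode_via_wrapper is made up for illustration and is not part of any library:

```python
import io


def encode_via_wrapper(text, encoding):
    """Write text through a TextIOWrapper and return the resulting bytes."""
    # io.BytesIO stands in for sys.stdout.buffer so the output can be inspected.
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding=encoding)
    wrapper.write(text)
    wrapper.flush()   # push the encoded bytes down to the binary stream
    data = raw.getvalue()
    wrapper.detach()  # release the buffer without closing it
    return data
```

Calling encode_via_wrapper('\u5000', 'utf-8') should give the same bytes as '\u5000'.encode('utf-8'), while an encoding such as cp437 raises UnicodeEncodeError, as in the traceback above.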

2

Since Python 3.7 TextIOWrapper has a reconfigure() method that can change stream settings, including the encoding:

sys.stdout.reconfigure(encoding='utf-8')

One caveat: You can only change the encoding of sys.stdin if you haven't started reading from it.
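If the script may also run on older interpreters, or with sys.stdout replaced by an object that is not a TextIOWrapper, a guarded call is one option. This is only a sketch; set_stream_encoding is a hypothetical helper name, not a standard function:

```python
import io
import sys


def set_stream_encoding(stream, encoding):
    """Switch a text stream's encoding if it supports reconfigure().

    Returns True when the encoding was changed, False when the stream
    has no reconfigure() method (Python < 3.7 or a replaced stream).
    """
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        return False
    reconfigure(encoding=encoding)
    return True
```

Typical use would be set_stream_encoding(sys.stdout, 'utf-8') near the top of the script, before anything is printed.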


0

I'm pretty sure this is not possible. The documentation for PYTHONIOENCODING explicitly says: "If this is set before running the interpreter, it overrides the encoding used for stdin/stdout/stderr."

Also, I got an error when trying to change sys.__stdin__.encoding:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: readonly attribute
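
The error can be reproduced without touching the real stdin. The sketch below uses an in-memory TextIOWrapper as a stand-in, and can_assign_encoding is a made-up helper name:

```python
import io


def can_assign_encoding(stream, encoding):
    """Return True if stream.encoding can be assigned directly."""
    try:
        stream.encoding = encoding
    except AttributeError:
        # TextIOWrapper exposes encoding as a read-only attribute.
        return False
    return True


# A wrapper over an in-memory buffer behaves like sys.stdin in this respect.
demo_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
```

With this, can_assign_encoding(demo_stream, 'utf-8') reports that the assignment is rejected and the stream keeps its original encoding.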

EDIT: In Python 2.x it was possible to change the encoding of stdin/stdout/stderr from within the script. In Python 3.x it seems you have to use locale (or set the environment variable on the command line before running your script).

EDIT: This might be interesting for you to read: http://comments.gmane.org/gmane.comp.python.ideas/15313
