
My problem is that I can output Unicode characters to my terminal but not to files. Demonstration:

user@ubuntu:~$ python -c 'print u"\u5000"'
倀
user@ubuntu:~$ python -c 'print u"\u5000"' >a.out
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u5000' in position 0: ordinal not in range(128)

Output of "locale":

LANG=en_US.UTF-8
LANGUAGE=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

2 Answers

Because your terminal is set to use UTF-8, Python knows how to encode a Unicode character when writing directly to the terminal. When output is redirected to a file, however, stdout has no encoding associated with it, so Python 2 falls back to ASCII, which cannot represent U+5000. To write to the file, you need to encode the string to bytes explicitly:

python -c 'print u"\u5000".encode("UTF-8")' >a.out
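If you are writing the file from inside a script rather than through shell redirection, the same idea can be sketched by opening the file with an explicit encoding. This is only a minimal sketch: write_utf8, read_bytes, and demo_path are made-up names for illustration, and io.open is used because it accepts an encoding argument on both Python 2 and Python 3.

```python
import io
import os
import tempfile


def write_utf8(path, text):
    """Write a Unicode string to path, encoding it as UTF-8 explicitly."""
    # The encoding is stated here, so nothing falls back to ASCII.
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


def read_bytes(path):
    """Return the raw bytes stored in path, to inspect the encoding."""
    with io.open(path, "rb") as handle:
        return handle.read()


# Hypothetical demo file in a fresh temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "a.out")
write_utf8(demo_path, u"\u5000")
```

Reading the file back in binary mode with read_bytes(demo_path) should give the three UTF-8 bytes for U+5000.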

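The problem was actually with Python, not with the locale settings. One solution was to set the environment variable PYTHONIOENCODING=utf_8, which tells Python which encoding to use for stdin, stdout, and stderr.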
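To check the effect of PYTHONIOENCODING without touching your shell profile, something like the following may help. It is a rough sketch: run_with_io_encoding is a hypothetical helper, and it starts a child interpreter with stdout attached to a pipe, which behaves like the redirected case because stdout is not a terminal.

```python
import os
import subprocess
import sys


def run_with_io_encoding(code, encoding="utf-8"):
    """Run code in a child Python with PYTHONIOENCODING set; return its raw stdout."""
    # Copy the current environment and add the variable for the child only.
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    return subprocess.check_output([sys.executable, "-c", code], env=env)


# The child's stdout is a pipe, not a terminal, like the ">a.out" case.
output = run_with_io_encoding('print(u"\\u5000")')
```

With the variable set, the child should encode the character as UTF-8 instead of raising UnicodeEncodeError. The print call is written with parentheses so the same child code runs on Python 2 and Python 3.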
