4

A simple print function

def TODO(message):
    print(type(message))
    print(u'\n~*~ TODO ~*~ \n %s\n     ~*~\n' % message)

called like this

TODO(u'api servisleri için input check decorator gerekiyor')

results in this error

<type 'unicode'>                                                                                 
Traceback (most recent call last):                                                               
  File "/srv/www/proj/__init__.py", line 38, in <module>                                      
    TODO(u'api servisleri için input check decorator gerekiyor')                                 
  File "/srv/www/proj/helpers/utils.py", line 33, in TODO                                     
    print(u'\n~*~ TODO ~*~ \n %s\n     ~*~\n' % message)                                         
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 32: ordinal not in range(128)

But it works in the IPython console:

In [10]: TODO(u'api servisleri için input check decorator gerekiyor')
<type 'unicode'>

~*~ TODO ~*~ 
 api servisleri için input check decorator gerekiyor
     ~*~

This works with Python 2.7.12 but somehow fails with 2.7.9.

What am I doing wrong here?

Edit: the function fails when called in a Flask application, but works in the Python console.

12
  • I just tried your code on multiple versions of python, 2.6.6, 2.7.9, 2.7.10 and 2.7.13, from the command line, and your code worked fine. Commented Jan 10, 2017 at 5:44
  • It sounds like your console locale is broken and has defaulted to ASCII. What OS are you using? Commented Jan 10, 2017 at 7:29
  • I assume you're on a Un*x system. See: stackoverflow.com/a/35839964/1554386 Commented Jan 10, 2017 at 8:02
  • With no further information, I vote to close this as "not reproducible". This is likely to be a simple locale issue. Commented Jan 10, 2017 at 20:48
  • @AlastairMcCormack both are 'UTF-8' Commented Jan 11, 2017 at 10:49

3 Answers

0

Different terminals (and GUIs) allow different encodings. I don't have a recent IPython handy, but it is apparently able to handle the non-ASCII character 'ç' (code point 0xe7) in your string. Your normal console, however, is using the 'ascii' encoding (mentioned by name in the exception), which can't represent any code point above 0x7f.
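To see which encoding Python actually picks, it can help to dump the relevant settings from inside the failing process (for example the Flask application) and compare them with the console. The following is a minimal sketch, not something taken from the question; describe_output_encoding is a hypothetical helper name. One thing worth checking with it: on Python 2, sys.stdout.encoding is None when stdout is a pipe or a log file instead of a terminal, and printing unicode then falls back to 'ascii'. Whether that is what happens under Flask here is an assumption the thread does not confirm.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

import locale
import os
import sys


def describe_output_encoding(stream=None):
    """Return the settings that influence how unicode text is encoded on output.

    describe_output_encoding is a hypothetical helper, not part of any library.
    """
    if stream is None:
        stream = sys.stdout
    isatty = getattr(stream, 'isatty', None)
    return {
        # On Python 2 this is None when the stream is a pipe or a file.
        'stream_encoding': getattr(stream, 'encoding', None),
        'isatty': bool(isatty()) if callable(isatty) else False,
        'default_encoding': sys.getdefaultencoding(),
        'preferred_encoding': locale.getpreferredencoding(),
        'LANG': os.environ.get('LANG'),
        'LC_ALL': os.environ.get('LC_ALL'),
    }


# Print one "name = value" line per setting, in a stable order.
for name, value in sorted(describe_output_encoding().items()):
    print('%s = %r' % (name, value))
```

Running this once in the console and once inside the web application should show which of the values differs between the two environments.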

If you want to print non-ASCII strings to an ASCII console, you'll have to decide what to do with the characters it can't display. The str.encode method offers several options:

str.encode([encoding[, errors]])

errors may be given to set a different error handling scheme. The default for errors is 'strict', meaning that encoding errors raise a UnicodeError. Other possible values are 'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace' and any other name registered via codecs.register_error(), see section Codec Base Classes.

Here's an example that uses each of those four alternative error-handlers on your string (without the extra decoration added by TODO):

#!/usr/bin/env python2
# -*- coding: utf-8 -*-

from __future__ import print_function

uni = u'api servisleri için input check decorator gerekiyor'
handlers = ['ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace']
for handler in handlers:
    print(handler + ':')
    print(uni.encode('ascii', handler))
    print()

The output:

ignore:
api servisleri iin input check decorator gerekiyor

replace:
api servisleri i?in input check decorator gerekiyor

xmlcharrefreplace:
api servisleri i&#231;in input check decorator gerekiyor

backslashreplace:
api servisleri i\xe7in input check decorator gerekiyor

Which one of those outputs comes closest to what you want is for you to decide.

For more information, see the Python 2 "Unicode HOWTO", and Ned Batchelder's "Pragmatic Unicode, or, How Do I Stop the Pain?", also available as a 36 minute video from PyCon US 2012.

Edit: ...or, as you seem to have discovered, your terminal can display Unicode just fine, but your default encoding is nevertheless set to 'ascii', which is more restrictive than it needs to be.
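If the terminal really does handle UTF-8 and only Python's choice of 'ascii' is at fault, one common workaround is to wrap the byte stream in a UTF-8 writer so that unicode strings are encoded explicitly. This is a minimal sketch, assuming UTF-8 is what the receiving end expects; make_utf8_writer is a hypothetical name, and an in-memory io.BytesIO stands in for stdout so the result can be inspected.

```python
# -*- coding: utf-8 -*-
import codecs
import io


def make_utf8_writer(byte_stream, errors='strict'):
    """Wrap a byte stream so that unicode text written to it is encoded as UTF-8."""
    return codecs.getwriter('utf-8')(byte_stream, errors)


# io.BytesIO stands in for the real output stream in this sketch.
buf = io.BytesIO()
writer = make_utf8_writer(buf)
writer.write(u'api servisleri i\xe7in input check decorator gerekiyor\n')

# The single code point U+00E7 has been written as the two bytes 0xC3 0xA7.
encoded = buf.getvalue()
```

In a Python 2 program the same wrapper could be applied to the real stream, for example sys.stdout = make_utf8_writer(sys.stdout). That hard-codes UTF-8, so it only helps when the terminal or log consumer really expects UTF-8.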

-1

\xe7

This is the escape for the Unicode code point U+00E7, the lowercase 'ç' (in UTF-8 it is encoded as the two bytes 0xC3 0xA7). Python 2.7.9 is probably encoding the output with ASCII. You can run the code below in any version of Python to reproduce that behaviour.

# -*- coding: utf-8 -*-
import sys

def TODO(message):
    print(type(message))
    print(u'\n~*~ TODO ~*~ \n %s\n     ~*~\n' % message)

message = u'api servisleri için input check decorator gerekiyor'
encodedMessage = message.encode('ascii')

print(sys.stdout.encoding)
TODO(encodedMessage)

It will throw the exception

Traceback (most recent call last):
  File "test.py", line 9, in <module>
    encodedMessage = message.encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\xe7' in position 16: ordinal not in range(128)

So the issue is related to the encoding the interpreter uses for output. You can encode the string yourself, or choose an error handler such as 'ignore'.

Hope it will be useful

-1

Apparently, the print function is a bit different from the print statement.

https://docs.python.org/2.7/library/functions.html#print

All non-keyword arguments are converted to strings like 
str() does and written to the stream, separated by sep 
and followed by end. 

Simply encoding the unicode string solved it:

msg = u'\n~*~ TODO ~*~ \n %s\n     ~*~\n' % message
print(msg.encode("utf-8"))

Still, I'm not sure why it works with 2.7.12. Maybe a locale thing?
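Hard-coding "utf-8" only works when whatever receives the bytes expects UTF-8. A slightly more defensive variant, given here as a sketch and not as a tested fix, asks the stream for its encoding and falls back to UTF-8 only when none is reported. encode_for_stream and its fallback and errors parameters are hypothetical names, not an existing API, and FakeStream is a stand-in for sys.stdout.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def encode_for_stream(text, stream, fallback='utf-8', errors='replace'):
    """Encode unicode text with the stream's own encoding when it reports one.

    Hypothetical helper: falls back to `fallback` when stream.encoding is
    missing or None; characters the codec cannot represent are handled
    according to `errors` instead of raising UnicodeEncodeError.
    """
    encoding = getattr(stream, 'encoding', None) or fallback
    return text.encode(encoding, errors)


class FakeStream(object):
    """Stand-in for sys.stdout that only reports an encoding."""

    def __init__(self, encoding):
        self.encoding = encoding


msg = u'i\xe7in'
print(repr(encode_for_stream(msg, FakeStream(None))))     # UTF-8 fallback
print(repr(encode_for_stream(msg, FakeStream('ascii'))))  # 'ç' becomes '?'
```

On Python 2 the resulting byte string can be passed straight to print, for example print(encode_for_stream(msg, sys.stdout)).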

2 Comments

This isn't the answer. See @Alastair's comment and fix your environment. What you are doing is manually encoding to UTF-8, which wouldn't work on a non-UTF-8 terminal. Configure your terminal to report UTF-8 correctly to Python and print u'için' will work. Print function vs. statement is a red herring: Python 2 doesn't have a print function unless you use from __future__ import print_function. It becomes a function in Python 3.
Both environments have the same locale output. And as stated in the documentation I mentioned, this is the expected output of the print function. This is the output of the print statement on the machine where the print function failed:

╰─$ python
Python 2.7.9 (default, Jun 29 2016, 13:08:31)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print u'için'
için
