2

I'm trying to print: Pokémon GO Việt Nam

print u"Pokémon GO Việt Nam"

and I'm getting:

print u"PokÚmon GO Vi?t Nam"
SyntaxError: (unicode error) 'utf8' codec can't decode byte 0xe9 in position 0: unexpected end of data

I've tried:

.encode("utf-8")
.decode("utf-8")
.decode('latin-1').encode("utf-8")
unicode(str.decode("iso-8859-4"))

My Python version is 2.7.9, and the file is edited in Notepad++ with UTF-8 encoding. None of these attempts worked. How can I print it? I run into this kind of issue all the time, so what is the proper way to debug it and get the right encoding?

4
  • What version of python are you using? I printed this using python 3.5 and it worked fine. Commented Aug 31, 2016 at 21:30
  • Are you typing it or are you getting it from a different source? Copying and pasting from SO yields correct results on both 2.7 and 3.5, on my OS. Commented Aug 31, 2016 at 21:32
  • Using Python 3+ works with print as a function Commented Aug 31, 2016 at 21:32
  • My Python version is 2.7.9. Commented Aug 31, 2016 at 21:40

3 Answers

4
#!/usr/bin/python
# -*- coding: utf-8 -*-

print "Pokémon GO Việt Nam"

You can find more info here.

For PyCharm, go to the menu PyCharm --> Preferences and search for "encoding" to reach the file encoding settings.


13 Comments

Yeah, I tried that too. It works if we just print u"pokémon", but not "Pokémon GO Việt Nam".
@BrendaMartinez: what encoding is your editor using?
@BrendaMartinez make sure that both your IDE encoding as well as the project encoding are set to 'utf-8'
UTF-8, Notepad++
@BrendaMartinez well, Notepad++ is another problem :) Use a proper IDE such as PyCharm: you'll get the benefits of an IDE, such as debugging and inspection capabilities, as well as many other goodies.
1

Specify the encoding

#!/usr/bin/python
# -*- coding: utf-8 -*-

at the top of the program.
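The coding declaration only helps if it matches how the file was really saved. The byte 0xe9 in the error message is what "é" looks like in Latin-1 or cp1252, so one possible explanation is that the file on disk is not UTF-8 despite the editor setting. A minimal sketch for checking this, assuming you can read the script's raw bytes; `first_invalid_utf8_offset` and `check_file` are hypothetical helper names, not library functions:

```python
import io


def first_invalid_utf8_offset(data):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def check_file(path):
    # Read raw bytes so that no decoding happens before the check.
    with io.open(path, "rb") as handle:
        return first_invalid_utf8_offset(handle.read())


# The same word as it would be stored in two different encodings.
# Escapes are used so this snippet does not depend on its own file encoding.
as_utf8 = u"Pok\xe9mon".encode("utf-8")      # b'Pok\xc3\xa9mon'
as_latin1 = u"Pok\xe9mon".encode("latin-1")  # b'Pok\xe9mon'

print(first_invalid_utf8_offset(as_utf8))    # None
print(first_invalid_utf8_offset(as_latin1))  # 3
```

If `check_file` returns an offset for your script, re-saving the file as UTF-8 (or changing the declaration to the encoding actually used) is the thing to try first.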

1 Comment

Yeah, I tried that too. It works if we just print u"pokémon", but not "Pokémon GO Việt Nam".
0

As an alternative you can encode the unicode string:

print u"Pokémon GO Việt Nam".encode('utf-8')

The advantage is that the bytes in the resulting string do not depend on the encoding of the source file, as long as the file's coding declaration matches how it was actually saved: u"ệ".encode('utf-8') is always the same 3 bytes, "\xe1\xbb\x87".

It is also consistent with what you'd do if you have a unicode string in a variable.

# get text from somewhere...
text = u"Pokémon GO Việt Nam"

# assuming your terminal expects UTF-8 -- this won't work on Windows.
print text.encode('utf-8')
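When the terminal does not expect UTF-8, one option is to encode for whatever encoding the output stream reports and replace characters it cannot represent. This is only a sketch: `encode_for_stream` and `FakeConsole` are hypothetical names, and cp437 is just an assumed example of a console code page.

```python
import sys


def encode_for_stream(text, stream=None, fallback="utf-8"):
    """Encode unicode text for a stream, replacing unrepresentable characters.

    Hypothetical helper, not part of the standard library.
    """
    if stream is None:
        stream = sys.stdout
    # On Python 2, stream.encoding is None when output is piped or redirected.
    encoding = getattr(stream, "encoding", None) or fallback
    return text.encode(encoding, "replace")


class FakeConsole(object):
    """Stand-in for a console that only understands cp437 (an assumption)."""
    encoding = "cp437"


# Escapes keep this literal independent of how the source file is saved.
text = u"Pok\xe9mon GO Vi\u1ec7t Nam"

print(repr(text.encode("utf-8")))
print(repr(encode_for_stream(text, FakeConsole())))
```

On Python 2.7 you would then write `print encode_for_stream(text)`. Characters the console cannot show come out as "?", which is lossy but avoids a UnicodeEncodeError.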
