Python version: 2.7

Windows version: Windows 7 64-bit

Language of the system: Russian

I have a problem for which I could not find a working solution online.

Here is my code:

 import textblob

 text = "I love people"

 # TextBlob has to be reached through the imported module name
 blob = textblob.TextBlob(text)
 print blob.sentiment

I get the following error, which is raised while nltk is being imported:

Traceback (most recent call last):
  File "C:\Users\Александр\Desktop\TextBlob.py", line 1, in <module>
    import textblob
  File "C:\Python27\lib\site-packages\textblob\__init__.py", line 9, in <module>
    from .blob import TextBlob, Word, Sentence, Blobber, WordList
  File "C:\Python27\lib\site-packages\textblob\blob.py", line 28, in <module>
    import nltk
  File "C:\Python27\lib\site-packages\nltk\__init__.py", line 128, in <module>
    from nltk.chunk import *
  File "C:\Python27\lib\site-packages\nltk\chunk\__init__.py", line 155, in <module>
    from nltk.data import load
  File "C:\Python27\lib\site-packages\nltk\data.py", line 77, in <module>
    if 'APPENGINE_RUNTIME' not in os.environ and os.path.expanduser('~/') != '~/':
  File "C:\Python27\lib\ntpath.py", line 311, in expanduser
    return userhome + path[i:]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc0 in position 9: ordinal not in range(128)

As far as I understand from search results and Stack Overflow answers, the problem comes from how ntpath.py handles non-ASCII characters in paths.
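
The byte and position in the error message are consistent with this reading. The sketch below assumes the home directory reaches Python 2 as bytes in the Russian Windows ANSI code page (cp1251); the `try_decode` helper is hypothetical and exists only for illustration. It shows that decoding such a path as ASCII fails at byte 0xc0 in position 9, which is the first letter of the user name, while cp1251 decodes it cleanly.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Assumed reconstruction of the home directory as raw bytes,
# encoded in cp1251 (the usual ANSI code page on Russian Windows).
userhome_bytes = u"C:\\Users\\Александр".encode("cp1251")


def try_decode(raw, encoding):
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return str(exc)


ascii_result = try_decode(userhome_bytes, "ascii")
cp1251_result = try_decode(userhome_bytes, "cp1251")

print(ascii_result)         # the same message as in the traceback
print(repr(cp1251_result))  # the path as a unicode string
```

In the traceback the same decode happens implicitly, when ntpath.py concatenates the byte-string home directory with a unicode path.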

I tried the following fixes, and none of them worked properly:

  1. Using sys.setdefaultencoding('utf8') (from the question How to fix: "UnicodeDecodeError: 'ascii' codec can't decode byte").

  2. Using sys.setdefaultencoding('Cp1252'). It eliminated the error, but the output of my program disappeared too.

  3. Using import io (from the question Python (nltk) - UnicodeDecodeError: 'ascii' codec can't decode byte).

  4. Using unicode().decode() in ntpath.py (I do not remember where I found this solution).

UPD: I have found a solution.

I inserted these lines into ntpath.py:

reload(sys)
sys.setdefaultencoding('Cp1252')

So the beginning of that file now looks like this:

import os
import sys
import stat
import genericpath
import warnings

#another way
reload(sys)
sys.setdefaultencoding('Cp1252')

It works perfectly. If your system is set to another language, experiment with the encoding name and replace Cp1252 with the one that matches your settings.

  • This has nothing to do with NLTK, I think. The problem is that your path contains non-ASCII characters, which isn't handled properly. If you are new to Python, why aren't you working with Python 3? You will have much less trouble of this kind. Commented Nov 1, 2016 at 12:39
  • @lenz, I tried working with version 3.5, but I had a lot of trouble compiling to exe files. 2.7 works pretty well for that. Can I somehow change my system settings to avoid this problem? Commented Nov 1, 2016 at 14:30
  • Yes you can: Your username is "Александр", so userhome is probably r"C:\Users\Александр". Create a new user named Alexander (or Aleksandr, or Donald), so that folder paths only contain ascii characters. Commented Nov 1, 2016 at 14:39
  • @alexis, thanks a lot, but I found a better solution; check my UPD above. Commented Nov 1, 2016 at 14:50
  • You should post your solution as an answer, not as an edit to the question. Commented Nov 1, 2016 at 15:21
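
A lighter variant of the "ASCII-only home folder" suggestion from the comments can be sketched without creating a new Windows user. It rests on an assumption: Python 2.7's ntpath.expanduser consults the HOME environment variable before USERPROFILE, so overriding HOME for the current process should keep the non-ASCII path out of the concatenation. The helper name `use_ascii_home` and the directory `C:\nltk_home` are hypothetical.

```python
import os


def use_ascii_home(ascii_dir):
    """Point HOME at an ASCII-only directory for the current process.

    Assumption: ntpath.expanduser in Python 2.7 checks HOME first,
    so later calls build paths from this value.
    """
    ascii_dir.encode("ascii")  # raises if the path is not pure ASCII
    os.environ["HOME"] = ascii_dir
    return os.environ["HOME"]


# Hypothetical directory; run this before `import textblob`.
home = use_ascii_home("C:\\nltk_home")
```

This changes only the environment of the running script. Note that nltk may then look for its data directory under the new location rather than under the real home folder.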

1 Answer

I have found a solution.

I inserted these lines into ntpath.py:

reload(sys)
sys.setdefaultencoding('Cp1252')

So the beginning of that file now looks like this:

import os
import sys
import stat
import genericpath
import warnings

#another way
reload(sys)
sys.setdefaultencoding('Cp1252')

It works perfectly. If your system is set to another language, experiment with the encoding name and replace Cp1252 with the one that matches your settings.
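
Rather than guessing, one way to find a candidate encoding name is to ask Python which encoding the system prefers. This is a minimal sketch: the helper name `system_code_page` is hypothetical, and it assumes the preferred locale encoding is the one your paths are stored in (on a Russian Windows installation it is typically cp1251).

```python
import codecs
import locale


def system_code_page():
    """Return the canonical codec name for the system's preferred encoding."""
    name = locale.getpreferredencoding()
    # codecs.lookup raises LookupError if Python has no codec by that name
    return codecs.lookup(name).name


print(system_code_page())
```

The printed name is what you would try in place of Cp1252.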
