3

I'm trying to run through the TextBlob tutorial in Windows (using Git Bash shell) with Python 3.3.

I've installed textblob and nltk as well as any dependencies.

The Python code is:

from text.blob import TextBlob

wiki = TextBlob("Python is a high-level, general-purpose programming language.")
tags = wiki.tags

I'm getting the following error

Traceback (most recent call last):
File "textblob.py", line 4, in <module> 
  tags = wiki.tags
File "c:\Python33\lib\site-packages\text\decorators.py", line 18, in __get__ 
  value = obj.__dict__[self.func.__name__] = self.func(obj)
File "c:\Python33\lib\site-packages\text\blob.py", line 357, in pos_tags 
  for word, t in self.pos_tagger.tag(self.raw)
File "c:\Python33\lib\site-packages\text\taggers.py", line 40, in tag
  return pattern_tag(sentence, tokenize)
File "c:\Python33\lib\site-packages\text\en.py", line 115, in tag
  for sentence in parse(s, tokenize, True, False, False, False, encoding).split():
File "c:\Python33\lib\site-packages\text\en.py", line 99, in parse
  return parser.parse(unicode(s), *args, **kwargs)
File "c:\Python33\lib\site-packages\text\text.py", line 1213, in parse
  s[i] = self.find_tags(s[i], **kwargs)
File "c:\Python33\lib\site-packages\text\en.py", line 49, in find_tags
  return _Parser.find_tags(self, tokens, **kwargs)
File "c:\Python33\lib\site-packages\text\text.py", line 1161, in find_tags
  map = kwargs.get(     "map", None))
File "c:\Python33\lib\site-packages\text\text.py", line 967, in find_tags
  tagged.append([token, lexicon.get(token, i==0 and lexicon.get(token.lower()) or   None)])
File "c:\Python33\lib\site-packages\text\text.py", line 98, in get
  return self._lazy("get", *args)
File "c:\Python33\lib\site-packages\text\text.py", line 79, in _lazy
  self.load()
File "c:\Python33\lib\site-packages\text\text.py", line 367, in load
  dict.update(self, (x.split(" ")[:2] for x in _read(self._path) if x.strip()))
File "c:\Python33\lib\site-packages\text\text.py", line 367, in <genexpr>
  dict.update(self, (x.split(" ")[:2] for x in _read(self._path) if x.strip()))
File "c:\Python33\lib\site-packages\text\text.py", line 346, in _read
  for line in f:
File "c:\Python33\lib\encodings\cp1252.py", line 23, in decode
  return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 16: character maps to <undefined>

Any idea what is wrong here? Adding a 'u' before the string didn't help.

10
  • I ran through the tutorial quickly and it worked OK on my OS X machine using Python 3.3. Do you perhaps have an old version of TextBlob? It looks like a similar problem was just fixed and released: github.com/sloria/TextBlob/issues/15 Commented Sep 24, 2013 at 17:25
  • No luck. I'm using 0.6.3 which I believe is the latest. I did a pip --force-reinstall and did notice a libyaml error when installing pyyaml. The install did continue though so I'm not sure this is a serious issue. Commented Sep 24, 2013 at 17:49
  • In continuing to mess around with this, I ran through a brief tutorial from the front page of the nltk site and hit a very similar error. Cloning from the master repo on github solved the problem though. Maybe I need to try something similar with textblob. Commented Sep 24, 2013 at 18:24
  • do you get an error when you try: sentence = "Python is a high-level, general-purpose programming language.".encode('utf8') Commented Sep 25, 2013 at 10:51
  • Adding that line does not cause an error on that line. Commented Sep 25, 2013 at 18:39

1 Answer 1

3

Release 0.7.1 fixes this issue, which means it's time for a

$ pip install -U textblob

The problem was that the en-lexicon.txt file used for part-of-speech tagging opened the file using Windows' default platform encoding, cp1252. The file apparently had characters that Python could not decode from this encoding. This was fixed by explicitly opening the file in utf-8 mode.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.