I want to stem words, so I import PorterStemmer from NLTK, but an error occurs at run time.
The error is:
TypeError: coercing to Unicode: need string or buffer, file found
My Python code is:
import nltk
from nltk.stem import PorterStemmer
stemmer=PorterStemmer()
file = open('C:/Python26/test.txt','r')
f=open("root.txt",'w')
with open(file,'r',-1) as rf:
    lines = rf.readlines()
    for word in lines:
        root = stemmer.stem(word)
        f.write(root+"\n")
f.close()
Yes, I tried it and got an error that I couldn't understand. The output was:
1.6.2
Traceback (most recent call last):
  File "C:\Python26\check.py", line 10, in <module>
    with open(file,'r',-1) as rf:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf8 in position 6: ordinal not in range(128)
My code after your recommended change is:
import nltk
import numpy
import numpy as np
from StringIO import StringIO
print numpy.__version__
from nltk.stem import PorterStemmer
stemmer=PorterStemmer()
file = np.genfromtxt('C:/Python26/test.txt', delimiter=" ")
f=open("root.txt",'w')
with open(file,'r',-1) as rf:
    lines = rf.readlines()
    for word in lines:
        root = stemmer.stem(word)
        f.write(root+"\n")
f.close()
and my dummy file looks like this:
walking
talked
oranges
books
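For reference, the TypeError above is what Python 2 raises when open() is handed an already-open file object instead of a path string, which is what happens when `file` is passed to the second open() call. A minimal sketch of one way to restructure the script follows. The helper name `stem_file`, its parameters, and the "utf-8" default are assumptions made for illustration, not part of NLTK; the stemmer is passed in as a callable so the file handling can be tried on its own.

```python
import io


def stem_file(src_path, dst_path, stem, encoding="utf-8"):
    """Stem one word per line from src_path and write the stems to dst_path.

    `stem` is any callable that maps a word to its stem, for example
    nltk.stem.PorterStemmer().stem. The encoding default is an assumption;
    use whatever encoding the input file was actually saved in.
    Returns the number of words written.
    """
    count = 0
    # open() gets the path itself, not an already-open file object.
    with io.open(src_path, "r", encoding=encoding) as src:
        with io.open(dst_path, "w", encoding=encoding) as dst:
            for line in src:
                word = line.strip()  # drop the trailing newline
                if not word:
                    continue  # skip blank lines
                dst.write(stem(word) + u"\n")
                count += 1
    return count


# Hypothetical usage, assuming NLTK is installed:
#   from nltk.stem import PorterStemmer
#   stem_file('C:/Python26/test.txt', 'root.txt', PorterStemmer().stem)
```

Stripping each line before stemming matters here, because readlines() keeps the trailing "\n" on every word and the stemmer would otherwise see "walking\n" rather than "walking".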