AttributeError: 'str' object has no attribute 'words'

Question

I'm using Python 3.4. I want to get the frequency of words from a CSV file, but it shows an error. Here is my code. Can anyone help me solve this problem?

from textblob import TextBlob as tb
import math

words={}
def tfidf(word, blob, bloblist):
    return tf(word, blob) * idf(word, bloblist)

def tf(word, blob):
    return blob.words.count(word) / len(blob.words)

def n_containing(word, bloblist):
    return sum(1 for blob in bloblist if word in blob)

def idf(word, bloblist):
    return math.log(len(bloblist) / (1 + n_containing(words, bloblist)))

bloblist = open('afterstopwords.csv', 'r').read()

for i, blob in enumerate(bloblist):
     print("Top words in document {}".format(i + 1))
     scores = {word: tfidf(word, blob, bloblist) for word in blob.words}
     sorted_words = sorted(scores.items(), key=lambda x: x[1], reverse=True)
     for word, score in sorted_words[:3]:
         print("\tWord: {}, TF-IDF: {}".format(word, round(score, 5)))

And the error is:

 Top words in document 1
 Traceback (most recent call last):
 File "D:\Python34\tfidf.py", line 45, in <module>
    scores = {word: tfidf(word, blob, bloblist) for word in blob.words}
 AttributeError: 'str' object has no attribute 'words'

The error message is pretty clear: blob is a string, and a string does not have a words attribute, so you can't do blob.words.
– Julien Spronck, Commented May 14, 2015 at 6:08
I don't know... there seem to be many problems with that code. Why are you importing TextBlob when you don't use it anywhere? Did you mean to use it but forget?
– Julien Spronck, Commented May 15, 2015 at 6:55
I know the error is the same. I was just telling you that I have no clue what you are trying to do and therefore cannot tell you how to do it.
– Julien Spronck, Commented May 15, 2015 at 8:58
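
To make the comment above concrete, here is a minimal sketch of what the loop in the question actually iterates over. The sample text is an assumption standing in for the contents of afterstopwords.csv. Because read() returns a single str, enumerate(bloblist) hands back one character at a time, and a plain str has no words attribute.

```python
# Assumption: a tiny stand-in for the contents of afterstopwords.csv.
text = "python code\npython error\n"

# read() returns one str, so iterating over it yields single characters.
first_items = [item for _, item in zip(range(3), text)]
print(first_items)             # ['p', 'y', 't']
print(hasattr(text, "words"))  # False: a plain str has no .words

# Splitting into lines gives one string per row; each of those could then
# be wrapped as tb(line) to obtain an object that does have .words.
rows = text.splitlines()
print(rows)                    # ['python code', 'python error']
```

So bloblist needs to be a list of TextBlob objects (one per document), not the raw file contents.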

JNault · Accepted Answer · 2016-07-20 22:12:43Z

In the tutorial at http://stevenloria.com/finding-important-words-in-a-document-using-tf-idf/, bloblist is defined as:

bloblist = [document1, document2, document3]

Don't change it: bloblist has to stay a list of documents. Before that line, the tutorial creates each document as a TextBlob, like this:

document1 = tb("""blablabla""")

Here's what I did. I use my own function for opening files in Python, openfile, which holds the file details:

txt = openfile()
document1 = tb(txt)
bloblist = [document1]

The rest of the original code is unchanged. This works, but I have only been able to get it to finish on small files; it takes much too long for larger ones, and the results don't look accurate at all. For word counts I use https://rmtheis.wordpress.com/2012/09/26/count-word-frequency-with-python/ instead. It ran very quickly on 9999 rows of 50-75 characters each, and it seems accurate too: the results look equivalent to wordcloud results.
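
One likely reason the scores look off with a single document: with the formula from the question, log(len(bloblist) / (1 + n_containing(word, bloblist))), a one-element bloblist gives log(1/2) for every word, so every score is negative and the ranking is inverted. The question's idf also passes the global words dict to n_containing instead of word, which looks like a typo. As for speed, blob.words.count(word) rescans the whole document for every word. Below is a rough sketch using only the standard library rather than TextBlob, under the assumption that each CSV row is one document and that splitting on whitespace is good enough. The function names load_documents and tfidf_scores and the sample data are made up for illustration.

```python
import csv
import io
import math
from collections import Counter


def load_documents(csv_file):
    """Return one list of lowercase words per CSV row (assumed: one document per row)."""
    documents = []
    for row in csv.reader(csv_file):
        words = " ".join(row).lower().split()
        if words:  # skip empty rows
            documents.append(words)
    return documents


def tfidf_scores(documents):
    """Return a {word: score} dict per document, using the question's tf and idf formulas."""
    # Document frequency: how many documents contain each word, computed once.
    doc_freq = Counter(word for doc in documents for word in set(doc))
    n_docs = len(documents)
    all_scores = []
    for doc in documents:
        counts = Counter(doc)  # one pass instead of words.count() per word
        scores = {
            word: (count / len(doc)) * math.log(n_docs / (1 + doc_freq[word]))
            for word, count in counts.items()
        }
        all_scores.append(scores)
    return all_scores


# Hypothetical sample data standing in for afterstopwords.csv.
sample = io.StringIO(
    "python error string\npython list blob\npython csv file\ncsv words count\n"
)
documents = load_documents(sample)
for i, scores in enumerate(tfidf_scores(documents), start=1):
    top = sorted(scores.items(), key=lambda x: x[1], reverse=True)[:3]
    print("Top words in document {}".format(i))
    for word, score in top:
        print("\tWord: {}, TF-IDF: {}".format(word, round(score, 5)))
```

To run it on a real file, something like `with open('afterstopwords.csv', newline='') as f: documents = load_documents(f)` should work in place of the StringIO sample. If TextBlob's tokenization is wanted, each row string can be wrapped as tb(row_text) and its words used instead of the whitespace split.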
