I have the code below:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.utils import shuffle
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from nltk.tokenize import sent_tokenize, word_tokenize
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score
import sklearn.metrics as metrics
import pickle

#%matplotlib inline
import warnings
warnings.filterwarnings('ignore')

stemmer = PorterStemmer()
words = stopwords.words("english")

from sklearn.feature_extraction.text import TfidfVectorizer
vectorizer_tfidf = TfidfVectorizer(stop_words='english', max_df=0.7)

# call and load pickle here
content = pickle.load(open("vectorizer.pk",'rb'))

vectorizer_tfidf = [vectorizer_tfidf]
test_tfIdf = vectorizer_tfidf.transform('processedtext')
test_tfIdf2 = vectorizer_tfidf.transform('processedtext2')

testdata = pd.read_csv('C:\\Users\\joyce\\Desktop\\CR_Summary 08052020.csv', delimiter = ',')
content = pickle.load(open("Pickle_RL_Model.pkl",'rb'))
 
##print (content)    
testdata=testdata.fillna(value='test')

#Array to return prediction
content.predict(testdata)

Error message:

  File "C:/Users/joyce/nltk CR data v3.py", line 42, in <module>
    test_tfIdf = vectorizer_tfidf.transform('processedtext')
AttributeError: 'list' object has no attribute 'transform'

How do I correct this error?

The line vectorizer_tfidf = [vectorizer_tfidf] (line 40) rebinds vectorizer_tfidf to a list. Get rid of it. Commented Aug 26, 2020 at 17:37

2 Answers

The error means that transform() cannot be called on a list.
In your case vectorizer_tfidf is a list, which is why the error is raised.

vectorizer_tfidf = [vectorizer_tfidf]

This line creates the list.
Try removing it.
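
To make the cause concrete, here is a minimal sketch that reproduces the AttributeError and shows the call working once the list wrapper is gone. The sample documents are made up for illustration, and the vectorizer is fitted in place only so that transform() has a vocabulary to work with.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample documents, not taken from the question's CSV.
docs = [
    "the server crashed after the update",
    "user cannot log in to the portal",
]

vectorizer_tfidf = TfidfVectorizer(stop_words="english")
vectorizer_tfidf.fit(docs)

# Wrapping the vectorizer in a list gives a plain Python list,
# and lists have no transform() method.
wrapped = [vectorizer_tfidf]
try:
    wrapped.transform(docs)
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# Calling transform() on the vectorizer itself works.
matrix = vectorizer_tfidf.transform(docs)
print(matrix.shape)
```

If a list is ever needed for some other reason, wrapped[0].transform(docs) would still reach the vectorizer, but here the wrapper serves no purpose.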

6 Comments

Error: File "C:\Users\joyce\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py", line 1103, in transform "Iterable over raw text documents expected, " ValueError: Iterable over raw text documents expected, string object received.
The string is "processedtext", which the function is not expecting. What are processedtext and processedtext2?
dataset['processedtext'] = dataset['SUMMARY'].apply(lambda x: " ".join([stemmer.stem(i) for i in re.sub("[^a-zA-Z]", " ", str(x)).split() if i not in words]).lower())
dataset['processedtext2'] = dataset['Details'].apply(lambda x: " ".join([stemmer.stem(i) for i in re.sub("[^a-zA-Z]", " ", str(x)).split() if i not in words]).lower())
I can send you the .py file where I built the pickle file, if you want.
Please refer to my posted answer; the doc link there shows the correct usage of transform(). You are passing a str, but it accepts an iterable such as a list. Here: scikit-learn.org/stable/modules/generated/…
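
The ValueError discussed in these comments can be reproduced with a short sketch. It assumes a vectorizer fitted on a few made-up documents; in the question, the argument would presumably be the dataset['processedtext'] column itself rather than the literal string 'processedtext'.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical training documents used only to fit the vectorizer.
train_docs = [
    "server crash after update",
    "user cannot log in portal",
    "printer offline in office",
]
vectorizer = TfidfVectorizer(stop_words="english").fit(train_docs)

# A bare string is rejected: transform() wants an iterable of documents.
try:
    vectorizer.transform("processedtext")
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# A single document must be wrapped in a list.
single = vectorizer.transform(["server crash after update"])

# Any iterable of strings works, e.g. a list or a pandas column.
batch = vectorizer.transform(train_docs)
print(single.shape, batch.shape)
```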
Please refer to the Python docs: you are calling the transform() method on a list object, which doesn't support it. See the sklearn docs for the correct usage.

At a minimum, remove this line:

vectorizer_tfidf = [vectorizer_tfidf]
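
Beyond removing that line, the question's script loads vectorizer.pk but never uses it, and a freshly constructed TfidfVectorizer has no vocabulary until it is fitted. The following is a rough sketch of one way the pieces might fit together, assuming the pickle holds the vectorizer that was fitted at training time. The documents, labels, and the MultinomialNB model are stand-ins, and the pickles are kept in memory instead of in vectorizer.pk and Pickle_RL_Model.pkl so the example is self-contained.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical training data standing in for the original dataset.
train_docs = [
    "server crash after update",
    "database outage overnight",
    "user cannot log in portal",
    "password reset request",
]
train_labels = ["outage", "outage", "access", "access"]

# Training side: fit the vectorizer and the model, then pickle both.
vectorizer = TfidfVectorizer(stop_words="english")
model = MultinomialNB().fit(vectorizer.fit_transform(train_docs), train_labels)
vectorizer_bytes = pickle.dumps(vectorizer)
model_bytes = pickle.dumps(model)

# Prediction side: load both objects under separate names.
loaded_vectorizer = pickle.loads(vectorizer_bytes)
loaded_model = pickle.loads(model_bytes)

# Transform new text with the loaded (already fitted) vectorizer,
# then pass the resulting features, not the raw data, to predict().
new_docs = ["portal log in fails"]
features = loaded_vectorizer.transform(new_docs)
predictions = loaded_model.predict(features)
print(predictions)
```

Keeping the loaded vectorizer and the loaded model in separate variables avoids the situation in the question, where content is assigned twice and the first object is lost.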
