I defined a class for use in a FeatureUnion. Python 2.7 complains: AttributeError: 'module' object has no attribute 'TextTransformer'. The code runs on Kaggle's platform but does not run in my local IPython.

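from sklearn import pipeline, grid_search
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion
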
class TextTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, key):
        self.key = key
    def fit(self, x, y=None):
        return self
    def transform(self, data_dict):
        return data_dict[self.key].apply(str)

rfr = RandomForestRegressor()
tfidf = TfidfVectorizer()
tsvd = TruncatedSVD(n_components=10)
clf = pipeline.Pipeline([
    ('union', FeatureUnion(
                transformer_list = [
                    ('txt1', pipeline.Pipeline([('s1', TextTransformer(key='search_term')), ('tfidf1', tfidf), ('tsvd1', tsvd)])),
                    ('txt2', pipeline.Pipeline([('s2', TextTransformer(key='product_title')), ('tfidf2', tfidf), ('tsvd2', tsvd)])),
                    ('txt3', pipeline.Pipeline([('s3', TextTransformer(key='product_description')), ('tfidf3', tfidf), ('tsvd3', tsvd)])),
                    ('txt4', pipeline.Pipeline([('s4', TextTransformer(key='brand')), ('tfidf4', tfidf), ('tsvd4', tsvd)]))
                    ],
                transformer_weights = {
                    'txt1': 0.5,
                    'txt2': 0.25,
                    'txt3': 0.25,
                    'txt4': 0.5
                    },
            n_jobs = -1
            )), 
    ('rfr', rfr)])
param_grid = {'rfr__max_features': [10], 'rfr__max_depth': [20]}
model = grid_search.GridSearchCV(estimator = clf, param_grid = param_grid, n_jobs = -1, cv = 10)
model.fit(X_train, y_train)

1 Answer

You probably forgot some imports. Try this; it works for me.

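from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import Pipeline, FeatureUnion
# Later scikit-learn releases provide GridSearchCV in sklearn.model_selection instead
from sklearn.grid_search import GridSearchCV

# BaseEstimator supplies get_params/set_params, which GridSearchCV relies on
class TextTransformer(BaseEstimator, TransformerMixin):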
    def __init__(self, key):
        self.key = key

    def fit(self, x, y=None):
        return self

    def transform(self, data_dict):
        return data_dict[self.key].apply(str)

rfr = RandomForestRegressor()
tfidf = TfidfVectorizer()
tsvd = TruncatedSVD(n_components=10)
clf = Pipeline([
    ('union', FeatureUnion(
                transformer_list = [
                    ('txt1', Pipeline([('s1', TextTransformer(key='search_term')), ('tfidf1', tfidf), ('tsvd1', tsvd)])),
                    ('txt2', Pipeline([('s2', TextTransformer(key='product_title')), ('tfidf2', tfidf), ('tsvd2', tsvd)])),
                    ('txt3', Pipeline([('s3', TextTransformer(key='product_description')), ('tfidf3', tfidf), ('tsvd3', tsvd)])),
                    ('txt4', Pipeline([('s4', TextTransformer(key='brand')), ('tfidf4', tfidf), ('tsvd4', tsvd)]))
                    ],
                transformer_weights = {
                    'txt1': 0.5,
                    'txt2': 0.25,
                    'txt3': 0.25,
                    'txt4': 0.5
                    },
            n_jobs = -1
            )), 
    ('rfr', rfr)])
param_grid = {'rfr__max_features': [10], 'rfr__max_depth': [20]}
model = GridSearchCV(estimator = clf, param_grid = param_grid, n_jobs = -1, cv = 10)
model.fit(X_train, y_train)

3 Comments

Thank you for answering, but I already have those imports. The error doesn't show up in the IPython notebook, only in the console panel, and IPython gives no response after running the code.
Sorry, I misunderstood your error. I think you forgot to extend BaseEstimator. You should try: class TextTransformer(BaseEstimator, TransformerMixin)
Also, the original code already extends BaseEstimator, but I still get the errors. Did you run the script successfully in IPython?
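
Nothing in this thread confirms the cause, but one plausible explanation is worth checking. With n_jobs = -1 the work is handed to worker processes, and objects sent to them are pickled. pickle stores a class only as a module name plus a class name, so a worker that cannot find TextTransformer under that module raises an AttributeError much like the one in the question. Below is a minimal standard-library sketch of that mechanism. The module name session_main is hypothetical, the class is a stand-in with no scikit-learn involved, and the exact error text differs between Python versions.

```python
import pickle
import sys
import types


def _init(self, key):
    self.key = key


# Hypothetical stand-in for the question's transformer, registered in a
# throwaway module to imitate a class defined in an interactive session.
TextTransformer = type(
    "TextTransformer",
    (object,),
    {"__init__": _init, "__module__": "session_main"},
)
session_main = types.ModuleType("session_main")
session_main.TextTransformer = TextTransformer
sys.modules["session_main"] = session_main

# The payload records the names "session_main" and "TextTransformer",
# not the class body itself.
payload = pickle.dumps(TextTransformer(key="search_term"))

# Loading works while the class can still be looked up by name.
restored = pickle.loads(payload)

# Imitate a worker process whose copy of the module lacks the class.
del session_main.TextTransformer
try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = exc

print(repr(error))
```

If this is what happens locally, two things to try (both untested here) are moving TextTransformer into its own .py file and importing it from there, or setting n_jobs = 1 to see whether the error disappears.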
