I'm trying to convert car names in a NumPy array to numeric values so I can use them with a linear regressor. The encoding step raises an error: ValueError: could not convert string to float: 'porsche'. Can someone help, please?

Here's the code:

 from sklearn.preprocessing import StandardScaler
 from sklearn.preprocessing import LabelEncoder, OneHotEncoder
 enc = LabelEncoder()
 enc.fit_transform(Z[:,0:1])
 onehotencoder = OneHotEncoder(categorical_features = [0])
 Z = onehotencoder.fit_transform(Z).toarray()

and the output: ValueError: could not convert string to float: 'porsche'

And here is the array: name Z, type str416.

  • Please paste the array source as reproducible code, not as an image. Commented Jan 19, 2020 at 13:14

1 Answer

For one-hot encoding, I would suggest using pd.get_dummies instead; it is much easier to use:

import pandas as pd
# make sure Z is a DataFrame
X = pd.get_dummies(Z).values
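
Since Z in the question is a NumPy string array rather than a DataFrame, here is a rough sketch of how the conversion might look. The sample data and the column names "name" and "value" are made up for illustration; the real array was only posted as an image.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's string array:
# column 0 holds car names, column 1 a numeric feature stored as text.
Z = np.array([["audi", "1"], ["porsche", "2"], ["audi", "3"]])

# Wrap the array in a DataFrame (column names are assumptions).
df = pd.DataFrame(Z, columns=["name", "value"])

# The numeric column arrives as strings, so cast it explicitly.
df["value"] = df["value"].astype(float)

# One-hot encode only the car-name column; other columns are kept as-is.
dummies = pd.get_dummies(df, columns=["name"])

# Convert to a float array that a regressor can consume.
X = dummies.to_numpy(dtype=float)
print(dummies.columns.tolist())
print(X)
```

Passing columns=["name"] limits the encoding to the car names; without it, get_dummies would also expand any other string column into indicator columns.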

If you want to use sklearn's OHE, you can refer to the following example:

import pandas as pd
from sklearn.preprocessing import StandardScaler, LabelEncoder, OneHotEncoder

df = pd.DataFrame({'a':['audi','porsche','audi'], 'b':[1,2,3]})
ohe = OneHotEncoder()

mat = ohe.fit_transform(df[['a']])

# view the contents of array
mat.todense()

matrix([[1., 0.],
        [0., 1.],
        [1., 0.]])

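# get feature names (get_feature_names() was removed in scikit-learn 1.2;
# newer versions provide get_feature_names_out() instead)
ohe.get_feature_names()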
array(['x0_audi', 'x0_porsche'], dtype=object)
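
To apply the same idea directly to a NumPy array like the one in the question, something along these lines may work. It is only a sketch: the sample data is invented, and it assumes column 0 holds the car names while the remaining columns are numbers stored as strings. It also avoids the categorical_features argument, which recent scikit-learn releases no longer accept.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for the asker's string array.
Z = np.array([["audi", "1"], ["porsche", "2"], ["audi", "3"]])

# Keep the name column 2-D, because OneHotEncoder expects
# an input of shape (n_samples, n_features).
names = Z[:, [0]]

# Assumption: every other column is numeric text and can be cast to float.
numeric = Z[:, 1:].astype(float)

ohe = OneHotEncoder()
# fit_transform returns a sparse result by default; densify it for hstack.
onehot = ohe.fit_transform(names).toarray()

# Stack the indicator columns next to the numeric features.
X = np.hstack([onehot, numeric])
print(ohe.categories_[0])
print(X)
```

Encoding only the string column and casting the rest separately means no string ever reaches a float conversion, which is the step that fails with 'porsche' in the question.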