ValueError: could not convert string to float: sklearn, numpy, pandas

Question

I'm trying to convert car names in a NumPy array to numeric values so I can use them in a linear regressor. The encoding step raises an error: ValueError: could not convert string to float: 'porsche'. Can someone help, please?

Here's the code:

 from sklearn.preprocessing import StandardScaler
 from sklearn.preprocessing import LabelEncoder, OneHotEncoder
 enc = LabelEncoder()
 enc.fit_transform(Z[:,0:1])
 onehotencoder = OneHotEncoder(categorical_features = [0])
 Z = onehotencoder.fit_transform(Z).toarray()

and the output: ValueError: could not convert string to float: 'porsche'

And here is the array: name = Z, type = str416.

Please paste the array as reproducible code, not as an image.
– Yann, Commented Jan 19, 2020 at 13:14

YOLO · Accepted Answer · 2020-01-19 10:32:41Z


For one-hot encoding, I would suggest using pd.get_dummies instead; it is much easier to use:

import pandas as pd

# make sure Z is a DataFrame
X = pd.get_dummies(Z).values
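
As a minimal sketch of the "make sure Z is a DataFrame" step: the asker's array was never posted, so the data, the column names, and the choice of which columns are numeric are all assumptions made up for illustration. A string-typed NumPy array stores its numbers as text, so the numeric columns are cast back to float before encoding.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's Z: a string-typed array whose
# first column holds car names (the real data was not posted).
Z = np.array([
    ["porsche", "200", "2"],
    ["audi", "150", "4"],
    ["porsche", "300", "2"],
])

# Wrap the array in a DataFrame; these column names are invented.
df = pd.DataFrame(Z, columns=["name", "horsepower", "doors"])

# Numbers inside a string array are text, so cast them back to float.
df = df.astype({"horsepower": float, "doors": float})

# One-hot encode only the car-name column; the numeric columns are kept as-is.
encoded = pd.get_dummies(df, columns=["name"], dtype=float)
X = encoded.to_numpy()

print(encoded.columns.tolist())
print(X)
```

Passing columns=["name"] limits the encoding to the categorical column, which is probably what you want before fitting a regressor on X.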

If you want to use sklearn's OHE, you can refer to the following example:

import pandas as pd
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({'a':['audi','porsche','audi'], 'b':[1,2,3]})
ohe = OneHotEncoder()

mat = ohe.fit_transform(df[['a']])

# view the contents of array
mat.todense()

matrix([[1., 0.],
        [0., 1.],
        [1., 0.]])

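# get feature names (scikit-learn 1.0+ provides get_feature_names_out() instead;
# get_feature_names() was removed in 1.2)
ohe.get_feature_names()
array(['x0_audi', 'x0_porsche'], dtype=object)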
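
To stay closer to the code in the question: the categorical_features argument of OneHotEncoder was removed in scikit-learn 0.22, and the question's LabelEncoder result is never assigned back to Z, so the strings are still there when the encoder runs. One possible replacement, sketched here under assumptions, is ColumnTransformer. The array below is a hypothetical stand-in for Z (the real data was not posted), and the step name "name" is arbitrary.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for Z: car names in column 0, a numeric feature in column 1.
Z = np.array([
    ["porsche", 200.0],
    ["audi", 150.0],
    ["porsche", 300.0],
], dtype=object)

# One-hot encode column 0 only and pass the remaining columns through untouched.
ct = ColumnTransformer(
    transformers=[("name", OneHotEncoder(), [0])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)

# The stacked result has object dtype because of the passthrough column,
# so cast it to float before handing it to a regressor.
X = ct.fit_transform(Z).astype(float)

print(X)
```

The encoded columns come first (categories in sorted order), followed by the passthrough columns.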

edited Jan 19, 2020 at 10:32

answered Jan 19, 2020 at 10:27

YOLO
