I am trying to encode a DataFrame with LabelEncoder() before building my machine learning model.

Here is the code:

from sklearn.preprocessing import LabelEncoder
# LabelEncoder
le = LabelEncoder()

# apply "le.fit_transform"
df_encoded = data1.apply(le.fit_transform)
print(df_encoded)
print(le.classes_)

But I got this error:

TypeError: ("'<' not supported between instances of 'str' and 'NoneType'", 'occurred at index SACC_MARKET_SEGMENT')

Can anyone help me resolve this problem? Thank you.

1 Answer

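The problem may be the type of your data. The error message suggests that the column SACC_MARKET_SEGMENT mixes strings with None, and those cannot be compared when the labels are sorted. I don't know which data type you want, but you can try converting data1 to string: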

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()

df_encoded = le.fit_transform(data1.astype(str))
print(df_encoded)
print(le.classes_)
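
As a minimal sketch of a column-by-column variant that keeps LabelEncoder: each column gets its own encoder, so the fitted classes are not overwritten by the next column. The small DataFrame below is a hypothetical stand-in for data1 (only the column name SACC_MARKET_SEGMENT comes from the question), and replacing missing values with the placeholder "missing" is an assumption about how they should be treated.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for data1: string columns that contain missing values.
data1 = pd.DataFrame({
    "SACC_MARKET_SEGMENT": ["retail", None, "corporate", "retail"],
    "REGION": ["north", "south", None, "north"],
})

encoders = {}          # one fitted LabelEncoder per column
encoded_columns = {}   # encoded values per column

for col in data1.columns:
    le = LabelEncoder()
    # Assumption: missing values become their own category, "missing".
    values = data1[col].fillna("missing").astype(str)
    encoded_columns[col] = le.fit_transform(values)
    encoders[col] = le

df_encoded = pd.DataFrame(encoded_columns, index=data1.index)
print(df_encoded)
print(encoders["SACC_MARKET_SEGMENT"].classes_)
```

Keeping the encoders in a dict makes it possible to call inverse_transform on a given column later.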

4 Comments

Thank you for your answer. Yes, there are NaN missing values in this DataFrame. I tried your code but I got this error: ValueError: bad input shape (266572, 11)
That's because LabelEncoder needs a 1D array. If you want to encode a 2D array (like a DataFrame), try OrdinalEncoder: it works the same way, but it is meant for 2D input. All you have to do is take my code and replace LabelEncoder with OrdinalEncoder.
OK, thanks, but I got this error: cannot import name 'OrdinalEncoder'
Maybe you have an old version of sklearn that doesn't include OrdinalEncoder yet. Check your version: import sklearn; print(sklearn.__version__). If it's lower than 0.20, the release that introduced OrdinalEncoder, try upgrading your sklearn package. If you use pip as your package manager, run this command in CMD: pip install scikit-learn --upgrade
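
The OrdinalEncoder suggestion from this thread could look roughly like the sketch below, assuming a scikit-learn release that ships OrdinalEncoder. The DataFrame is again a hypothetical stand-in for data1, and filling missing values with the placeholder "missing" is an assumption, not something the thread settled on.

```python
import pandas as pd
import sklearn
from sklearn.preprocessing import OrdinalEncoder

# OrdinalEncoder is only importable on sufficiently recent scikit-learn.
print(sklearn.__version__)

# Hypothetical stand-in for data1: string columns that contain missing values.
data1 = pd.DataFrame({
    "SACC_MARKET_SEGMENT": ["retail", None, "corporate", "retail"],
    "REGION": ["north", "south", None, "north"],
})

encoder = OrdinalEncoder()
# Assumption: missing values become their own category, "missing".
encoded = encoder.fit_transform(data1.fillna("missing").astype(str))

# fit_transform returns a 2D float array; wrap it back into a DataFrame.
df_encoded = pd.DataFrame(encoded, columns=data1.columns, index=data1.index)
print(df_encoded)
print(encoder.categories_)  # one array of categories per column
```

Unlike LabelEncoder, a single OrdinalEncoder keeps the categories of every column in categories_, so no per-column bookkeeping is needed.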
