My CSV file marks unknown values with '?'. For my machine learning code I am trying to replace them with NaN, but it throws an error. Below is the code I have used for the replacement. Can anyone please help me solve this? Thanks in advance!
import numpy
import pandas as pd
import matplotlib as plot
import numpy as np
df = pd.read_csv('cdk.csv')
x=df.iloc[:,0:24].values
y=df.iloc[:,24].values
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values='NaN', strategy='most_frequent', axis =0,copy=False)
imputer = imputer.fit(x[:,0:5])
imputer.fit_transform(x[:,0:5])
imputer = Imputer(missing_values='normal', strategy='mode', axis =0,copy=False)
imputer = imputer.fit(x[:,5:7])
imputer.fit_transform(x[:,5:7])
This is the error it throws:
Traceback (most recent call last):
  File "kidney.py", line 10, in <module>
    imputer = imputer.fit(x[:,0:5])
  File "C:\Users\YAASHI\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\preprocessing\imputation.py", line 155, in fit
    force_all_finite=False)
  File "C:\Users\YAASHI\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\utils\validation.py", line 433, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: could not convert string to float: '?'
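The last frame of the traceback suggests the failure happens when the array is converted to floats, before any imputation logic runs. A minimal sketch that reproduces the same kind of error in isolation, assuming a hypothetical two-cell column (the name `column` and its values are made up for illustration):

```python
import numpy as np

# Hypothetical column: one numeric string and one '?' placeholder.
column = ['48', '?']

try:
    # Mirrors the np.array(..., dtype=...) call in the last traceback frame.
    np.array(column, dtype=float)
except ValueError as exc:
    message = str(exc)
    print(message)
```

So the '?' cells most likely have to become NaN before the data is handed to the imputer.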
This is a ValueError. It has nothing to do with machine learning, so please do not tag it as such. You say you want to replace '?' with NaN, but you have shown no code which mentions '?'. Where is your code to replace the '?'s?
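One possible way to do the replacement that the question describes is to let pandas treat '?' as missing while reading the file, or to call `replace` on a frame that is already loaded. The sketch below is only an illustration: the inline CSV text and the column names `age` and `rbc` are hypothetical stand-ins for cdk.csv, and the most-frequent fill is done with pandas `fillna` and `mode` rather than with the `Imputer` class from the question (which later scikit-learn releases replaced with `sklearn.impute.SimpleImputer`).

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for cdk.csv: '?' marks the unknown values.
csv_text = """age,rbc
48,normal
?,normal
62,?
48,abnormal
"""

# Option 1: na_values='?' makes pandas parse every '?' cell as NaN on read.
df = pd.read_csv(io.StringIO(csv_text), na_values='?')

# Option 2: replace '?' in a frame that was already loaded as strings.
raw = pd.read_csv(io.StringIO(csv_text))
raw = raw.replace('?', np.nan)

# Fill each column's NaN cells with that column's most frequent value.
filled = df.fillna(df.mode().iloc[0])
print(filled)
```

With the real file, the first option would presumably be `pd.read_csv('cdk.csv', na_values='?')`. Note that with the second option a numeric column that contained '?' is still stored as strings, so it may need `pd.to_numeric` before being passed to an estimator.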