I have an ndarray like the one below. I write it to a DataFrame, save that as a pickle, read the pickle back, and then create a new array from it. Why does np.array_equal(my_array2, X_train) return False? I tried to debug this and wrote some code to understand the problem, but I'm having a hard time.
How should I change the code so that both arrays match?
import numpy as np
import pandas as pd

X_train = np.array([[" I I want to know how much s it thank you"],
[" press any key to connect P Thank you Too <unk> I "]],
dtype='<U97064')
X_train
X_train[0]
#array([[' I I want to know how much s it thank you'],
[' press any key to connect P Thank you Too <unk> I ']],
dtype='<U97064')
df = pd.DataFrame(X_train, columns = ['Column_A'])
df.to_pickle('df.pkl')
df2 = pd.read_pickle('df.pkl')
my_array2= df2['Column_A'].to_numpy(dtype='<U97064')
np.array_equal(my_array2[0], X_train[0])
# False
np.array_equal(my_array2, X_train)
# False
The types of the two arrays match:
print (type(my_array2))
print (type(X_train))
#<class 'numpy.ndarray'>
#<class 'numpy.ndarray'>
But the individual members don't match:
# not sure why the type of the individual elements is different
print (type(my_array2[0]))
print (type(X_train[0]))
#<class 'numpy.str_'>
#<class 'numpy.ndarray'>
X_train.dtype
#dtype('<U97064')
type(X_train.dtype)
#numpy.dtype
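pandas has changed the shape of your array in df. A DataFrame is 2-D, and a single column is 1-D. Compare the shape of your X_train with that of df[column].to_numpy(): X_train is (2, 1), while the column comes back as (2,). np.array_equal returns False whenever the shapes differ, and that is also why my_array2[0] is a numpy.str_ scalar while X_train[0] is a one-element ndarray. Reshape the column back to the original shape, or save the NumPy array directly without involving pandas.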
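
A minimal sketch of both fixes, using the array from the question. The dtype is left for NumPy to infer here instead of '<U97064', and the temporary directory and the file name X_train.npy are only illustrative choices, not anything from the original code.

```python
import os
import tempfile

import numpy as np
import pandas as pd

X_train = np.array([[" I I want to know how much s it thank you"],
                    [" press any key to connect P Thank you Too <unk> I "]])

# Round trip through a DataFrame column, as in the question.
df = pd.DataFrame(X_train, columns=["Column_A"])
col = df["Column_A"].to_numpy(dtype=X_train.dtype)

print(X_train.shape)  # (2, 1)
print(col.shape)      # (2,)

# Fix 1: restore the original 2-D shape before comparing.
restored = col.reshape(X_train.shape)
same_after_reshape = np.array_equal(restored, X_train)
print(same_after_reshape)  # True

# Fix 2: skip pandas and save the array itself; np.save keeps shape and dtype.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "X_train.npy")  # illustrative file name
    np.save(path, X_train)
    loaded = np.load(path)

same_after_save = np.array_equal(loaded, X_train)
print(same_after_save)  # True
```

If the DataFrame is needed for other reasons, the reshape is enough. If the only goal is to persist the array, np.save and np.load avoid the shape change entirely.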