I have many arrays of the same dimension, such as
x = np.array([3,2,0,4,5,2,1...])  # each vector has more than 50000 elements
y = np.array([1,3,4,2,4,1,4...])
What I want to do is use feature hashing to reduce the dimensionality of these vectors (accepting that there will be collisions). The lower-dimensional vectors can then be fed to classifiers.
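To make clear what I mean by feature hashing, here is a minimal sketch in plain Python. It is only an illustration and not scikit-learn's implementation: `hash_vector` and `n_buckets` are names I made up, MD5 is an arbitrary choice of hash function, and the short list stands in for one of my real vectors.

```python
import hashlib

def hash_vector(values, n_buckets=4):
    """Fold a long vector into n_buckets slots by hashing each feature index.

    Hypothetical helper for illustration only.
    """
    out = [0.0] * n_buckets
    for index, value in enumerate(values):
        if value == 0:
            continue  # zero entries contribute nothing
        # Hash the feature index (as a string) to pick an output slot.
        digest = hashlib.md5(str(index).encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") % n_buckets
        # Indices that collide in the same slot simply add up.
        out[bucket] += value
    return out

toy_x = [3, 2, 0, 4, 5, 2, 1]  # toy stand-in for a 50000+ element vector
reduced = hash_vector(toy_x, n_buckets=4)
print(len(reduced), sum(reduced))
```

The output always has `n_buckets` entries regardless of the input length, and the same index always lands in the same slot, so every vector is reduced in a consistent way.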
What I have tried is:
from sklearn.feature_extraction import FeatureHasher
hasher = FeatureHasher()
hash_vector = hasher.transform(x)
However, it seems that FeatureHasher cannot be applied directly to the array; it raises AttributeError: 'matrix' object has no attribute 'items'
So what should I do to get feature hashing to work here? Am I missing something, or is there a more effective way to do feature hashing?