This is my code to transform lists of data into input for a KMeans model. I want to visualize the clusters in a 2D plot using PCA.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
# My data is longer than this, but here is a sample:
search_terms = ['computer','usb port', 'phone adaptor']
clicks = [3,2,1]
bounce = [0,0,2]
conversion = [4,1,0]
X = np.array([bounce,conversion,clicks]).T
y = np.array(search_terms)
num_clusters = 5
pca = PCA(n_components=2, whiten=True).fit_transform(X)
data2D = pca.transform(X)
km = KMeans(n_clusters=num_clusters, init='k-means++',n_init=10, verbose=1)
km.fit(X_pca)
centers2D = pca.transform(km.cluster_centers_)
plt.scatter( data2D[:,0], data2D[:,1], c=label_color)
This is the error I am getting:
data2D = pca.transform(X)
AttributeError: 'numpy.ndarray' object has no attribute 'transform'
I suppose we can't use PCA's fit_transform on a NumPy array. What can I do instead?
Thanks
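
For reference, the traceback points at the likely cause: PCA's fit_transform returns the projected data as a NumPy array, not the fitted PCA object, so the name pca ends up bound to an array that has no transform method. Below is a minimal sketch of one possible fix, not a definitive version. It rests on several assumptions that are not in the code above: KMeans is fitted on the 2D projection (so the centers are already 2D and need no second transform), num_clusters is lowered to 2 because the 3-row sample cannot support 5 clusters, random_state=0 is added for repeatability, and the points are colored by cluster label in place of the undefined label_color.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Sample data from the question (the real data set is longer).
search_terms = ['computer', 'usb port', 'phone adaptor']
clicks = [3, 2, 1]
bounce = [0, 0, 2]
conversion = [4, 1, 0]

# One row per search term, one column per feature.
X = np.array([bounce, conversion, clicks]).T

# Keep the PCA object itself; fit_transform returns the projected array.
pca = PCA(n_components=2, whiten=True)
data2D = pca.fit_transform(X)

# Assumption: 2 clusters, since KMeans needs n_samples >= n_clusters
# and this sample has only 3 rows.
num_clusters = 2
km = KMeans(n_clusters=num_clusters, init='k-means++', n_init=10, random_state=0)

# Assumption: cluster in the projected 2D space.
labels = km.fit_predict(data2D)

# The model was fitted on 2D data, so its centers are already 2D.
centers2D = km.cluster_centers_

# Color each point by its cluster label and mark the centers.
fig, ax = plt.subplots()
ax.scatter(data2D[:, 0], data2D[:, 1], c=labels)
ax.scatter(centers2D[:, 0], centers2D[:, 1], marker='x', s=100, c='red')
for term, point in zip(search_terms, data2D):
    ax.annotate(term, (point[0], point[1]))
```

Calling plt.show() afterwards displays the figure. If the clustering should instead run on the original features, fitting with km.fit(X) and then projecting the centers with pca.transform(km.cluster_centers_) would be the alternative; that works only because pca here is the fitted PCA object rather than an array.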