This is my first time implementing a machine learning algorithm in Python. I tried to implement K-Means with scikit-learn on this dataset.
from sklearn.cluster import KMeans
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
# Importing the dataset
data = pd.read_csv('dataset.csv')
print("Input Data and Shape")
print(data.shape)
data.head()
# Getting the values and plotting it
f1 = data['Area'].values
f2 = data['perimeter'].values
f3 = data['Compactness'].values
f4 = data['length_kernel'].values
f5 = data['width_kernel'].values
f6 = data['asymmetry'].values
f7 = data['length_kernel_groove'].values
X = np.array(list(zip(f1,f2,f3,f4,f5,f6,f7)))
# Number of clusters
kmeans = KMeans(n_clusters=7)
kmeans = kmeans.fit(X)
# Getting the cluster labels
labels = kmeans.predict(X)
# Centroid values
centroids = kmeans.cluster_centers_
plt.scatter(X[:,0], X[:,1],cmap='rainbow')
plt.scatter(centroids[:,0], centroids[:1], color="black", marker='*')
plt.show()
The graph doesn't seem to plot the data correctly. How can I debug this issue?
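One way to start debugging is to print the shapes of the arrays passed to plt.scatter. The sketch below is only a guess at the cause, not a confirmed diagnosis. It assumes a synthetic stand-in for dataset.csv (three well-separated blobs with seven features), so n_clusters=3, n_init and random_state are illustrative choices, not values from the original code. It shows two things in the plotting calls that look suspicious: centroids[:1] selects the first row rather than the second column, and cmap has no effect unless c is also given.

```python
import numpy as np
from matplotlib import pyplot as plt
from sklearn.cluster import KMeans

# Assumption: synthetic stand-in for the seven feature columns of dataset.csv.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 7)) for c in (0.0, 5.0, 10.0)])

# n_clusters=3 matches the synthetic blobs; it is not taken from the question.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = kmeans.labels_              # same labels predict(X) would give on the training data
centroids = kmeans.cluster_centers_  # shape (n_clusters, n_features)

# Debugging step: compare the slice used in the question with the intended one.
print(centroids[:1].shape)    # first row only, all seven features
print(centroids[:, 1].shape)  # second feature of every centroid

fig, ax = plt.subplots()
# Passing c=labels is what lets cmap colour the points by cluster.
ax.scatter(X[:, 0], X[:, 1], c=labels, cmap='rainbow', s=15)
# Plot feature 0 against feature 1 for every centroid.
ax.scatter(centroids[:, 0], centroids[:, 1], color='black', marker='*', s=200)
ax.set_xlabel('Area')
ax.set_ylabel('perimeter')
plt.show()
```

Note that a 2-D scatter of the first two columns shows only a projection of clusters that were fitted in seven dimensions, so some overlap between colours in the plot is expected even when the clustering itself is fine.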
