How to plot clusters in python?

Question

I am using sklearn.cluster in Python to do clustering. I have 61 samples, each with 26 features. Original data:

UserID  Communication_dur   Lifestyle_dur   Music & Audio_dur   Others_dur  Personnalisation_dur    Phone_and_SMS_dur   Photography_dur Productivity_dur    Social_Media_dur    System_tools_dur    ... Music & Audio_Freq  Others_Freq Personnalisation_Freq   Phone_and_SMS_Freq  Photography_Freq    Productivity_Freq   Social_Media_Freq   System_tools_Freq   Video players & Editors_Freq    Weather_Freq
1   63  219 9   10  99  42  36  30  76  20  ... 2   1   11  5   3   3   9   1   4   8
2   9   0   0   6   78  0   32  4   15  3   ... 0   2   4   0   2   1   2   1   0   0


import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

Sc = StandardScaler()
X = Sc.fit_transform(df)

I have applied PCA to a dataframe in order to plot clusters based on K-means.

pca = PCA(3)  # keep the first three principal components
pca.fit(X)
pca_data = pd.DataFrame(pca.transform(X))
print(pca_data.head())


kmeans_pca = KMeans(n_clusters=10, init="k-means++", random_state=42)
kmeans_pca.fit(pca_data)

Now I want to plot the resulting clusters. How can I do that?

Can you give some example data (a minimal reproducible example)?
– Gwang-Jin Kim, Commented Feb 12, 2021 at 11:33

Himanshu · Accepted Answer · 2021-02-12 12:02:42Z

4

I haven't tested this, but you can visualize the clusters with code like the following:

import matplotlib.pyplot as plt
import seaborn as sns

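def show_clusters(data, labels):
    # One colour per distinct cluster label
    palette = sns.color_palette('hls', n_colors=len(set(labels)))
    # Plot the first two columns (principal components) against each other
    sns.scatterplot(x=data.iloc[:, 0], y=data.iloc[:, 1], hue=labels, palette=palette)
    plt.axis('off')  # hide axes and ticks
    plt.show()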

Then call the function by passing PCA data and K-means cluster labels:

show_clusters(pca_data, kmeans_pca.labels_)
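
Since no sample data was posted, here is a minimal end-to-end sketch of the same idea on made-up data. The random Poisson table is only a stand-in for the real 61 x 26 dataframe, PCA is reduced to 2 components so the plot is a plain 2D scatter, and this variant of show_clusters returns the figure instead of calling plt.show(). None of this has been checked against the asker's actual data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to open a window with plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in for the real table (61 rows, 26 numeric columns).
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.poisson(lam=20, size=(61, 26)))

# Standardize, then project onto the first two principal components.
X = StandardScaler().fit_transform(df)
pca_data = pd.DataFrame(PCA(n_components=2).fit_transform(X))

# Cluster in the reduced space; fit_predict returns one label per row.
kmeans_pca = KMeans(n_clusters=10, init="k-means++", n_init=10, random_state=42)
labels = kmeans_pca.fit_predict(pca_data)


def show_clusters(data, labels):
    """Scatter the first two columns of `data`, coloured by cluster label."""
    palette = sns.color_palette("hls", n_colors=len(set(labels)))
    fig, ax = plt.subplots()
    sns.scatterplot(x=data.iloc[:, 0], y=data.iloc[:, 1],
                    hue=labels, palette=palette, ax=ax)
    ax.set_xlabel("PC 1")
    ax.set_ylabel("PC 2")
    return fig, ax


fig, ax = show_clusters(pca_data, labels)
# fig.savefig("clusters_2d.png")  # or plt.show() with an interactive backend
```

With the real data, replace the synthetic df with your own dataframe and keep the rest unchanged.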


edited Feb 12, 2021 at 12:02

answered Feb 12, 2021 at 11:48

Himanshu


2 Comments

ab20225 Over a year ago

Thank you for your answer! It raises this error: TypeError: '(slice(None, None, None), 0)' is an invalid key

Himanshu Over a year ago

I changed x=data[:, 0] to x=data.iloc[:, 0], and similarly for y, because your data is a pandas DataFrame, not a NumPy array. Also, this is a 2D visualization, so the number of PCA components should be 2 in this case.
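
If you would rather keep all three PCA components from the question, one possible alternative is a 3D scatter plot with matplotlib. This is only a sketch: show_clusters_3d is a hypothetical helper, not part of the answer above, and the random points and labels are placeholders for pca_data and kmeans_pca.labels_. Converting the input with np.asarray lets it accept either a DataFrame or a NumPy array, which sidesteps the indexing error mentioned in the first comment.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to open a window with plt.show()
import matplotlib.pyplot as plt
import numpy as np


def show_clusters_3d(data, labels):
    """Hypothetical helper: 3D scatter of three components, coloured by label.

    `data` is array-like with shape (n_samples, 3); a pandas DataFrame works too,
    because it is converted to a NumPy array before positional indexing.
    """
    points = np.asarray(data)
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    scatter = ax.scatter(points[:, 0], points[:, 1], points[:, 2],
                         c=labels, cmap="tab10")
    ax.set_xlabel("PC 1")
    ax.set_ylabel("PC 2")
    ax.set_zlabel("PC 3")
    fig.colorbar(scatter, ax=ax, label="cluster")
    return fig, ax


# Placeholder inputs standing in for pca_data (3 components) and kmeans_pca.labels_.
rng = np.random.default_rng(0)
demo_points = rng.normal(size=(61, 3))
demo_labels = rng.integers(0, 10, size=61)

fig3d, ax3d = show_clusters_3d(demo_points, demo_labels)
# fig3d.savefig("clusters_3d.png")  # or plt.show() with an interactive backend
```

With the question's variables, the call would be show_clusters_3d(pca_data, kmeans_pca.labels_).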
