K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms.
Here, unsupervised means that the algorithm draws inferences from datasets without referring to known, or labeled, outcomes.
First, we pick k random points to act as the initial centroids, one per cluster. Then we find the distance from each data point to each centroid, assign every point to its nearest centroid, and move each centroid to the mean of the points assigned to it. The algorithm repeats these steps for a number of iterations, until the centroids stop moving.
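The loop described above can be sketched in plain NumPy. This is only an illustration of the idea, not the scikit-learn implementation; the function name `kmeans_sketch` and its parameters `n_iters` and `seed` are made up for this example.

```python
import numpy as np

def kmeans_sketch(points, k, n_iters=100, seed=0):
    """Illustrative k-means loop; a sketch, not scikit-learn's implementation."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random as the centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Distance from every point to every centroid, shape (n_points, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        # Assign each point to its nearest centroid.
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points; keep it if it has none.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop early once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

Calling `kmeans_sketch(data, k=3)` on a two-column array returns the final centroids and the cluster index assigned to each row. In practice the `KMeans` class from scikit-learn, used in the rest of this post, is the better choice because it also handles smarter initialization and multiple restarts.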
Steps to do it
First, import all the related libraries, such as scikit-learn, and use some random data to illustrate a simple explanation of K-means clustering.
Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
%matplotlib inline
Generate random data
Next, we generate random points scattered around three centers.
-------------------------------------------------------------
center_1 = np.array([1,1])
center_2 = np.array([2,8])
center_3 = np.array([10,8])
X = 2+2.5 *np.random.rand(200,2) + center_1
X1 = 2+2 *np.random.rand(200,2) + center_2
X2 = 1+2 *np.random.rand(200,2) + center_3
data = np.concatenate((X, X1, X2), axis = 0)
plt.scatter(X[:,0], X[:,1], s=7, c='b', label = 'Cluster 1')
plt.scatter(X1[:,0], X1[:,1], s=7, c='r')
plt.scatter(X2[:,0], X2[:,1], s=7, c='k')
plt.show()
---------------------------------------------------------------
The resulting plot looks like this:
Next, fit the KMeans algorithm to the data:
----------------------------------------------
from sklearn.cluster import KMeans
# Three clusters, matching the three groups generated above and the three centroids in the output
Kmean = KMeans(n_clusters=3)
Kmean.fit(data)
----------------------------------------------
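Once the model is fitted, it can also report which cluster each point belongs to. The snippet below is a small self-contained sketch: the `blobs` array, the fixed seeds, and the `new_points` values are assumptions made up for this illustration, while `labels_` and `predict` are standard parts of scikit-learn's `KMeans`.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)  # fixed seed, used here only for reproducibility
# Three small blobs around hypothetical centers, similar in spirit to the data above.
blobs = np.concatenate([
    rng.random((50, 2)) + np.array([1, 1]),
    rng.random((50, 2)) + np.array([2, 8]),
    rng.random((50, 2)) + np.array([10, 8]),
])

model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(blobs)

labels = model.labels_                 # cluster index (0, 1 or 2) for each training row
new_points = np.array([[1.5, 1.5], [10.5, 8.5]])
predicted = model.predict(new_points)  # nearest-centroid cluster for unseen points
print(np.bincount(labels), predicted)
```

The cluster numbers themselves are arbitrary and can change between runs, so compare points by whether they share a label rather than by the label value.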
Find the centroid of each cluster
---------------------------------------------
Kmean.cluster_centers_
---------------------------------------------
The output looks like this:
array([[ 4.25116126, 4.23343225],
[ 4.89833143, 10.92433901],
[ 2.029634 , 2.10565485]])
Use these center points to draw the centroids on top of the clusters, as in the code below:
-------------------------------------------------------
plt.scatter(X[:,0], X[:,1], s=7, c='b', label = 'Cluster 1')
plt.scatter(X1[:,0], X1[:,1], s=7, c='r')
plt.scatter(X2[:,0], X2[:,1], s=7, c='k')
color = ['red','green','yellow']
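# Draw every fitted centroid as a large star instead of hard-coding one coordinate
for i, (cx, cy) in enumerate(Kmean.cluster_centers_):
    plt.scatter(cx, cy, s=200, c=color[i], marker='*', label='centroid %d' % (i + 1))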
plt.legend()
plt.show()
---------------------------------------------------------
I hope this blog helps you create clusters, find the center of each cluster, and plot each cluster with its centroid.
Thanks for reading this blog. If you need any help with machine learning in Python, contact us here or leave a comment below, and we will reply and help solve your issue.