What is K-Means Clustering?
The k-means clustering algorithm was introduced by J. MacQueen (1967) and later refined by J. A. Hartigan (1975) and by J. A. Hartigan and M. A. Wong (1979). Simply speaking, k-means clustering is an algorithm that groups objects, based on their attributes (features), into K groups, where K is a positive integer. The grouping is done by minimizing the sum of squared distances between each data point and the centroid of the cluster it is assigned to. Thus the purpose of k-means clustering is to partition the data into groups of similar objects.

Example: Suppose we have 4 objects as training data points, and each object has 2 attributes. Each attribute represents one coordinate of the object.
We also know beforehand that these objects belong to two groups of medicine (cluster 1 and cluster 2). The problem now is to determine which medicines belong to cluster 1 and which belong to cluster 2. See the numerical example (manual calculation) of k-means clustering, and the free downloadable VB code, to see how the algorithm works. The distinction between supervised and unsupervised learning is discussed separately.
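The grouping described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the VB code mentioned in the text: it uses the common batch (Lloyd-style) iteration rather than MacQueen's or Hartigan and Wong's exact procedures. The four coordinates, the helper names `kmeans` and `sum_of_squares`, and the choice of the first K points as initial centroids are all assumptions made for the illustration.

```python
import math

def kmeans(points, k, max_iter=100):
    """Batch (Lloyd-style) k-means. Returns (labels, centroids)."""
    # Assumption: seed the centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid
        # (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the grouping is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

def sum_of_squares(points, labels, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

# Hypothetical coordinates for the 4 objects (2 attributes each).
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 3.0), (5.0, 4.0)]
labels, centroids = kmeans(points, k=2)
print(labels)     # [0, 0, 1, 1]
print(centroids)  # [(1.5, 1.0), (4.5, 3.5)]
```

With these assumed values, the first two objects end up in one cluster and the last two in the other. A different choice of initial centroids can lead to a different final grouping, because the iteration only finds a local minimum of the sum of squared distances.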
Note: The k-means algorithm is one of the simplest partitional clustering methods. More advanced algorithms related to k-means include the Expectation-Maximization (EM) algorithm (especially for Gaussian mixtures), Kohonen's Self-Organizing Map (SOM), and Learning Vector Quantization (LVQ). To overcome the weaknesses of k-means, several alternatives have been proposed, such as k-medoids, fuzzy c-means and k-modes. See the k-means resources for further study.
© 2006 Kardi Teknomo. All Rights Reserved.