Kardi Teknomo

Weaknesses of K-Means Clustering

 


Like other algorithms, K-means clustering has several weaknesses:

  • When the data set is small, the initial grouping strongly determines the resulting clusters.
  • The number of clusters, K, must be specified beforehand.
  • We never know the real clusters: with the same data, a different input order may produce different clusters when the number of data points is small.
  • It is sensitive to initial conditions. Different initial conditions may produce different clusterings, and the algorithm may become trapped in a local optimum.
  • We never know which attribute contributes more to the grouping process, since we assume that each attribute has the same weight.
  • The arithmetic mean is not robust to outliers. Data points very far from a centroid may pull the centroid away from its true position.
  • The resulting clusters tend to be circular in shape, because points are assigned by their distance to a centroid.
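The sensitivity to initial conditions and input order can be seen on a toy data set. The sketch below is a minimal illustration, not a reference implementation: the function names (`sq_dist`, `assign`, `kmeans`), the four sample points, and the rule of taking the first K points as initial centroids are all assumptions made for this example.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
        for p in points
    ]


def kmeans(points, k, max_iter=100):
    """Basic K-means; the first k points serve as initial centroids (an assumption)."""
    centroids = [tuple(float(c) for c in p) for p in points[:k]]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the centroid to the mean of its members.
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster becomes empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, sse


# Four corners of a wide rectangle: the natural grouping is left vs. right.
order_a = [(0, 0), (0, 1), (4, 0), (4, 1)]
order_b = [(0, 0), (4, 0), (0, 1), (4, 1)]  # same data, different input order

_, centroids_a, sse_a = kmeans(order_a, 2)
_, centroids_b, sse_b = kmeans(order_b, 2)

print(centroids_a, sse_a)  # bottom/top split, a local optimum
print(centroids_b, sse_b)  # left/right split, much lower error
```

With the first ordering the algorithm settles on a bottom/top split whose sum of squared errors is 16, while the second ordering reaches the left/right split with a sum of squared errors of 1. Both are stable, so the first run is trapped in a local optimum. A common remedy is to run the algorithm from several initial conditions and keep the result with the lowest error.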

One way to overcome those weaknesses is to use K-means clustering only when plenty of data are available. To reduce the influence of outliers, we can use the median instead of the mean.
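The effect of replacing the mean with the median can be sketched for a single cluster. This example assumes a coordinate-wise median (one common reading of the suggestion); the `center` helper and the sample points are hypothetical.

```python
from statistics import mean, median

# A tight cluster plus one distant outlier (the last point).
cluster = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (2.0, 2.0), (50.0, 60.0)]


def center(points, stat):
    """Apply `stat` (for example mean or median) to each coordinate separately."""
    return tuple(stat(coord) for coord in zip(*points))


mean_center = center(cluster, mean)      # dragged toward the outlier
median_center = center(cluster, median)  # stays inside the tight group

print(mean_center)
print(median_center)
```

The mean-based center lands far from the four nearby points because the outlier pulls it away, while the median-based center stays at (2.0, 2.0).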

Some people have pointed out that K-means clustering cannot be used for types of data other than quantitative data. This is not true! See how you can use multivariate data up to n dimensions (even mixed data types) here. The key to using other types of dissimilarity is the distance matrix.
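As a rough illustration of what a distance matrix for mixed data might look like, the sketch below averages a range-normalized difference for numeric attributes and a simple mismatch score for categorical ones. This is an assumed dissimilarity chosen for the example, not necessarily the method the tutorial links to; the function name, the records, and the attribute ranges are hypothetical.

```python
def mixed_dissimilarity(a, b, numeric_ranges):
    """Average per-attribute dissimilarity between two records.

    numeric_ranges[i] is the (nonzero) range of attribute i if it is numeric,
    or None if attribute i is categorical.
    """
    total = 0.0
    for x, y, rng in zip(a, b, numeric_ranges):
        if rng is None:
            total += 0.0 if x == y else 1.0  # categorical: match or mismatch
        else:
            total += abs(x - y) / rng        # numeric: difference scaled to [0, 1]
    return total / len(a)


records = [
    (1.0, "red"),
    (2.0, "red"),
    (5.0, "blue"),
]
# Assumption: attribute 0 is numeric and spans 1.0 to 5.0; attribute 1 is categorical.
ranges = [4.0, None]

matrix = [[mixed_dissimilarity(a, b, ranges) for b in records] for a in records]

for row in matrix:
    print(row)
```

The resulting matrix is symmetric with zeros on the diagonal, and the two "red" records come out much closer to each other than either is to the "blue" one.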


 

 

 
© 2006 Kardi Teknomo. All Rights Reserved.