## Weaknesses of the K-Means Algorithm

Like other algorithms, K-means clustering has several weaknesses:

• When the data set is small, the initial grouping strongly determines the final clusters.
• The number of clusters, K, must be chosen beforehand.
• We never know the true clusters: with the same data, a different input order may produce different clusters, especially when the data set is small.
• It is sensitive to initial conditions. Different initial conditions may produce different clusterings, and the algorithm may be trapped in a local optimum.
• We never know which attribute contributes more to the grouping process, since each attribute is assumed to have the same weight.
• The arithmetic mean is not robust to outliers. Data very far from the centroid may pull the centroid away from its true position.
• The resulting clusters tend to be circular in shape because the grouping is based on distance to the centroid.
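
The sensitivity to initial conditions can be illustrated with a small sketch. The code below is only an illustration under assumptions: the four points, the two sets of starting centroids, and the helper names `sq_dist`, `assign` and `kmeans` are all made up for this example. It runs the usual assign-and-recompute loop twice on the same data and compares the sum of squared errors (SSE).

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, centroids, max_iter=100):
    """Basic K-means loop from given starting centroids.

    Returns (centroids, labels, sse). Hypothetical helper for illustration.
    """
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else old)
        if updated == centroids:  # converged: centroids no longer move
            break
        centroids = updated
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


# Four corners of a wide, flat rectangle (illustrative data).
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Start A: one centroid on each short side -> left pair vs. right pair.
_, good_labels, good_sse = kmeans(points, [(0.0, 0.0), (4.0, 0.0)])

# Start B: centroids in the middle of each long side -> bottom vs. top pair.
_, bad_labels, bad_sse = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])

print(good_labels, good_sse)  # [0, 0, 1, 1] 1.0
print(bad_labels, bad_sse)    # [0, 1, 0, 1] 16.0
```

Both runs converge, but the second one stops at a stable grouping with a much larger SSE, which is what being trapped in a local optimum looks like in practice. A common remedy is to run the algorithm from several starting points and keep the result with the smallest SSE.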

One way to overcome these weaknesses is to use K-means clustering only when plenty of data is available. To overcome the outlier problem, we can use the median instead of the mean.
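
To see why the median helps, here is a minimal sketch that computes the center of a single cluster in both ways. The data and the helper name `center` are assumptions made up for this illustration; the cluster holds four nearby points plus one far outlier.

```python
import statistics

# Four points near (1.5, 1.5) and one outlier (illustrative data).
cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0), (50.0, 50.0)]


def center(points, reducer):
    """Apply `reducer` (mean or median) to each coordinate separately."""
    return tuple(reducer(coord) for coord in zip(*points))


mean_center = center(cluster, statistics.mean)
median_center = center(cluster, statistics.median)

print(mean_center)    # roughly (11.2, 11.2): dragged toward the outlier
print(median_center)  # (2.0, 2.0): stays with the bulk of the data
```

Replacing the mean with the coordinate-wise median in the centroid update step gives the variant usually called K-medians, which is typically paired with city-block (Manhattan) distance rather than Euclidean distance.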

Some people have pointed out that K-means clustering cannot be used for types of data other than quantitative data. This is not true! See how you can use multivariate data up to n dimensions (even mixed data types) here. The key to using other types of dissimilarity is in the distance matrix.
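
As a rough sketch of what such a distance matrix could look like for mixed data, the code below averages a range-scaled absolute difference for numeric attributes and a simple match/mismatch score for categorical ones (a Gower-style dissimilarity). This is an assumption chosen for illustration, not necessarily the method described in the linked page; the records and the function name `mixed_distance_matrix` are hypothetical.

```python
def mixed_distance_matrix(records, numeric_cols):
    """Pairwise dissimilarity for records with numeric and categorical columns.

    Numeric columns: absolute difference divided by the column's range.
    Other columns: 0 if the values match, 1 otherwise.
    The per-column scores are averaged, so every entry lies in [0, 1].
    """
    n_cols = len(records[0])
    ranges = {}
    for c in numeric_cols:
        values = [r[c] for r in records]
        # Fall back to 1.0 so a constant column does not divide by zero.
        ranges[c] = (max(values) - min(values)) or 1.0

    def dist(a, b):
        total = 0.0
        for c in range(n_cols):
            if c in numeric_cols:
                total += abs(a[c] - b[c]) / ranges[c]
            else:
                total += 0.0 if a[c] == b[c] else 1.0
        return total / n_cols

    return [[dist(a, b) for b in records] for a in records]


# Illustrative records: (age, favorite color).
records = [(25, "red"), (30, "red"), (45, "blue")]
D = mixed_distance_matrix(records, numeric_cols={0})

for row in D:
    print(row)
# [0.0, 0.125, 1.0]
# [0.125, 0.0, 0.875]
# [1.0, 0.875, 0.0]
```

Once every pair of objects has a dissimilarity, the grouping step only needs to look up values in this matrix, whatever the original attribute types were.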