Click here to purchase the complete E-book of this tutorial
Linkages Between Objects
Hierarchical clustering rests on a rule for how objects should be grouped into clusters. Given a distance matrix, the linkage between objects is computed through a criterion that defines the distance between groups. The most common and basic criteria are:
Single Linkage: minimum distance criterion
Complete Linkage: maximum distance criterion
Average Group: average distance criterion
Ward: minimize the variance of the merged cluster
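The first three criteria can be sketched directly from a distance matrix. The snippet below is only an illustration: the function name `group_distance`, the matrix `D`, and the two groups are made-up values, not part of the original tutorial. Ward's criterion is left out because it is defined through the variance of the merged cluster rather than through a simple minimum, maximum, or mean of pairwise distances.

```python
import numpy as np

def group_distance(D, group_a, group_b, criterion="single"):
    """Distance between two groups of objects under a linkage criterion.

    D is a symmetric matrix of pairwise object distances; group_a and
    group_b are lists of object indices (hypothetical helper).
    """
    # All pairwise distances between an object in group_a and one in group_b
    block = D[np.ix_(group_a, group_b)]
    if criterion == "single":      # minimum distance criterion
        return float(block.min())
    if criterion == "complete":    # maximum distance criterion
        return float(block.max())
    if criterion == "average":     # average distance criterion
        return float(block.mean())
    raise ValueError(f"unknown criterion: {criterion}")

# Made-up distance matrix for four objects
D = np.array([
    [0.0, 2.0, 6.0, 10.0],
    [2.0, 0.0, 5.0, 9.0],
    [6.0, 5.0, 0.0, 4.0],
    [10.0, 9.0, 4.0, 0.0],
])

a, b = [0, 1], [2, 3]
single = group_distance(D, a, b, "single")      # smallest cross-group distance
complete = group_distance(D, a, b, "complete")  # largest cross-group distance
average = group_distance(D, a, b, "average")    # mean cross-group distance
print(single, complete, average)
```

For these two groups the cross-group distances are 6, 10, 5 and 9, so the three criteria give 5, 10 and 7.5 respectively.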
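Jain and Dubes (1988) present a general formula, first proposed by Lance & Williams (1967), that covers most of the commonly referenced hierarchical clustering methods, known as SAHN (sequential, agglomerative, hierarchical and nonoverlapping) clustering methods. The distance between an existing cluster k with $n_k$ objects and a newly formed cluster (r, s), obtained by merging clusters r and s with $n_r$ and $n_s$ objects, is given as

$$d_{k(r,s)} = \alpha_r \, d_{kr} + \alpha_s \, d_{ks} + \beta \, d_{rs} + \gamma \, \lvert d_{kr} - d_{ks} \rvert$$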
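The values of the parameters $\alpha_r$, $\alpha_s$, $\beta$ and $\gamma$ are given in the table below, where $n_r$, $n_s$ and $n_k$ denote the number of objects in clusters r, s and k.

| Clustering method | $\alpha_r$ | $\alpha_s$ | $\beta$ | $\gamma$ |
|---|---|---|---|---|
| Single Link | 1/2 | 1/2 | 0 | -1/2 |
| Complete Link | 1/2 | 1/2 | 0 | 1/2 |
| Unweighted pair group method average (UPGMA) | $\frac{n_r}{n_r+n_s}$ | $\frac{n_s}{n_r+n_s}$ | 0 | 0 |
| Weighted pair group method average (WPGMA) | 1/2 | 1/2 | 0 | 0 |
| Unweighted pair group method centroid (UPGMC) | $\frac{n_r}{n_r+n_s}$ | $\frac{n_s}{n_r+n_s}$ | $\frac{-n_r n_s}{(n_r+n_s)^2}$ | 0 |
| Weighted pair group method centroid (WPGMC) | 1/2 | 1/2 | -1/4 | 0 |
| Ward's method | $\frac{n_r+n_k}{n_r+n_s+n_k}$ | $\frac{n_s+n_k}{n_r+n_s+n_k}$ | $\frac{-n_k}{n_r+n_s+n_k}$ | 0 |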
(After Jain & Dubes, 1988)
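As a rough illustration of how the general formula is applied, the sketch below assumes the standard Lance-Williams form, d(k,(r,s)) = α_r d(k,r) + α_s d(k,s) + β d(r,s) + γ |d(k,r) - d(k,s)|, with the coefficient values commonly reported for each method. The function name `lance_williams`, the method keys, and the input numbers are hypothetical and chosen only for this example.

```python
def lance_williams(d_kr, d_ks, d_rs, n_r, n_s, n_k, method="single"):
    """Distance from cluster k to the merged cluster (r, s).

    d_kr, d_ks, d_rs are the current inter-cluster distances and
    n_r, n_s, n_k the cluster sizes. Coefficients are the standard
    Lance-Williams values (an assumption of this sketch).
    """
    size = n_r + n_s
    if method == "single":
        a_r, a_s, beta, gamma = 0.5, 0.5, 0.0, -0.5
    elif method == "complete":
        a_r, a_s, beta, gamma = 0.5, 0.5, 0.0, 0.5
    elif method == "upgma":
        a_r, a_s, beta, gamma = n_r / size, n_s / size, 0.0, 0.0
    elif method == "wpgma":
        a_r, a_s, beta, gamma = 0.5, 0.5, 0.0, 0.0
    elif method == "upgmc":
        # Centroid methods are normally used with squared Euclidean distances
        a_r, a_s = n_r / size, n_s / size
        beta, gamma = -n_r * n_s / size ** 2, 0.0
    elif method == "wpgmc":
        a_r, a_s, beta, gamma = 0.5, 0.5, -0.25, 0.0
    elif method == "ward":
        # Ward's update is also stated for squared Euclidean distances
        total = n_r + n_s + n_k
        a_r, a_s = (n_r + n_k) / total, (n_s + n_k) / total
        beta, gamma = -n_k / total, 0.0
    else:
        raise ValueError(f"unknown method: {method}")
    return a_r * d_kr + a_s * d_ks + beta * d_rs + gamma * abs(d_kr - d_ks)

# Made-up example: three singleton clusters k, r, s
d_single = lance_williams(6.0, 5.0, 2.0, 1, 1, 1, "single")
d_complete = lance_williams(6.0, 5.0, 2.0, 1, 1, 1, "complete")
d_upgma = lance_williams(6.0, 5.0, 2.0, 1, 1, 1, "upgma")
print(d_single, d_complete, d_upgma)
```

With these inputs the single-link update reduces to min(6, 5) = 5 and the complete-link update to max(6, 5) = 6, which matches the minimum and maximum distance criteria; UPGMA gives the average, 5.5.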
In the next section, I will show how to compute hierarchical clustering using Single Linkage Hierarchical Clustering. The other linkage methods follow a similar computation and differ only in the formula used.
This tutorial is copyrighted.
The preferred reference for this tutorial is
Teknomo, Kardi. (2009) Hierarchical Clustering Tutorial. http://people.revoledu.com/kardi/tutorial/clustering/