Here is an example showing how this algorithm works. In this case we have 11 one-dimensional points, that is, single numbers, and every point is assigned a label. The figures below show what the different linkage approaches might look like visually.
Agglomerative clustering of a data set containing 100 points into 9 clusters.
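As a rough illustration of the bottom-up merging procedure, here is a minimal sketch on 11 one-dimensional points. The values are made up for illustration and are not the data from the example above. `agglomerate` is a hypothetical helper written for this note, not a library function.

```python
def agglomerate(points, n_clusters, linkage=min):
    """Naive agglomerative clustering of 1-D points.

    `linkage` reduces the pairwise distances between two clusters to a
    single number: min gives single linkage, max gives complete linkage.
    """
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest linkage distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(abs(points[i] - points[j])
                            for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)      # b > a, so index a is unaffected
        clusters[a].extend(merged)
    # Turn the list of clusters into one label per point.
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical data: 11 single numbers forming three visible groups.
points = [1, 2, 3, 10, 11, 12, 13, 30, 31, 33, 34]
single_labels = agglomerate(points, 3, linkage=min)
complete_labels = agglomerate(points, 3, linkage=max)
print(single_labels)
print(complete_labels)
```

On data this well separated, both rules recover the same three groups. The linkage rules start to disagree when the groups are elongated or connected by stray points.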
With single linkage, below, the result is often not appealing. For instance, the purple square in the lower left is a cluster consisting of a single point, and several other clusters are also single points, while all of the red circles are grouped into one big cluster. Why does this happen?
Single linkage is sometimes called the 'friends of friends' linkage method. It always takes the minimum distance when it updates the distance between clusters. Therefore, two points may be considered close under single linkage if they can be connected by a chain of points with small distances between consecutive points along the chain. The distance between the two end points of the chain is only as big as the longest link along the chain. The number of links along the chain does not matter.
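The chaining effect can be checked with a small sketch. It assumes 1-D points and uses the known equivalence between single linkage and processing pairwise distances in increasing order, as in Kruskal's minimum spanning tree algorithm. `single_linkage_join_height` is a hypothetical name, and the data are made up.

```python
from itertools import combinations


def single_linkage_join_height(points, a, b):
    """Distance at which points[a] and points[b] first land in the same
    cluster when clusters are merged in order of increasing distance."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent pointers to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (abs(points[i] - points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj           # merge the two clusters
        if find(a) == find(b):
            return dist
    return None


# Ten points spaced 1 apart: the end points are 9 apart, but every link is 1.
chain = [float(x) for x in range(10)]
print(single_linkage_join_height(chain, 0, 9))

# A chain whose longest link is 2 (between 2.0 and 4.0).
gapped = [0.0, 1.0, 2.0, 4.0, 5.0]
print(single_linkage_join_height(gapped, 0, 4))
```

In the first case the two end points join at height 1.0 even though they are 9.0 apart, and in the second case they join at 2.0, the longest link along the chain.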
Complete linkage, below, gives us a very different result. It tends to generate clusters that are not very well separated.
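To make the contrast with single linkage concrete, the following sketch computes the distance between two small clusters under three update rules: the minimum pairwise distance (single linkage), the maximum (complete linkage), and the mean (average linkage). The clusters are made-up 1-D values and the function names are hypothetical.

```python
from itertools import product


def pairwise(a, b):
    """All distances between a point in cluster a and a point in cluster b."""
    return [abs(x - y) for x, y in product(a, b)]


def single_link(a, b):
    return min(pairwise(a, b))      # closest pair decides


def complete_link(a, b):
    return max(pairwise(a, b))      # farthest pair decides


def average_link(a, b):
    d = pairwise(a, b)
    return sum(d) / len(d)          # mean over all pairs


left = [0.0, 1.0]
right = [3.0, 6.0]
print(single_link(left, right))     # 2.0
print(complete_link(left, right))   # 6.0
print(average_link(left, right))    # 4.0
```

The same two clusters are 2.0, 6.0, or 4.0 apart depending on the rule, which is why the merge order, and therefore the final clustering, can differ so much between methods.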
Average linkage results:
Ward's clustering, below, tends to generate results similar to k-means. It can be viewed as a greedy, bottom-up version of k-means, because the criterion Ward's clustering uses to pick which clusters to merge is the same within-cluster sum of squares that k-means optimizes. K-means operates in a top-down fashion, whereas Ward's clustering operates in a bottom-up fashion.
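The shared criterion can be sketched numerically. The example assumes 1-D data and made-up clusters, and the function names are hypothetical. It compares the standard closed-form Ward merge cost with the direct increase in the within-cluster sum of squares, the quantity k-means tries to keep small.

```python
import math


def sse(cluster):
    """Within-cluster sum of squared deviations from the cluster mean."""
    mean = sum(cluster) / len(cluster)
    return sum((x - mean) ** 2 for x in cluster)


def ward_merge_cost(a, b):
    """Closed-form increase in total within-cluster sum of squares
    caused by merging clusters a and b."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return len(a) * len(b) / (len(a) + len(b)) * (mean_a - mean_b) ** 2


a = [1.0, 2.0, 3.0]
b = [7.0, 9.0]
increase = sse(a + b) - sse(a) - sse(b)   # direct computation
print(ward_merge_cost(a, b), increase)    # both approximately 43.2
```

At each step Ward's clustering merges the pair with the smallest such cost, so it greedily keeps the k-means objective as low as a single merge allows.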