10.5 - Other Partitioning Methods


There are many other ways to cluster data.  Some of the popular ones are summarized briefly below.

1. Self-Organizing Maps (SOM): This method adds an underlying 'topology' (a neighborhood structure on a lattice) that relates cluster centroids to one another. Kohonen (1997), Tamayo et al. (1999).

2. Fuzzy K-means: Instead of partitioning the data into clusters, these methods provide, for each item and each cluster, the probability that the item is a member of the cluster. This gives fuzzy boundaries between clusters: each item has a membership probability for every centroid rather than a single hard assignment. The method is interesting, but its results are harder to visualize. Gasch and Eisen (2002).

3. Mixture-Based Clustering: This is implemented through an EM (Expectation-Maximization) algorithm. It provides 'soft partitioning' and allows for 'modeling' of cluster centroids and shapes. The idea is that each cluster should have only one center, with the items in the cluster distributed around the center. For example, within each cluster, the data might be multivariate Normal, in which case the density of the data should have a peak at the center and elliptical contours. Mixture-based clustering looks at the whole distribution and tries to determine how many peaks there are and how many components it can be decomposed into. It is similar to fuzzy K-means because, for each item, it produces a probability of membership in each cluster. Yeung et al. (2001), McLachlan et al. (2002).
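
To make the idea of 'soft partitioning' concrete, here is a minimal sketch of EM for a mixture of univariate Normal distributions, written in plain Python. It is an illustration under stated assumptions, not the implementation used in any of the cited papers: the data are simulated, the number of clusters is fixed at two rather than estimated, and the names `normal_pdf` and `fit_mixture_em` are hypothetical helpers introduced only for this example. The E-step computes, for each item and each cluster, the probability of membership; the M-step re-estimates each cluster's weight, center, and variance from those probabilities.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_mixture_em(data, means, n_iter=100):
    """Fit a univariate Normal mixture by EM, starting from the given means.

    Returns (weights, means, variances, resp), where resp[i][j] is the
    probability, from the final E-step, that item i belongs to cluster j.
    """
    k = len(means)
    means = list(means)
    weights = [1.0 / k] * k
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [overall_var] * k  # start every cluster with the pooled variance
    resp = []
    for _ in range(n_iter):
        # E-step: membership probabilities for every item and cluster.
        resp = []
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate each cluster from its probability-weighted items.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # guard against a collapsed cluster
    return weights, means, variances, resp


# Simulated data (an assumption for illustration): two clusters centered at 0 and 5.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(5.0, 1.0) for _ in range(200)]

weights, means, variances, resp = fit_mixture_em(data, means=[min(data), max(data)])
hard_labels = [row.index(max(row)) for row in resp]  # optional hard assignment
print("estimated centers:", [round(m, 2) for m in means])
print("mixing weights:", [round(w, 2) for w in weights])
```

Each row of `resp` sums to one, so an item lying between the two peaks receives an intermediate probability for each cluster instead of being forced into one of them. Taking the largest probability in each row, as `hard_labels` does, recovers an ordinary partition when one is needed.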