UPGMA

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by AdamRetchless (talk | contribs) at 20:47, 6 July 2007 (changed order and emphasis). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

UPGMA (Unweighted Pair Group Method with Arithmetic Mean) is a simple bottom-up (agglomerative) data clustering method used in bioinformatics to construct phylogenetic trees. UPGMA assumes a constant rate of evolution (the molecular clock hypothesis), and it is not well regarded for inferring phylogenetic trees unless this assumption has been tested and justified for the data set being used. UPGMA was initially designed for use in protein electrophoresis studies, but is currently most often used to produce guide trees for more sophisticated phylogenetic reconstruction algorithms.

The input is a collection of objects together with their pairwise distances; the output is a rooted tree (dendrogram).

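Initially, each object is in its own cluster. At each step, the two nearest clusters are combined into a higher-level cluster. The distance between any two clusters A and B is taken to be the average of all distances between pairs of objects "a" in A and "b" in B, that is, the sum of those distances divided by |A|·|B|. Merging continues until a single cluster remains, which forms the root of the tree.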
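The procedure described above can be sketched in Python. This is a minimal, unoptimized illustration and not a reference implementation: the function name `upgma`, the nested-tuple tree representation, the `frozenset`-keyed distance table, and the four-object distance data are all assumptions made for the example. It recomputes every average from the original object distances on each pass, which is simple but slow for large inputs.

```python
from itertools import combinations


def upgma(labels, dist):
    """Cluster `labels` bottom-up; `dist` maps frozenset({a, b}) to a distance.

    Returns (tree, merges). `tree` is a nested tuple of labels and `merges`
    lists (left_subtree, right_subtree, average_distance) in merge order.
    """
    # Each cluster is (subtree, members); initially one object per cluster.
    clusters = [(label, (label,)) for label in labels]

    def average_distance(c1, c2):
        # Mean of all object-to-object distances between the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1[1] for b in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    merges = []
    while len(clusters) > 1:
        # Pick the two nearest clusters under the average distance.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        d = average_distance(c1, c2)
        clusters.remove(c1)
        clusters.remove(c2)
        # Combine them into a higher-level cluster.
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
        merges.append((c1[0], c2[0], d))
    return clusters[0][0], merges


# Illustrative (made-up) distances between four objects.
labels = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 2, frozenset(("A", "C")): 4,
    frozenset(("A", "D")): 8, frozenset(("B", "C")): 6,
    frozenset(("B", "D")): 10, frozenset(("C", "D")): 9,
}
tree, merges = upgma(labels, dist)
print(tree)
for left, right, d in merges:
    print(left, right, d)
```

With this data, A and B merge first at distance 2; C then joins them at the average (4 + 6) / 2 = 5; D joins last at (9 + 8 + 10) / 3 = 9. In a dendrogram, each merge is conventionally drawn at half the merge distance, which the sketch leaves to the caller.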