Variance-stabilizing transformation

From Wikipedia, the free encyclopedia

In applied statistics, a variance-stabilizing transformation is a data transformation that is specifically chosen either to simplify considerations in graphical exploratory data analysis or to allow the application of simple regression-based or analysis of variance techniques.[1]

The aim behind the choice of a variance-stabilizing transformation is to find a simple function ƒ to apply to values x in a data set to create new values y = ƒ(x) such that the variability of the values y is not related to their mean value. For example, suppose that the values x are realizations from Poisson distributions with different mean values μ. Because the variance of a Poisson distribution is equal to its mean, the variance of x then varies with the mean. However, if the simple variance-stabilizing transformation

y = \sqrt{x}

is applied, the sampling variance associated with each observation will be nearly constant (approximately 1/4 when the mean is not too small): see Anscombe transform for details and some alternative transformations.
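
The effect described above can be checked with a small simulation. The following is a minimal sketch, not a reference implementation: the helper names `sample_poisson` and `variance_summary`, the sample size, and the seed are assumptions made for illustration, and the Poisson sampler (Knuth's multiplication method) is chosen only so that the example needs nothing beyond the Python standard library.

```python
import math
import random
import statistics


def sample_poisson(mu, rng):
    """Draw one Poisson(mu) variate with Knuth's multiplication method."""
    threshold = math.exp(-mu)  # adequate for moderate mu; underflows for very large mu
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def variance_summary(means, n=10000, seed=0):
    """Return (mu, variance of x, variance of sqrt(x)) for each mean in `means`."""
    rng = random.Random(seed)
    rows = []
    for mu in means:
        xs = [sample_poisson(mu, rng) for _ in range(n)]
        ys = [math.sqrt(x) for x in xs]
        rows.append((mu, statistics.pvariance(xs), statistics.pvariance(ys)))
    return rows


if __name__ == "__main__":
    for mu, var_x, var_y in variance_summary([10, 50, 100]):
        print(f"mean={mu:>3}  var(x)={var_x:7.2f}  var(sqrt(x))={var_y:.3f}")
```

Under these assumptions the variance of the raw counts should track the mean, while the variance of the square-rooted counts should stay close to 1/4 for every mean in the list.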

While variance-stabilizing transformations are well known for certain parametric families of distributions, such as the Poisson and the binomial distribution, some types of data analysis proceed more empirically: for example by searching among power transformations to find a suitable fixed transformation. Alternatively, if data analysis suggests a functional form for the relation between variance and mean, this can be used to deduce a variance-stabilizing transformation.[2] Thus if, for a mean μ,

\operatorname{var}(X) = g(\mu),

a suitable basis for a variance-stabilizing transformation would be

y=\int^x \frac{1}{\sqrt{g(v)}} \, dv,

where the arbitrary constant of integration can be chosen for convenience.
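
When g has no convenient antiderivative, the integral can be evaluated numerically. The sketch below is one possible approach, using composite Simpson's rule; the function name `stabilizing_transform` and its parameters `lower` and `steps` are hypothetical, with `lower` standing in for the arbitrary constant of integration. It assumes g is positive on the interval between `lower` and x.

```python
import math


def stabilizing_transform(x, g, lower=1.0, steps=1000):
    """Approximate the integral of 1/sqrt(g(v)) from `lower` to x (Simpson's rule).

    `lower` fixes the arbitrary constant of integration; g must be positive
    everywhere between `lower` and x.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (x - lower) / steps

    def integrand(v):
        return 1.0 / math.sqrt(g(v))

    total = integrand(lower) + integrand(x)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # odd interior nodes get 4, even ones get 2
        total += weight * integrand(lower + i * h)
    return total * h / 3.0


if __name__ == "__main__":
    # Variance equal to the mean (Poisson-like): expect roughly 2*sqrt(x) - 2.
    print(stabilizing_transform(4.0, lambda v: v))
    # Variance proportional to the squared mean: expect roughly log(x).
    print(stabilizing_transform(math.e, lambda v: v * v))
```

With g(v) = v the result agrees with the square-root transformation up to scale and shift, and with g(v) = v² it reproduces the logarithm, both shifted by the choice of `lower`.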

Relationship to the delta method
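
The integral formula for the transformation can be motivated by a rough first-order (delta method) argument. The following is a sketch rather than a rigorous derivation: it assumes that ƒ is differentiable, that X has mean μ and variance g(μ), and that X is concentrated closely enough around μ for a linear approximation to be reasonable. The constant c is an arbitrary target standard deviation introduced here for illustration.

```latex
% First-order Taylor expansion of f around the mean
f(X) \approx f(\mu) + f'(\mu)\,(X - \mu)

% Variance of the linear approximation
\operatorname{var}\bigl(f(X)\bigr) \approx f'(\mu)^2 \operatorname{var}(X) = f'(\mu)^2\, g(\mu)

% Requiring this to equal a constant c^2 for every \mu
f'(\mu) = \frac{c}{\sqrt{g(\mu)}}
\quad\Longrightarrow\quad
f(x) = c \int^x \frac{1}{\sqrt{g(v)}} \, dv

% Poisson case, g(\mu) = \mu, with c = 1
f(x) = \int^x \frac{1}{\sqrt{v}} \, dv = 2\sqrt{x},
\qquad
\operatorname{var}\bigl(2\sqrt{X}\bigr) \approx 1,
\qquad
\operatorname{var}\bigl(\sqrt{X}\bigr) \approx \tfrac{1}{4}
```

Taking c = 1 gives the integral stated earlier in the article, and the Poisson case gives the square-root transformation up to a constant factor.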

References

  1. Everitt, B.S. (2002) The Cambridge Dictionary of Statistics (2nd Edition), CUP. ISBN 0-521-81099-X
  2. Dodge, Y. (2003) The Oxford Dictionary of Statistical Terms, OUP. ISBN 0-19-920613-9