Controlling for a variable
In a scientific experiment measuring the effect of an independent variable on a dependent variable, controlling for a variable is a method of reducing the confounding effect of variations in a third variable that may also affect the value of the dependent variable. For example, in an experiment to determine the effect of nutrition (the independent variable) on organism growth (the dependent variable), the age of the organism (the third variable) needs to be controlled for, since growth may also depend on the age of an individual organism.
Random assignment of individuals to the experimental group and the control group avoids systematic selection bias to a certain extent and reduces confounding, but explicitly controlling for a variable tends to reduce the experimental error further.
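Random assignment can be sketched in a few lines of Python. The function name `randomly_assign`, the subject identifiers, and the fixed seed are illustrative assumptions only; a real experiment would follow its own randomization protocol.

```python
import random


def randomly_assign(subject_ids, seed=0):
    """Split subjects into two groups of (nearly) equal size at random.

    `seed` is a hypothetical parameter used here only to make the
    illustration reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)          # random order, independent of any subject trait
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Twenty hypothetical subjects, numbered 1 to 20.
experimental, control = randomly_assign(range(1, 21), seed=42)
print(len(experimental), len(control))
```

Because the split ignores every property of the subjects, any third variable such as age is balanced between the groups only on average, which is why explicit controlling can still help.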
The essence of the method is to ensure that comparisons between the control group and the experimental group are only made for groups or subgroups for which the variable to be controlled has (as much as possible) the same statistical distribution. A common way to achieve this is to partition the groups into subgroups whose members have (nearly) the same value for the controlled variable.
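The partitioning approach can be illustrated with a small Python sketch. The records below are invented for illustration (two age bands, with a true treatment effect of 2 growth units in each band), and the function names are assumptions rather than part of any standard library.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (group, age_band, growth).
# The treated group is mostly young and the control group mostly old,
# so age is confounded with treatment.
records = [
    ("treated", "young", 12.0), ("treated", "young", 12.0),
    ("treated", "young", 12.0), ("treated", "old", 6.0),
    ("control", "young", 10.0), ("control", "old", 4.0),
    ("control", "old", 4.0), ("control", "old", 4.0),
]


def naive_difference(rows):
    """Difference in mean growth, ignoring age entirely."""
    treated = [g for grp, _, g in rows if grp == "treated"]
    control = [g for grp, _, g in rows if grp == "control"]
    return mean(treated) - mean(control)


def stratified_difference(rows):
    """Compare groups only within each age band, then average the
    within-band differences weighted by band size."""
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for grp, band, growth in rows:
        strata[band][grp].append(growth)

    weighted_sum, total = 0.0, 0
    for groups in strata.values():
        if not groups["treated"] or not groups["control"]:
            continue  # a band without both groups allows no comparison
        n = len(groups["treated"]) + len(groups["control"])
        weighted_sum += n * (mean(groups["treated"]) - mean(groups["control"]))
        total += n
    if total == 0:
        raise ValueError("no stratum contains both groups")
    return weighted_sum / total


naive = naive_difference(records)          # inflated by the age imbalance
adjusted = stratified_difference(records)  # compares like with like
print(naive, adjusted)
```

In this constructed data the naive comparison gives 5.0, while the stratified comparison recovers the within-band effect of 2.0, because the first figure mixes the effect of treatment with the effect of age.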
Controlling for a variable is also a term used in statistical data analysis, where inferences about the relationships within one set of variables may need to account for the possibility that some of these relationships spuriously reflect relationships to variables in another set. This is broadly equivalent to conditioning on the variables in the second set, although in techniques related to linear regression only linear relations are taken into account. Such analyses may be described as "controlling for variable X" or "controlling for the variations in X". Controlling in this sense is performed by including in the regression not only the explanatory variables of interest but also the extraneous variables. Failing to do so results in omitted-variable bias.
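Omitted-variable bias can be demonstrated with a minimal Python sketch. The data are constructed, not observed: the outcome is defined as y = 2x + 3z with no noise, and x is correlated with z. Instead of fitting the multiple regression directly, the sketch uses residualization (the Frisch-Waugh-Lovell result), which yields the same coefficient on x as a regression of y on both x and z. All names are illustrative assumptions.

```python
from statistics import mean


def slope_intercept(xs, ys):
    """Ordinary least squares fit of ys = a + b * xs; returns (b, a)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return b, my - b * mx


def residuals(xs, ys):
    """Part of ys that is not linearly explained by xs."""
    b, a = slope_intercept(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Constructed data: z is the extraneous variable, x the variable of interest.
z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 0.0, 2.0, 3.0, 3.0, 6.0]               # correlated with z
y = [2 * xi + 3 * zi for xi, zi in zip(x, z)]    # true coefficient on x is 2

# Regression that omits z: the slope on x absorbs part of z's effect.
naive_slope, _ = slope_intercept(x, y)

# Controlling for z: remove z's linear contribution from both x and y,
# then regress what is left of y on what is left of x.
adjusted_slope, _ = slope_intercept(residuals(z, x), residuals(z, y))

print(naive_slope, adjusted_slope)
```

With z omitted, the estimated slope on x is well above the true value of 2; after controlling for z, the estimate returns to 2 up to floating-point error. As the text notes, this adjustment removes only the linear dependence on z.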