Principal component regression
In statistics, principal component regression (PCR) is a regression analysis technique that uses principal component analysis when estimating regression coefficients. It is used to overcome problems that arise when the explanatory variables are close to being collinear.[1]
In PCR, instead of regressing the dependent variable on the independent variables directly, one regresses it on the principal components of the independent variables. Typically only a subset of the principal components is used in the regression, which makes PCR a kind of regularized estimation.
Often the principal components with the highest variance are selected. However, the low-variance principal components may also be important, in some cases even more important.[2]
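The point about low-variance components can be illustrated with a small synthetic example. The following is a minimal sketch, assuming NumPy is available; the data, the seed, and all variable names are invented for illustration and do not come from the cited references. Two nearly collinear predictors are built so that the response depends only on their difference, which is the low-variance direction.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, chosen arbitrarily
n = 500
z = rng.normal(size=n)                  # shared factor: makes x1 and x2 nearly collinear
d = 0.1 * rng.normal(size=n)            # small difference between the two predictors
X = np.column_stack([z + d, z - d])

# Assumed response: depends only on x1 - x2, i.e. on the low-variance direction
y = X[:, 0] - X[:, 1] + 0.01 * rng.normal(size=n)

# PCA via SVD of the centered predictors; rows of Vt are the principal directions
Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T                      # component scores, ordered by decreasing variance

explained = s**2 / np.sum(s**2)         # share of predictor variance per component
corr = [abs(np.corrcoef(scores[:, j], y)[0, 1]) for j in range(2)]

print("explained variance share:", explained)
print("|correlation with y|:    ", corr)
```

In this construction the first component carries almost all of the predictor variance but is nearly uncorrelated with the response, while the second component carries very little variance and is almost perfectly correlated with it. Keeping only the highest-variance component would therefore discard most of the predictive information.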
The principle
PCR (principal components regression) is a regression method that can be divided into three steps:[citation needed]
- The first step is to run a principal components analysis on the table of the explanatory variables.
- The second step is to run an ordinary least squares regression (linear regression) on the selected components; the components that are most correlated with the dependent variable are selected.
- Finally, the parameters of the model are computed for the explanatory variables by transforming the coefficients of the selected components back to the original variable space.
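The three steps above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not a reference implementation: the function name `pcr_fit` and its parameters are hypothetical, the variables are centered but not scaled, and components are ranked by the absolute correlation of their scores with the dependent variable, as described in the second step (ranking by variance is a common alternative).

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Hypothetical helper: PCR coefficients and intercept for the original variables."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean

    # Step 1: principal components analysis via SVD of the centered predictors
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = s > 1e-12 * s[0]              # drop directions with (numerically) zero variance
    V_all = Vt[keep].T                   # columns are principal directions
    scores_all = Xc @ V_all

    # Step 2: select the components most correlated with y, then run OLS on them
    corr = np.abs(scores_all.T @ yc) / (
        np.linalg.norm(scores_all, axis=0) * np.linalg.norm(yc)
    )
    chosen = np.argsort(corr)[::-1][:n_components]
    V = V_all[:, chosen]
    gamma, *_ = np.linalg.lstsq(Xc @ V, yc, rcond=None)

    # Step 3: transform the component coefficients back to the original variables
    beta = V @ gamma
    intercept = y_mean - x_mean @ beta
    return beta, intercept
```

When `n_components` equals the number of explanatory variables and the centered data matrix has full column rank, this reduces to ordinary least squares; choosing fewer components gives the regularized estimate.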
See also
- Canonical correlation
- Deming regression
- Multilinear subspace learning
- Partial least squares regression
- Principal component analysis
- Total sum of squares
References
- Dodge, Y. (2003). The Oxford Dictionary of Statistical Terms. OUP. ISBN 0-19-920613-9.
- Jolliffe, I. T. (1982). "A note on the Use of Principal Components in Regression". Journal of the Royal Statistical Society, Series C 31 (3): 300–303. doi:10.2307/2348005. JSTOR 2348005.
- Kramer, R. (1998). Chemometric Techniques for Quantitative Analysis. Marcel-Dekker. ISBN 0-8247-0198-4.