Least trimmed squares

From Wikipedia, the free encyclopedia
Jump to: navigation, search

Least trimmed squares (LTS), or least trimmed sum of squares, is a robust statistical method that fits a function to a set of data whilst not being unduly affected by the presence of outliers. It is one of a number of methods for robust regression.

Description of method[edit]

Instead of the standard least squares method, which minimises the sum of squared residuals over n points, the LTS method attempts to minimise the sum of squared residuals over a subset, k, of those points. The n-k points which are not used do not influence the fit.

In a standard least squares problem, the estimated parameter values, β, are defined to be those values that minimise the objective function, S(β), of squared residuals

S=\sum_{i=1}^{n}{r_i(\beta)}^2,

where the residuals are defined as the differences between the values of the dependent variables (observations) and the model values

r_i(\beta)= y_i - f(x_i, \beta),

and where n is the overall number of data points. For a least trimmed squares analysis, this objective function is replaced by one constructed in the following way. For a fixed value of β, let  r_{(j)}(\beta) denote the set of ordered absolute values of the residuals (in increasing order of absolute value). In this notation, the standard sum of squares function is

S(\beta)=\sum_{j=1}^n |r_{(j)}(\beta)|^2,

while the objective function for LTS is

S_k(\beta)=\sum_{j=1}^k |r_{(j)}(\beta)|^2.

Computational considerations[edit]

Because this method is binary, in that points are either included or excluded, no closed form solution exists. As a result, methods which try to find a LTS solution through a problem sift through combinations of the data, attempting to find the k subset which yields the lowest sum of squared residuals. Methods exist for low n which will find the exact solution, however as n rises, the number of combinations grows rapidly, thus yielding methods which attempt to find approximate (but generally sufficient) solutions.

References[edit]

  • Rousseeuw, P. J. (1984) "Least Median of Squares Regression" Journal of the American Statistical Association, 79, 871–880. JSTOR 2288718
  • Rousseeuw, P. J., Leroy A.M. (1987) Robust Regression and Outlier Detection, Wiley. ISBN 978-0-471-85233-9 (Published online 2005 doi: 10.1002/0471725382 )
  • Li, L.M. (2005) "An algorithm for computing exact least-trimmed squares estimate of simple linear regression with constraints", Computational Statistics & Data Analysis, 48 (4), 717–734. doi: 10.1016/j.csda.2004.04.003.
  • Atkinson, A.C., Cheng, T.-C. (1999) "Computing least trimmed squares regression with the forward search", Statistics and Computing, 9 (4), 251–263. doi: 10.1023/A:1008942604045
  • Jung, Kang-Mo (2007) "Least Trimmed Squares Estimator in the Errors-in-Variables Model", Journal of Applied Statistics, 34 (3), 331–338. doi: 10.1080/02664760601004973