Least trimmed squares

Least trimmed squares (LTS), or least trimmed sum of squares, is a robust statistical method that fits a function to a set of data whilst not being unduly affected by the presence of outliers. It is one of a number of methods for robust regression.

Description of method

Instead of the standard least squares method, which minimises the sum of squared residuals over all n points, the LTS method attempts to minimise the sum of squared residuals over a subset of ${\displaystyle k}$ of those points. The remaining ${\displaystyle n-k}$ points, those with the largest residuals, do not influence the fit.

In a standard least squares problem, the estimated parameter values β are defined to be those values that minimise the objective function S(β), the sum of squared residuals:

${\displaystyle S(\beta )=\sum _{i=1}^{n}r_{i}(\beta )^{2},}$

where the residuals are defined as the differences between the observed values of the dependent variable and the model values:

${\displaystyle r_{i}(\beta )=y_{i}-f(x_{i},\beta ),}$

and where n is the overall number of data points. For a least trimmed squares analysis, this objective function is replaced by one constructed in the following way. For a fixed value of β, let ${\displaystyle r_{(j)}(\beta )}$ denote the residual with the j-th smallest absolute value, so that ${\displaystyle |r_{(1)}(\beta )|\leq |r_{(2)}(\beta )|\leq \cdots \leq |r_{(n)}(\beta )|}$. In this notation, the standard sum of squares function is

${\displaystyle S(\beta )=\sum _{j=1}^{n}r_{(j)}(\beta )^{2},}$

while the objective function for LTS is

${\displaystyle S_{k}(\beta )=\sum _{j=1}^{k}r_{(j)}(\beta )^{2}.}$
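
As an illustration of the definition above, the following minimal Python sketch evaluates ${\displaystyle S_{k}(\beta )}$ from a list of residuals that has already been computed for some fixed β. The function name `lts_objective` and the sample residuals are assumptions made for this example only; they do not come from the cited literature.

```python
def lts_objective(residuals, k):
    """Return S_k: the sum of the k smallest squared residuals."""
    n = len(residuals)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Ordering by squared value is the same as ordering by absolute value.
    squared = sorted(r * r for r in residuals)
    return sum(squared[:k])


# Hypothetical residuals for one fixed beta; the last value is an outlier.
residuals = [0.5, -1.0, 0.2, 8.0]

full = lts_objective(residuals, k=4)     # k = n: ordinary sum of squares, about 65.29
trimmed = lts_objective(residuals, k=3)  # largest residual trimmed, about 1.29
```

With k = n the function reduces to the ordinary sum of squares; with k < n the single large residual no longer contributes to the objective.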

Computational considerations

Because the objective is combinatorial, in that each point is either included in or excluded from the fit, no closed-form solution exists in general. Algorithms for LTS therefore search over subsets of the data for the subset of k points that yields the lowest sum of squared residuals. Exact methods exist for small n; as n grows, however, the number of possible subsets, ${\displaystyle {\tbinom {n}{k}}}$, grows rapidly, so methods for larger problems seek approximate (but generally sufficient) solutions.
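
For small n, the subset search can be sketched by brute force. The example below assumes a straight-line model y = a + bx, fits ordinary least squares to every subset of k points, and keeps the fit with the smallest sum of squared residuals. For a fixed β the k smallest squared residuals form the best subset of size k, so the minimum over all subsets coincides with the minimum of ${\displaystyle S_{k}(\beta )}$. The helper names `ols_line` and `exact_lts_line` and the data are hypothetical, and this is not the algorithm of any of the references listed below.

```python
from itertools import combinations


def ols_line(points):
    """Ordinary least squares fit of y = a + b*x; returns (a, b, sse)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # If every x is identical the slope is not identified; use a flat line.
    b = sxy / sxx if sxx > 0 else 0.0
    a = mean_y - b * mean_x
    sse = sum((y - a - b * x) ** 2 for x, y in points)
    return a, b, sse


def exact_lts_line(points, k):
    """Exhaustive LTS for a line: try every k-subset, keep the lowest SSE."""
    best = None
    for subset in combinations(points, k):
        fit = ols_line(subset)
        if best is None or fit[2] < best[2]:
            best = fit
    return best


# Hypothetical data: five points on y = 2x and one gross outlier.
points = [(0, 0.0), (1, 2.0), (2, 4.0), (3, 6.0), (4, 8.0), (5, 40.0)]

a_ols, b_ols, _ = ols_line(points)            # slope pulled up by the outlier
a_lts, b_lts, sse_lts = exact_lts_line(points, k=5)  # recovers a = 0, b = 2
```

The loop visits every one of the ${\displaystyle {\tbinom {n}{k}}}$ subsets, which is why this approach is practical only for small data sets.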

References

• Rousseeuw, P. J. (1984). "Least Median of Squares Regression". Journal of the American Statistical Association. 79: 871–880. doi:10.1080/01621459.1984.10477105. JSTOR 2288718.
• Rousseeuw, P. J.; Leroy, A. M. (2005) [1987]. Robust Regression and Outlier Detection. Wiley. doi:10.1002/0471725382. ISBN 978-0-471-85233-9.
• Li, L. M. (2005). "An algorithm for computing exact least-trimmed squares estimate of simple linear regression with constraints". Computational Statistics & Data Analysis. 48 (4): 717–734. doi:10.1016/j.csda.2004.04.003.
• Atkinson, A. C.; Cheng, T.-C. (1999). "Computing least trimmed squares regression with the forward search". Statistics and Computing. 9 (4): 251–263. doi:10.1023/A:1008942604045.
• Jung, Kang-Mo (2007). "Least Trimmed Squares Estimator in the Errors-in-Variables Model". Journal of Applied Statistics. 34 (3): 331–338. doi:10.1080/02664760601004973.