Talk:Ordinary least squares
|Ordinary least squares has been listed as a level-4 vital article in Mathematics. If you can improve it, please do. This article has been rated as B-Class.|
|WikiProject Statistics||(Rated B-class, Top-importance)|
|WikiProject Mathematics||(Rated B-class, Low-importance)|
- 1 Regression articles discussion July 2009
- 2 OLS
- 3 The variance of the estimator
- 4 Comment from main article on assumptions
- 5 Why does "linear least-squares" redirect here?
- 6 Hypothesis Testing
- 7 Linear Least Squares
- 8 Vertical or Euclidian Distances?
- 9 Example with real data
- 10 On multicollinearity
- 11 Estimation
- 12 Alternative Derivations: Geometric Approach
- 13 Too technical? Should it be rewritten "one level down"?
Regression articles discussion July 2009
A discussion of content overlap of some regression-related articles has been started at Talk:Linear least squares#Merger proposal but it isn't really just a question of merging and no actual merge proposal has been made. Melcombe (talk) 11:37, 14 July 2009 (UTC)
Don't merge them; this article is about OLS. To put it simply, this article is only for OLS, so don't write much about GLS etc. in it. But we can go deeper into OLS here. I want to delete much of the material that is not about OLS; that material can be moved to the linear regression article. For example, section 1 should be simplified. Jackzhp (talk) 16:25, 25 March 2010 (UTC)
The variance of the estimator
- According to my calculations, the variance is equal to
- where $m_{ii}$ is the i-th diagonal element of the annihilator matrix $M$, and $\gamma_2$ is the kurtosis of the distribution of the error terms. When the kurtosis is positive we can obtain an upper bound from the fact that , and the sum of the $m_{ii}$'s is the trace of $M$, which is $n - p$:
- // stpasha » 00:36, 25 April 2010 (UTC)
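The displayed formulas in the comment above are not reproduced on this page. What follows is a reconstruction that fits its wording, not a quotation. It assumes (none of this is stated in the comment) that the estimator meant is $s^2 = \hat\varepsilon'\hat\varepsilon/(n-p)$, that the errors are i.i.d. with variance $\sigma^2$, and that $\gamma_2$ is the excess kurtosis.

```latex
\begin{align}
\operatorname{Var}[s^2]
  &= \sigma^4 \left( \frac{2}{n-p}
     + \gamma_2 \, \frac{\sum_{i=1}^{n} m_{ii}^2}{(n-p)^2} \right), \\
% M is symmetric and idempotent, so 0 <= m_ii <= 1 and hence m_ii^2 <= m_ii
\sum_{i=1}^{n} m_{ii}^2
  &\le \sum_{i=1}^{n} m_{ii} = \operatorname{tr} M = n-p, \\
\operatorname{Var}[s^2]
  &\le \sigma^4 \, \frac{2+\gamma_2}{n-p}
  \qquad (\gamma_2 \ge 0).
\end{align}
```

With normal errors ($\gamma_2 = 0$) the first line reduces to the familiar $2\sigma^4/(n-p)$.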
Comment from main article on assumptions
I'm moving this from the main article, it was included as a comment in the section on assumptions:
- Can we just replace this section with the following one line?
- Please discuss with me in the discussion page. Let's keep material only related to OLS, anything else should be deleted. If you want, please move it to linear regression article.
- Correlation between data points can be discussed, but not in this section. Given , we can clearly see this.
- Identifiability can be discussed, but should be in the estimation section
I disagree with this proposal. Some written explanation is much more useful than a single mathematical expression. The current text is not unduly digressive. Skbkekas (talk) 22:29, 25 March 2010 (UTC)
Why does "linear least-squares" redirect here?
This page does not make any sense to someone who is just interested in the general problem $\min_x \|Ax - b\|$, since this page seems extremely application-specific. Did some error occur when this strange redirection happened? - Preceding unsigned comment added by Short rai (talk • contribs) 09:54, 23 May 2010 (UTC)
- See Ordinary least squares#Geometric approach. // stpasha » 19:07, 23 May 2010 (UTC)
- But this is a major subject in virtually every subfield of applied mathematics, and you refer everyone who is not in statistics to one tiny paragraph called "geometric approach"? I think there recently was a redirect and/or merge process going on with other articles about the topic, and there seems to have been a page called "linear least squares" earlier, but that page is impossible to find now. Short rai (talk) 23:14, 23 May 2010 (UTC)
- Ya, there used to be a backlink in the see also section, I have restored it now. And by the way, the OLS regression and the problem of minimization of the norm of Ax−b are exactly the same problems, only written in different notations. The difference is that statisticians use X and y for known quantities, and β for unknown. Mathematicians use those symbols in the opposite way. // stpasha » 03:53, 24 May 2010 (UTC)
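The point about notation can be checked numerically. The sketch below is only an illustration on synthetic data (the sizes, seed and coefficients are made up): it solves the normal equations in the statisticians' notation and then hands the same arrays to a generic least-squares solver under the names A and b.

```python
import numpy as np

rng = np.random.default_rng(0)  # synthetic data, purely illustrative
n, p = 50, 3

# Statistics notation: X and y are known, beta is unknown.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.1, size=n)

# OLS estimate from the normal equations X'X beta = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Numerical-analysis notation: A and b are known, x is unknown,
# and x minimises the Euclidean norm of A x - b.
A, b = X, y
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)

print(beta_hat)
print(x_star)
```

The two printed vectors should agree up to floating-point error, since both solve the same minimisation problem.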
Linear Least Squares
FYI, there is a discussion on the usage of "Linear least squares", which currently redirects here but was previously another topic, at Talk:Numerical methods for linear least squares.
Vertical or Euclidian Distances?
The article states that OLS minimizes the sum of squared distances from the points to the estimated regression line. But we are taught in standard (Euclidean) geometry that the distance between a point and a line is defined as the length of the perpendicular line segment connecting the two. This is not what OLS minimizes. Rather, it minimizes the vertical distance between the points and the line. Shouldn't the article say as much? - Preceding unsigned comment added by 126.96.36.199 (talk) 17:00, 2 November 2010 (UTC)
- I've changed that sentence, so that it says "vertical distances" now. // stpasha » 20:53, 2 November 2010 (UTC)
- I have a problem where I do want to minimize the sum of squared Euclidean distances from a point to a set of given straight lines in the plane. Can anyone give a reference or some keywords? (The problem occurs in surveying, when many observers at known locations can see the same point at an unknown location. Each observer can measure its bearing to the target point. This gives a set of lines that ideally should intersect at the target point. But measurement errors give an overdetermined problem if there are more than two observers.) Mikrit (talk) 15:41, 22 November 2010 (UTC)
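The surveying problem above is itself a linear least-squares problem in the unknown point. A minimal sketch, assuming each sight line is given by an observer position a_k and a direction d_k: the squared perpendicular distance from p to line k is ||(I - d_k d_k')(p - a_k)||^2, and setting the gradient of the sum to zero gives a small linear system. The function name and the test coordinates are hypothetical, chosen only for illustration.

```python
import numpy as np

def nearest_point_to_lines(origins, directions):
    """Point minimising the sum of squared perpendicular distances to lines.

    Line k passes through origins[k] with direction directions[k].
    (Hypothetical helper written for this illustration.)
    """
    a = np.asarray(origins, dtype=float)
    d = np.asarray(directions, dtype=float)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)  # unit directions
    dim = a.shape[1]
    # P_k = I - d_k d_k' keeps only the component perpendicular to line k.
    P = np.eye(dim)[None, :, :] - d[:, :, None] * d[:, None, :]
    lhs = P.sum(axis=0)                    # sum_k P_k
    rhs = np.einsum("kij,kj->i", P, a)     # sum_k P_k a_k
    return np.linalg.solve(lhs, rhs)

# Three observers with error-free bearings to a made-up target at (2, 3).
target = np.array([2.0, 3.0])
origins = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 6.0]])
directions = target - origins
estimate = nearest_point_to_lines(origins, directions)
print(estimate)
```

With noisy bearings the lines no longer meet in one point and the same call returns the compromise position. The system is singular only when all lines are parallel.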
Example with real data
Are the calculations of the Akaike criterion and Schwarz criterion correct here? I know that there are many "different" forms of the AIC and SIC - but I just can't figure out how these were calculated. Certainly they seem inconsistent with the forms that are linked-to in the description that follows the calculated values.—Preceding unsigned comment added by 188.8.131.52 (talk) 05:32, 30 November 2010 (UTC)
I'll try again
Way back in 2008 I came across this example calculation, looked at the plot of (x,y), and noticed an odd cadence in the positioning. Some simple inspection soon showed that the original height data were in inches, and whoever converted them to metric bungled the job. The conversion factor is 2.54 cm to an inch, and rounding to the nearest centimetre is not a good idea. This makes a visible difference and considerably changes the results of an attempt at a quadratic fit. It doesn't matter what statistical packages might be used, to whatever height of sophistication, if the input data are wrongly prepared. The saving grace is the presence of the plot of the data, but that plot has to be looked at, not just gazed at with vague approbation. In the various reorganisations, my report of this problem has been lost, and the editors ignored its content. The error remains, so I'll try again.
|Height^2|Height|Const.||
|61.96033|-143.162|128.8128|Improper rounding of inches to whole cm.|
|58.5046|-131.5076|119.0205|Proper conversion, no rounding.|
The original incorrectly-converted plot can be reproduced, but here is a plot of the correctly-converted heights, with a quadratic fit. Notice the now-regular spacing of the x-values, without cadence.
For the incorrectly-converted heights, the residuals are
Whereas for the correctly-converted height data, the residuals are much smaller. (Note the vertical scale)
And indeed, this pattern of residuals rather suggests a higher-order fit attempt, such as with a cubic.
But assessing the merit of this escalation would be helped by the use of some more sophisticated analysis, such as might be offered by the various fancy packages, if fed correct data.
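The effect of the rounding can be reproduced with a short sketch. Two assumptions are made here and are not stated in the thread: that the original heights were the whole inches 58 to 72, and that the weights in kilograms can be recovered by dividing the first row of pound values quoted further down this thread by 2.20462234. The helper name quadratic_rss is made up for the illustration.

```python
import numpy as np

# ASSUMPTION: original heights were whole inches 58..72 (not stated in the thread).
inches = np.arange(58, 73)
exact_m = inches * 0.0254                    # proper conversion, no rounding
rounded_m = np.round(inches * 2.54) / 100.0  # rounded to whole centimetres

# ASSUMPTION: weights recovered from the pound values quoted in this thread.
pounds = np.array([115.1033, 117.1095, 120.1078, 123.1061, 126.1044,
                   129.1247, 132.123, 135.1213, 139.1337, 142.132,
                   146.1224, 150.1348, 154.1472, 159.1517, 164.1562])
weight_kg = pounds / 2.20462234

def quadratic_rss(x, y):
    """Fit y = c2*x^2 + c1*x + c0; return coefficients and residual sum of squares."""
    coeffs = np.polyfit(x, y, deg=2)  # highest power first: [c2, c1, c0]
    residuals = y - np.polyval(coeffs, x)
    return coeffs, float(residuals @ residuals)

coef_exact, rss_exact = quadratic_rss(exact_m, weight_kg)
coef_rounded, rss_rounded = quadratic_rss(rounded_m, weight_kg)

spacing_exact = np.diff(exact_m)      # constant step of 0.0254 m
spacing_rounded = np.diff(rounded_m)  # alternates between 0.02 and 0.03 m

print("exact  :", coef_exact, rss_exact)
print("rounded:", coef_rounded, rss_rounded)
```

If the assumptions hold, the irregular 2 cm / 3 cm steps in spacing_rounded are the "cadence" described above, and the printed coefficients can be compared with the two rows of the table.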
Later, it occurred to me to consider whether the weights might have been given in pounds. The results were odd in another way. Using the conversion 1KG = 2.20462234 lbs used in the USA, the weights are
115.1033 117.1095 120.1078 123.1061 126.1044 129.1247 132.123 135.1213 139.1337 142.132 146.1224 150.1348 154.1472 159.1517 164.1562
114.862 116.864 119.856 122.848 125.84 128.854 131.846 134.838 138.842 141.834 145.816 149.82 153.824 158.818 163.812
Possible error in formula for standard error for coefficients
It seems to me that the 1/n should not be included in the formula for the standard errors for each coefficient. With 1/n, the values calculated in this example are not produced. Removing it generates the values given in the example. Would someone more knowledgeable in this subject examine this and correct the formula, if necessary? — Preceding unsigned comment added by 184.108.40.206 (talk) 17:37, 22 May 2012 (UTC)
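One possible source of the discrepancy is offered here as an assumption, since the formula as it stood at the time is not quoted in this thread. A factor 1/n is correct only when it is paired with the inverse of Q_xx = X'X/n; paired with (X'X)^{-1} it shrinks every standard error by sqrt(n). The sketch below uses synthetic data to compare the three variants.

```python
import numpy as np

rng = np.random.default_rng(1)  # synthetic data, purely illustrative
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([0.5, 1.5, -2.0]) + rng.normal(size=n)
p = X.shape[1]

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
s2 = resid @ resid / (n - p)  # unbiased estimate of the error variance

# Variant 1: no 1/n, using the inverse of X'X.
se_xtx = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))

# Variant 2: 1/n together with the inverse of Q_xx = X'X / n.
Q_xx = X.T @ X / n
se_qxx = np.sqrt(s2 / n * np.diag(np.linalg.inv(Q_xx)))

# Variant 3 (inconsistent): 1/n together with the inverse of X'X.
se_mixed = np.sqrt(s2 / n * np.diag(np.linalg.inv(X.T @ X)))

print(se_xtx, se_qxx, se_mixed)
```

Variants 1 and 2 coincide, while variant 3 is too small by a factor of sqrt(n). That would be consistent with the observation that removing 1/n reproduces the values in the example.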
On multicollinearity
Multicollinearity means a high level of correlation between variables. OLS can handle this fine; it just needs more data to do it. However, in the article, "multicollinearity" is being used to mean perfect collinearity, i.e. the data matrix does not have full column rank. This is confusing. I propose we stop using multicollinearity to mean lack of full rank, and just say "not full rank". - Preceding unsigned comment added by 220.127.116.11 (talk) 04:21, 10 March 2011 (UTC)
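The distinction drawn above can be illustrated with a small sketch on synthetic data (the noise scale and seed are arbitrary choices). A highly correlated regressor leaves the data matrix with full column rank but badly conditioned, whereas an exact linear dependence reduces the rank.

```python
import numpy as np

rng = np.random.default_rng(2)  # synthetic data, purely illustrative
n = 200
x1 = rng.normal(size=n)
x2_near = x1 + 0.01 * rng.normal(size=n)  # highly correlated with x1, not identical
x2_exact = 2.0 * x1                       # exact linear dependence on x1

X_near = np.column_stack([np.ones(n), x1, x2_near])
X_exact = np.column_stack([np.ones(n), x1, x2_exact])

corr_near = np.corrcoef(x1, x2_near)[0, 1]
rank_near = np.linalg.matrix_rank(X_near)    # full column rank
rank_exact = np.linalg.matrix_rank(X_exact)  # rank deficient
cond_near = np.linalg.cond(X_near)           # large, but finite

print(corr_near, rank_near, rank_exact, cond_near)
```

In the first case the OLS estimate exists and is unique, though imprecise; in the second, X'X is singular and the estimate is not unique.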
Estimation
In the section "Simple regression model": it is not true that $\hat\beta = \operatorname{Cov}[x, y] / \operatorname{Var}[x]$; this only holds for the true parameters and not the estimator. Rather, $\hat\beta$ is equal to the sample covariance over the sample variance. Maybe use a hat over Cov and Var to signify this if you want the relation to be stressed.
Also, in my opinion a lot of time is being devoted to the use of the annihilator matrix. It doesn't seem necessary to introduce the extra notation unless one wants to go into the Frisch-Waugh-Lovell theorem, and this has its own separate page. - Preceding unsigned comment added by Superpronker (talk • contribs) 06:46, 1 June 2011 (UTC)
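The point about the slope can be checked numerically. In the sketch below (synthetic data; the true slope of 2 is an arbitrary choice) the fitted slope from a simple regression with intercept matches the ratio of sample covariance to sample variance, and in general differs from the population value.

```python
import numpy as np

rng = np.random.default_rng(3)  # synthetic data, purely illustrative
x = rng.normal(size=30)
y = 1.0 + 2.0 * x + rng.normal(size=30)  # true slope is 2 by construction

# Simple regression with an intercept, fitted by least squares.
X = np.column_stack([np.ones_like(x), x])
alpha_hat, beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# Sample (hatted) covariance and variance; the divisor cancels in the ratio.
sample_cov = np.cov(x, y, ddof=1)[0, 1]
sample_var = np.var(x, ddof=1)
ratio = sample_cov / sample_var

print(beta_hat, ratio)
```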
I believe that a clarification on notation would help immensely. The regressor values for one observation are referred to as the *column* vector $x_i$. However, in the design matrix of regressor values, the values for an observation occupy a *row*. Hence, it is easy to fall into the trap of thinking of $x_i$ as a row vector, leading to confusion. Craniator (talk) 05:34, 3 May 2015 (UTC)
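A small sketch of the convention being described, under the assumption that the article writes the design matrix with the transposed observation vectors x_i' as its rows (the numbers are arbitrary):

```python
import numpy as np

# Design matrix with n = 4 observations (rows) and p = 3 regressors (columns).
X = np.array([[1.0,  2.0,  3.0],
              [1.0,  5.0,  7.0],
              [1.0, 11.0, 13.0],
              [1.0, 17.0, 19.0]])
n, p = X.shape

i = 1
x_i = X[i, :].reshape(p, 1)  # regressors of observation i as a p-by-1 column vector
row_i = x_i.T                # its transpose is what is stored as row i of X

# With x_i as columns, X'X is the sum of the outer products x_i x_i'.
sum_outer = sum(X[k, :].reshape(p, 1) @ X[k, :].reshape(1, p) for k in range(n))

print(x_i.shape, row_i.shape)
print(np.allclose(sum_outer, X.T @ X))
```

Under this convention the fitted value for observation i is x_i' beta, a scalar, which is why x_i is treated as a column even though it appears in X as a row.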
Alternative Derivations: Geometric Approach
In the illustration of orthogonal projection, it would be helpful to clarify that refers to a column in the data matrix, thus clearly distinguishing it from for the set of regressor values from one observation. Craniator (talk) 06:01, 3 May 2015 (UTC)
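For the geometric picture, the relevant vectors are the columns of the data matrix, each of length n, rather than the per-observation rows. A minimal sketch on synthetic data (two regressors, as in a typical two-dimensional illustration; the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)  # synthetic data, purely illustrative
n, p = 12, 2
X = rng.normal(size=(n, p))  # each COLUMN X[:, j] is one regressor, a vector in R^n
y = rng.normal(size=n)

# Projection onto the column space of X, and the annihilator M = I - P.
P = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - P

y_hat = P @ y  # fitted values: the orthogonal projection of y
resid = M @ y  # residuals: the part of y orthogonal to every column of X

print(X.T @ resid)  # numerically zero, one entry per column
print(np.trace(M))  # equals n - p
```

The first printed vector has one entry per column of X, which is the sense in which the residual is orthogonal to the regressors in the figure.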
Too technical? Should it be rewritten "one level down"?
I realize this article has been rated B-class, but I wonder if it is too technical for someone who does not already understand OLS. OLS is often studied in undergrad stats classes. Therefore, in the spirit of writing "one level down" (see: WP:UPFRONT), this article should ideally include an extended intro that is much more comprehensible to someone with some relatively advanced high school math training (say, through a year of high school calculus, but without linear algebra).
As currently written, almost all of the article is incomprehensible to someone who doesn't understand linear algebra. There are many ways of introducing OLS without linear algebra, so could such non-technical, intuitive approaches be put at the top of this article, leaving the more technical, formal math for the bottom? I hesitate to be so bold in editing, since this is a very important article, but I think it is currently way too technical. Aroundthewayboy (talk) 04:57, 26 July 2015 (UTC)