Talk:Polynomial regression

From Wikipedia, the free encyclopedia
WikiProject Statistics (Rated C-class, High-importance)

This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as C-Class on the quality scale.
This article has been rated as High-importance on the importance scale.
WikiProject Mathematics (Rated C-class, Mid-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating: C-Class, Mid-importance. Field: Probability and statistics


This article is quite badly written as it stands, although not as bad as it was. I'm not sure if it's worth saving or not. Michael Hardy (talk) 05:22, 7 April 2009 (UTC)

I think it's worth having a page on this topic. I've attempted a major rewrite. I plan to add a figure or two in the near future. Comments are welcome. Skbkekas (talk) 15:38, 8 April 2009 (UTC)

This looks considerably better. There may be scope to add something from design of experiments, where I think there are results such as, if the region of interest is known and if the order of the polynomial is known, where best to place the values of the x's to get the best estimates of the regression function. It may be more important to extend the discussion to explicitly include polynomials with multivariate x's. Possibly the book by Draper&Smith may suggest things that ought to be mentioned here. Melcombe (talk) 09:03, 9 April 2009 (UTC)

Fine albeit unconstructive comments!

Polynomial regression has its place in history even if more efficient methods exist. Therefore an article on polynomial regression should not be overshadowed by other topics which should merely be linked to and exist separately in their own right.

I am sorry to say that the article in its current state does not appear to explain what polynomial regression is or why it is useful (follow up the Excel commentary). The nomenclature corrections are appreciated, but I cannot fathom why the derivation has been removed. This would have been of interest to any budding mathematician or software programmer. A more useful revision would have been an explanation of why setting the derivatives to zero yields a minimum (as opposed to a maximum or a point of inflexion), or the addition of some credible references (I wrote the original article from memory of my university days). Finally, I would like to defend the use of practical analogies and simple English. I think mathematical articles nowadays are way too abstract and risk alienating future generations. Gouranga Gupta 23:42, 15 April 2009 (UTC) —Preceding unsigned comment added by (talk)

Something more to consider:

According to the Oxford Dictionary of Mathematics (Clapham & Nicholson), Multiple Regression is regression based on multiple independent variables, e.g. f(x, y) = 2 * x + 5 * y, as opposed to a single independent variable, e.g. f(x) = 2 * x + 5. Both of these examples are LINEAR. The aforementioned reference would refer to the former equation as an example of "Multiple Linear Regression".

I think Michael Hardy's early argument that polynomial regression is a form of linear regression is questionable. My academic background is Chemical Engineering and not Mathematics (so I’m open to correction), but I believe that it is by clever substitution that one can mould linear regression into a logarithmic type regression etc. This should not detract from the fact that the word linear means that the highest "Degree" of the independent variable(s) is one. Thus, I concede that my original use of the word "Order" is incorrect. "Order" according to Clapham & Nicholson is generally applied to differential equations, matrices or roots. "A polynomial of degree n" or "nth degree polynomial" would be better usage. Gouranga Gupta 23:14, 22 April 2009 (UTC). —Preceding unsigned comment added by (talk)

I am absolutely certain that the meaning of the term "linear" in "linear regression" or "linear model" refers to linearity in the unknown parameters (the regression coefficients), not to linearity in the independent variable or variables. This is stated unequivocally in the first chapter of several standard regression texts on my desk (Stapleton, Monahan, Abraham). This is simply a definition and you may view it as being arbitrary, but it is unquestionably the definition that is in universal use. The rationale for the definition is that while it may be extremely important for the interpretation of a fitted linear model whether the independent variables are transformed (e.g. by taking powers of them), it has nothing to do with how a linear model is fit, or how inferences are performed. On the other hand, a model that is non-linear in the parameters requires a completely different set of statistical techniques for fitting and inference. Skbkekas (talk) 03:32, 23 April 2009 (UTC)
Conceded! (I've also just noticed that Excel uses the phrase "polynomial order", not "degree".) I think confusion by non-specialists is inevitable, although the "Linear regression" article deals with this. What does one type if searching specifically for practical information on the common types of least squares regression curve fitting, e.g. a straight line, a general polynomial, exponential, logarithmic or perhaps even Fourier? Originally I was looking specifically for the polynomial derivation and maybe some pseudocode that would yield the regression coefficients. It didn't seem to exist, so I sought to add it. Are such specific articles beyond the intended scope of Wikipedia? If not, then perhaps "Straight line regression", "Polynomial regression", "Multiple regression" etc. should exist as separate articles that can focus on their pragmatic implementation, and the "Linear regression" article can continue to disambiguate semantics. Gouranga Gupta 16:49, 26 April 2009 (UTC). —Preceding unsigned comment added by (talk)

>I think Michael Hardy's early argument that polynomial regression is a form of linear regression is questionable.

No, it isn't, although this point often confuses novices. A mathematical formula is linear or nonlinear in the unknowns. A regression equation Y = beta0 + beta1 * X1 + beta2 * X2 +... has the parameters (the betas) as its unknowns. The form of the relationship between the Y and the Xs is irrelevant. Blaise (talk) 12:54, 31 March 2013 (UTC)
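To make the "linear in the unknowns" point concrete, here is a minimal Python sketch. The function name `predict` and the numeric values are illustrative assumptions, not anything taken from the article. It checks that a quadratic model's prediction adds up when the coefficients are added (linearity in the parameters), while the same model does not add up in x.

```python
def predict(coeffs, x):
    """Evaluate a0 + a1*x + a2*x**2 + ... for a single x (hypothetical helper)."""
    return sum(a * x**k for k, a in enumerate(coeffs))

# Two arbitrary coefficient vectors [a0, a1, a2], chosen only for illustration.
a = [1.0, 2.0, 3.0]
b = [0.5, -1.0, 4.0]
x = 2.0

# Linear in the coefficients: predicting with a + b equals the sum of predictions.
combined = [ai + bi for ai, bi in zip(a, b)]
linear_in_params = abs(predict(combined, x) - (predict(a, x) + predict(b, x))) < 1e-12

# Not linear in x: predicting at 1 + 2 differs from the sum of predictions at 1 and 2.
linear_in_x = abs(predict(a, 1.0 + 2.0) - (predict(a, 1.0) + predict(a, 2.0))) < 1e-12

print(linear_in_params, linear_in_x)
```

The first check holds and the second fails, which is the distinction the replies above draw: the fitting problem only cares about how the model depends on the coefficients.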


Further to my comments above, I’ve just submitted what I consider to be a relatively simple explanation of why the surface area of a sphere is what it is in the Sphere topic. It may yet be invalidated, but I think it would serve the community well if the common types of regression we take for granted in software could be explained simply and usefully, for posterity if nothing else. Gouranga Gupta (talk) 15:51, 27 April 2009 (UTC)

Solving the polynomial regression

This section needs more information because right now I haven't the faintest clue how it works. The article says to "set epsilon to 0 and solve the system of linear equations", but it's extremely easy to come up with a system of linear equations with no solution, e.g. let m = 2, n = 3, (x1, y1) = (1, 1), (x2, y2) = (2, 1), (x3, y3) = (2, 2) and there is obviously no straight line connecting these three points. Nevertheless, calculating

[formula missing]

yields a0 = 0.5, a1 = 0.5, which, while it does not solve the simultaneous equations, describes a straight line passing through (x1, y1) and exactly between (x2, y2) and (x3, y3), i.e. the correct linear model. What is this, magic? (talk) 13:05, 5 January 2011 (UTC)

Agreed, I have made some edits to that section that hopefully make it more clear. Skbkekas (talk) 22:53, 6 January 2011 (UTC)
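For readers following this thread: the formula referred to in the question did not survive on this page. Assuming it was the ordinary least-squares estimate, the following Python sketch reproduces the numbers quoted above. The function name `fit_polynomial` is hypothetical, and solving the normal equations (X^T X) a = X^T y by Gaussian elimination is just one simple way to do it, not necessarily the method the article describes. The overdetermined system has no exact solution, so least squares picks the coefficients that minimise the sum of squared residuals instead.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [a0, a1, ...] via the normal equations (sketch)."""
    p = degree + 1
    # Design matrix: row i is [1, x_i, x_i**2, ...].
    X = [[x**k for k in range(p)] for x in xs]
    # Normal equations: A = X^T X, b = X^T y.
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    b = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    a = [0.0] * p
    for i in reversed(range(p)):
        s = sum(A[i][j] * a[j] for j in range(i + 1, p))
        a[i] = (b[i] - s) / A[i][i]
    return a

# The three points from the question above.
xs = [1.0, 2.0, 2.0]
ys = [1.0, 1.0, 2.0]
a0, a1 = fit_polynomial(xs, ys, degree=1)
residuals = [y - (a0 + a1 * x) for x, y in zip(xs, ys)]
print(a0, a1, residuals)
```

With these points the sketch gives a0 and a1 of about 0.5 each, with residuals of about 0, -0.5 and 0.5: the line passes through the first point and midway between the other two. It is not an exact solution of the three equations, only the one with the smallest total squared error.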

Finding slope of polynomial regression

As of now, the entry states that "...when the temperature is increased from x to x + 1 units, the expected yield changes by a1 + a2 + 2a2x."

Shouldn't it be a1 + 2a2x? The derivative of the quadratic a0 + a1x + a2x^2 with respect to x is a1 + 2a2x. Or am I missing something here? — Preceding unsigned comment added by Joshtk76 (talk • contribs) 23:47, 25 March 2012 (UTC)

(Mathematically) agree - I just corrected the article. Billy Pilgrim (talk) 12:01, 29 January 2013 (UTC)
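A small numerical check may help here, since the two expressions in this thread answer slightly different questions. This is a sketch with arbitrary illustrative coefficients, and the function name `quadratic` is hypothetical. For y = a0 + a1x + a2x^2, the change over a full unit step from x to x + 1 is a1 + a2 + 2a2x, whereas the derivative (the instantaneous rate of change at x) is a1 + 2a2x.

```python
def quadratic(a0, a1, a2, x):
    """Evaluate a0 + a1*x + a2*x**2 (hypothetical helper)."""
    return a0 + a1 * x + a2 * x**2

# Arbitrary illustrative values, not taken from the article.
a0, a1, a2 = 1.0, 2.0, 3.0
x = 4.0

# Actual change in the fitted value when x increases by one unit.
unit_change = quadratic(a0, a1, a2, x + 1) - quadratic(a0, a1, a2, x)

# The two candidate expressions discussed above.
difference_formula = a1 + a2 + 2 * a2 * x   # exact change from x to x + 1
derivative_formula = a1 + 2 * a2 * x        # slope (derivative) at x

print(unit_change, difference_formula, derivative_formula)
```

With these values the unit change and the difference formula agree, and both differ from the derivative by a2. Which expression belongs in the article depends on whether the sentence is about a one-unit increase or about the slope at x.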


Aren't interactions a form of multivariate polynomial regression? (talk) 06:02, 22 February 2015 (UTC)
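One way to look at this question, offered as a sketch rather than as a statement of what the article says: a full second-degree polynomial in two predictors contains the product term x1 x2, which is the usual two-way interaction term. The symbols below are illustrative and not taken from the article.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2
    + \beta_{11} x_1^2 + \beta_{22} x_2^2
    + \beta_{12} x_1 x_2 + \varepsilon
```

On this reading, a model with main effects and an interaction but no squared terms is the special case with \beta_{11} = \beta_{22} = 0, and it is still linear in the coefficients.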