Talk:Nonlinear regression
WikiProject Statistics (Start-class, Top-importance)
This article is substantially duplicated by a piece in an external publication. Since the external publication copied Wikipedia rather than the reverse, please do not flag this article as a copyright violation of the following source:
Questions
I'm moving some questions from the article page to the talk page.
- Is there a rigorous theory of when analytic solutions are possible?
- Are there models that have no analytic solutions, but have iterative solutions that are guaranteed to converge to the global optimum?
These can be addressed in the article text at some point. Wile E. Heresiarch 15:15, 13 Oct 2004 (UTC)
The clever graphic showing polynomial regression is inappropriate, since polynomial regression is a special case of linear regression, not nonlinear regression. A model is linear if the fitted value is a linear function of the unknowns. In this case the Xs and Y are the knowns and the betas are the unknowns, so having powers of X among the predictors makes no difference: the model is still linear in the betas. Blaise 10:19, 26 March 2006 (UTC)
- Good lord, what an unfortunate term then. It really shouldn't be called "linear" if it isn't simply meant to imply a linear relationship between x and y. If I hadn't chanced upon these Wikipedia articles, I'd continue to have no idea about these peculiar term usages, which I had thought I knew for 25 years. 207.189.230.42 (talk) 08:05, 25 May 2009 (UTC)
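A small illustration of the point above for readers who land here: a polynomial model can be fitted by ordinary linear least squares because it is linear in the coefficients, even though it is nonlinear in x. This is only a minimal sketch, assuming NumPy is available; the data and coefficient values are invented for the example.

import numpy as np

# Quadratic "polynomial regression": y = b0 + b1*x + b2*x^2 + noise.
# The model is nonlinear in x but linear in the unknown coefficients,
# so ordinary linear least squares suffices; no iterative optimiser is needed.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.2, size=x.size)

# Design matrix with columns 1, x, x^2 (the "knowns").
X = np.column_stack([np.ones_like(x), x, x**2])

# Solve the linear least-squares problem directly.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # approximately [1.0, 2.0, -0.5]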
Some problems with the material added on Monte Carlo
The use of Monte Carlo deserves consideration. However, it seems to me that the material just added has several problems; in essence, it jumps the gun on full development of the article.
1. There is not yet, and needs to be, a section on inference for nonlinear regression. Statistical inference is what distinguishes nonlinear regression from curve fitting. Most procedures from linear regression have analogues in nonlinear regression. The mention of Monte Carlo would belong in that section.
2. I think you are talking about the parametric bootstrap. Why not mention the nonparametric bootstrap (ordinarily, resampling)?
3. You have put in an unusual procedure when as yet there is no mention of the standard errors available in standard nonlinear regression software.
4. The title is not correct. The material suggested is about evaluating error, not about parameter estimation. I don't think Monte Carlo is used for parameter estimation.
5. The use of Monte Carlo simulation as described could be considered for many statistical models; the material is not specific to nonlinear regression.
6. Use of Monte Carlo to evaluate sampling error is an important procedure in practice. I would say we do need to review other articles to see that this is covered. For example, since such materials may be used by government statisticians, it is a service to the public to provide some material.
If these points are not addressed, I will most likely take a shot at them at some point. Dfarrar 21:34, 20 March 2007 (UTC)
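For anyone developing the inference section mentioned above, here is a minimal sketch of the parametric bootstrap idea from point 2. It is my own illustration, assuming NumPy and SciPy are available; the exponential model, the parameter values and the noise level are all invented for the example.

import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    # Hypothetical nonlinear model: exponential decay.
    return a * np.exp(-b * x)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 4.0, 40)
y = model(x, 2.5, 0.8) + rng.normal(scale=0.1, size=x.size)

# Fit the model once to the observed data.
popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0])
resid = y - model(x, *popt)
sigma = np.std(resid, ddof=2)  # residual standard deviation (2 parameters)

# Parametric bootstrap: simulate new data sets from the fitted model,
# refit each one, and use the spread of the refitted parameters as a
# Monte Carlo estimate of the sampling error.
boot = []
for _ in range(1000):
    y_sim = model(x, *popt) + rng.normal(scale=sigma, size=x.size)
    p_sim, _ = curve_fit(model, x, y_sim, p0=popt)
    boot.append(p_sim)
boot = np.array(boot)
print("bootstrap standard errors:", boot.std(axis=0))
print("linearized standard errors:", np.sqrt(np.diag(pcov)))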
Major revision
This article has been subject to a major revision which brings it into line with regression analysis and linear regression. The section on Monte Carlo has been removed, as it is wholly inappropriate; it has been replaced by a section on parameter statistics. Petergans (talk) 16:48, 23 February 2008 (UTC)
Linear Transformation vs. Linearization
The proper term for moving a function to a domain where it is linear is a linear transformation. Linearization almost always refers to approximating a function as linear for some bounded range. This is presented in the note:
"Linearization" as used here is not to be confused with the local linearization involved in standard algorithms such as the Gauss-Newton algorithm. Similarly, the methodology of generalized linear models does not involve linearization for parameter estimation.
I'm removing the note as it is no longer needed after the correction. I agree the note was very relevant and important when the term linearization was used.
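To make the distinction concrete for later readers (my own illustration, not taken from the article): the model y = a exp(b x) can be moved to a linear domain by taking logarithms, giving log(y) = log(a) + b x, which is the transformation discussed in this thread. The local linearization used by the Gauss-Newton algorithm is different in kind: it approximates f(x, b) ≈ f(x, b0) + J (b − b0), where J is the Jacobian of the model with respect to the parameters evaluated at the current estimate b0, and the approximation is refined iteratively until convergence.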
Linear regression via linear transform vs. non-linear regression
Someone had suggested that shifting a problem into a linear domain is unnecessary and not recommended. I would ask the author of that section to provide some basis for his assertion beyond referring to the linear transformation section, which indicates that it is fair as long as proper consideration is given to errors. Certain problems, where datasets are very large or time intervals are very short, such as in a feedback control system, can only be practically solved by linear regression. Proper weighting of data points can compensate for the transform and yield theoretically optimal results. —Preceding unsigned comment added by 198.123.51.205 (talk) 22:58, 13 June 2008 (UTC)
- It is not necessary because there are so many nonlinear programs available, including Solver in Excel. Once the system has been set up, the nonlinear refinement is just as easy as the linear one. The size of the dataset is immaterial. The time required is hardly an issue with modern computers.
- It is not recommended because transforming the weights is subject to error, as a linear transformation has to be assumed. The Lineweaver-Burk example cited later shows how dangerous the transformation can be. That example is not just of academic interest: enzyme kinetics are used in hospital path labs on samples from real patients. Petergans (talk) 07:49, 16 June 2008 (UTC)
- I disagree that modern computers are fast enough, or have large enough memories, for the use of linear transformations to be deprecated. That assumes the user has no time limit for computation, or is working on a data set that can fit in the memory of a computer. When working with very large data sets, as is often done in physics experiments and simulations, data can exceed several terabytes. Also keep in mind that the Fourier and Laplace transforms are also linear transformations. In engineering, applying these linear transforms before curve fitting is the rule rather than the exception. —Preceding unsigned comment added by 63.201.67.93 (talk) 09:50, 3 August 2008 (UTC)
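To make the weighting point in this thread concrete, here is a minimal sketch (my own illustration, assuming NumPy and SciPy) comparing a direct nonlinear fit of the Michaelis-Menten model with a Lineweaver-Burk fit on the transformed data, with and without weights. The parameter values and noise level are invented for the example.

import numpy as np
from scipy.optimize import curve_fit

def mm(s, vmax, km):
    # Michaelis-Menten rate law: v = Vmax * s / (Km + s).
    return vmax * s / (km + s)

rng = np.random.default_rng(2)
s = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
v = mm(s, 10.0, 3.0) + rng.normal(scale=0.3, size=s.size)

# Direct nonlinear least-squares fit.
popt, _ = curve_fit(mm, s, v, p0=[5.0, 1.0])

# Lineweaver-Burk: 1/v = (Km/Vmax)*(1/s) + 1/Vmax, a straight line in 1/s.
# An unweighted fit over-weights the smallest rates; weighting by v**4
# (the usual propagation-of-error correction when the error in v is roughly
# constant) largely compensates for the transformation.
X = np.column_stack([1.0 / s, np.ones_like(s)])
yt = 1.0 / v
slope, intercept = np.linalg.lstsq(X, yt, rcond=None)[0]  # unweighted
w = v**4
W = np.sqrt(w)[:, None]
slope_w, intercept_w = np.linalg.lstsq(W * X, np.sqrt(w) * yt, rcond=None)[0]  # weighted

print("nonlinear fit:      Vmax, Km =", popt)
print("unweighted L-B fit: Vmax, Km =", 1 / intercept, slope / intercept)
print("weighted L-B fit:   Vmax, Km =", 1 / intercept_w, slope_w / intercept_w)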
Is this a known problem in multiple non-linear regression?
I'm referring to linearly adding non-linear effects. For example:
f(x1,x2) = k + a log(x1) + b log(x2)
Now, suppose the explanatory variables x1 and x2 are such that they have identical effects, all else being equal. Suppose also that the quantities are such that they can be added (e.g., concentrations of a greenhouse gas). Then you have a = b, so:
f(x1,x2) = k + a log(x1 + x2)
But then:
log(x1 + x2) = log(x1) + log(x2)
Which is absurd. So it seems to me that non-linear effects can't really be added linearly. I was just wondering if this is a known problem of multiple non-linear regression analysis. Joseph449008 (talk) 14:18, 31 December 2009 (UTC)
- Your algebra is wrong. If a = b, then
- a log(x1) + b log(x2) = a( log(x1) + log(x2) ).
- There is nothing in that that in any way implies that this is equal to
- a log(x1 + x2).
- How you reached that conclusion you haven't said. Michael Hardy (talk) 21:22, 31 December 2009 (UTC)
- I did not manage to explain myself then. It's a thought experiment. Suppose x1 and x2 are variables that have identical effects. Specifically, suppose they are greenhouse gas concentrations. To make it more obvious, suppose x1 is industrially produced CO2 and x2 is naturally produced CO2. Clearly, you can add x1+x2 and you get 'all CO2'. So the effect of the two combined can be expressed as k + a log (x1 + x2). Does that make sense? Joseph449008 (talk) 23:44, 31 December 2009 (UTC)
- Or to explain it a different way, if x1 and x2 are interchangeable, and we accept that the non-linear effects can be added linearly, then the following expression should be valid, but isn't:
- k + a log(100) + b log(100) = k + a log(1) + b log(199)
- Joseph449008 (talk) 00:19, 1 January 2010 (UTC)
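A quick numerical check of the algebra in this thread (my own sketch, with arbitrary numbers), showing that adding the two log terms corresponds to the log of the product, not the log of the sum:

import math

k, a = 0.0, 1.0
x1, x2 = 100.0, 100.0

additive = k + a * math.log(x1) + a * math.log(x2)  # equals k + a*log(x1*x2)
combined = k + a * math.log(x1 + x2)                # model if x1 and x2 pool

print(additive)                   # about 9.21 (= log 10000)
print(k + a * math.log(x1 * x2))  # same value
print(combined)                   # about 5.30 (= log 200), clearly different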
Michaelis–Menten model is a terrible example
Despite being a biochemist, I have a problem with the Michaelis–Menten model being used as an example. The Michaelis–Menten equation is a specific case of a rectangular hyperbola and has its own jargon, which is quite different from standard mathematical jargon: this article is about a mathematical topic, not enzyme kinetics. Secondly, the example image shows a perfect fit, which makes it a very poor example of regression. --Squidonius (talk) 00:31, 2 October 2011 (UTC)
- Also, that looks a lot like polynomial linear regression. In no way can anyone guess it is non-linear regression from a simple viewing of the image. NK (talk) 16:39, 18 July 2018 (UTC)
Equation parsing error
There is a problem with the last equation in the "Regression statistics" section: my web browser cannot convert it to PNG. I tried to fix it myself, but I failed. — Preceding unsigned comment added by Gustafullman (talk • contribs) 10:49, 7 October 2011 (UTC)
- Thanks for reporting that. I've fixed it by adding some more braces {} to the LaTeX source for the equation. Appears this was due to a bug associated with the recent upgrade to the MediaWiki software. Qwfp (talk) 14:07, 7 October 2011 (UTC)
Grammar in opening paragraph
"... observational data are modeled by a function which [sic.] is a nonlinear combination of the model parameters and depends on one or more independent variables." What this actually means, in plain English, is that the model of the observational data is a function of its own parameters and depends on some variables; i.e. the phrase is meaningless. If we cannot write an introductory sentence that means anything then it would be less confusing if we were to remove it altogether. At the moment, all it is actually saying is: "In statistics, nonlinear regression is a form of regression analysis in which observational data are modeled by a function"; maybe we ought to leave it at that and forget about the ensuing word salad? (by 90.217.127.222)
- You're right that the sentence is very weak. However, your two "plain English" translations have omitted the crucial word from the original: "nonlinear". People often have very strong opinions on article intros. So shall we draft a better sentence here, on the talk page, before changing the article?
- Also, please sign your talk page posts using four tildes: ~~~~. Mgnbar (talk) 16:44, 4 May 2014 (UTC)
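As a drafting aid for the sentence discussed above (my own illustration, not article text): the intended meaning is that the fitted value is a nonlinear function of at least one parameter. For example, f(x) = b1 exp(b2 x) is nonlinear in b2, so fitting it is nonlinear regression, whereas f(x) = b0 + b1 x + b2 x^2 is linear in b0, b1 and b2 (although nonlinear in x), so fitting it is linear regression.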
2017 December
Please forgive me, but I do not like this at all. Do you think somebody reading this is really going to get it, say "ah ha, got it", and go solve his problem with C#, Visual Basic or Excel? I really do not think so. And what's with that "Regression statistics" section? It's a first-order Taylor series expansion. Why doesn't it say that? — Preceding unsigned comment added by 64.207.224.40 (talk) 17:15, 14 December 2017 (UTC)
- It's not a great article. Feel free to add the part about Taylor series, although another editor may complain that it is unsourced.
- Part of the problem is that Wikipedia has many articles on closely related topics around regression and least squares. Did you follow the link to Non-linear least squares for more detail?
- Another part of the problem is that Wikipedia:Wikipedia is not a textbook. Yet another part, I think, is that you are underestimating the difficulty of writing your own non-linear optimizer in Excel based on the kind of rudimentary introduction that a Wikipedia article provides. Regards. Mgnbar (talk) 19:10, 14 December 2017 (UTC)
- I added the note on Taylor series. In that section, is "this procedure" referring to a particular numerical approximation technique? AFAIK, this isn't an approximation intrinsic to non-linear regression. Scientific29 (talk) 05:16, 5 January 2018 (UTC)
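For readers of this thread, here is a minimal sketch (my own illustration, assuming NumPy and SciPy; the model and data are invented) of the first-order Taylor approximation behind the "Regression statistics" section: the model is linearized about the fitted parameters, and the parameter covariance is approximated from the Jacobian J as s^2 (J^T J)^(-1).

import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    # Hypothetical nonlinear model for illustration.
    return a * np.exp(-b * x)

rng = np.random.default_rng(3)
x = np.linspace(0.0, 4.0, 30)
y = model(x, 2.0, 0.7) + rng.normal(scale=0.05, size=x.size)

popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0])
a, b = popt

# First-order Taylor expansion about the estimate: f(x, beta) is replaced
# by f(x, popt) + J (beta - popt), where J holds the partial derivatives
# of the model with respect to each parameter, evaluated at popt.
J = np.column_stack([
    np.exp(-b * x),           # df/da
    -a * x * np.exp(-b * x),  # df/db
])
resid = y - model(x, *popt)
s2 = resid @ resid / (x.size - len(popt))  # residual variance
cov_lin = s2 * np.linalg.inv(J.T @ J)

print(np.sqrt(np.diag(cov_lin)))  # linearized standard errors
print(np.sqrt(np.diag(pcov)))     # curve_fit's estimate, essentially the same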