Talk:Logistic regression: Difference between revisions


Revision as of 09:03, 2 February 2010

This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
???	This article has not yet received a rating on Wikipedia's content assessment scale.
???	This article has not yet received a rating on the importance scale.

proposed merge

I think this article should be merged with logit. Pdbailey 03:19, 20 April 2006 (UTC)[reply]

Interesting question: the logit link function is the inverse of the logistic function, which also has its own article (that talks about epidemiology, etc.). However, I think the inverse function is only really used in logistic regression, so the merge does make sense. -- hike395 14:31, 20 April 2006 (UTC)[reply]

NO! This suggestion is totally wrong! Logit, logistic, and logistic regression are different things, and should not be mixed up merely because they are closely related. For example, I was initially interested in the logistic function, then thought logit could be used for other purposes, but not logistic regression. And I don't need to know logistic regression at all to use the logistic function. --Pren 14:43, 24 April 2006 (UTC)[reply]

Pren, thanks for the input. Can you please expand on why you think the two should not be merged. I'm specifically interested in what purpose you had for using the logit function that was not associated with logistic regression. Thanks a lot for including your input. Pdbailey 16:45, 24 April 2006 (UTC)[reply]

I agree with Pren. The logit function is just a function, easily described by a formula. Logistic regression is a mathematical procedure in applied mathematics, that makes use of the logit function. Of course both articles should link to each other, but they are different things, objects of a different category and complexity. The logit function is interesting in itself. --zeycus 11:25, 25 April 2006 (UTC)[reply]

Pren and Zeycus, have you read the logit entry and the Wikipedia page on merges? It looks to me like these pages meet the second or third criteria for merging, which are

'There are two or more pages on related subjects that have a large overlap. Wikipedia is not a dictionary; there doesn't need to be a separate entry for every concept in the universe. For example, "Flammable" and "Non-flammable" can both be explained in an article on Flammability.
'If a page is very short and cannot or should not be expanded terribly much, it often makes sense to merge it with a page on a broader topic.'

Certainly the portion of the logit page that is on logistic regression can be merged with this page and then removed from that page. Once that is done, the logit page is one paragraph long and is either a stub or should be deleted. This raises the question, are we then making the wikipedia a dictionary by including it? If there is some real substantive material about logit that does not fold well into other areas, probably not -- it should probably stay. But I don't think the logit is as important a function as, say, the gamma function. I take as evidence of its unimportance that it does not appear in "Handbook of Mathematical Functions." Just saying that it is a function distinct from the regression that it is often used for does not seem sufficient. Pdbailey 14:42, 25 April 2006 (UTC)[reply]

Thank you, Pdbailey, I read what you suggested and I see your point. So now the question seems subjective to me, and I don't dare to defend either of the options. --zeycus 16:41, 26 April 2006 (UTC)[reply]

I tend towards inclusionism -- I recommend that we delete the overlapping part of logit, but then leave the rest alone: it may expand in the future to include history or other applications, who knows? -- hike395 17:31, 26 April 2006 (UTC)[reply]

Hike395, -isms aside, can you identify what is included in logit that should not be included in this page? I'm not sure I see anything. Pdbailey 04:59, 27 April 2006 (UTC)[reply]

How about

In mathematics, especially as applied in statistics, the logit (pronounced with a long "o" and a soft "g", IPA /loʊdʒɪt/) of a number p between 0 and 1 is

{\rm {logit}}(p)=\log \left({\frac {p}{1-p}}\right)=\log(p)-\log(1-p).

Plot of logit in the range 0 to 1, base is e

The logit function is the inverse of the "sigmoid", or "logistic" function. If p is a probability then p/(1 − p) is the corresponding odds, and the logit of the probability is the logarithm of the odds; similarly the difference between the logits of two probabilities is the logarithm of the odds-ratio, thus providing an additive mechanism for combining odds-ratios.

-- hike395 05:40, 27 April 2006 (UTC)[reply]
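The relationships stated in the proposed text can be checked numerically. The following is a minimal sketch, not part of the original proposal; the function names `logit` and `logistic` and the probabilities 0.8 and 0.5 are chosen here purely for illustration.

```python
import math

def logit(p):
    """Log-odds of a probability p, with 0 < p < 1 (natural logarithm)."""
    return math.log(p / (1.0 - p))

def logistic(z):
    """Inverse of logit: maps any real z back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Two illustrative probabilities (assumed values, not from the discussion).
p1, p2 = 0.8, 0.5
odds1 = p1 / (1.0 - p1)
odds2 = p2 / (1.0 - p2)
log_odds_ratio = math.log(odds1 / odds2)

# The difference between two logits is the logarithm of the odds ratio.
assert math.isclose(logit(p1) - logit(p2), log_odds_ratio)

# The logistic function undoes the logit.
assert math.isclose(logistic(logit(p1)), p1)
```

Because logits add where odds ratios multiply, effects expressed on the logit scale can be combined by simple addition.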

Okay, I can see why (given your Wikipedia philosophy) you want to keep that article, and I think it's just subjective at this point. I'll just say now that, short of another voice, we can use your proposed text. That said, let me tell you why I disagree that there is value in having that article separate. I think that it might make new users think that it is all Wikipedia has to say on logistic regression. After all, search for logit on Google and you get that page. If you still disagree, again, I'll concede the point and I'll update both articles after a few days for others to throw in their two cents. Pdbailey 14:20, 27 April 2006 (UTC)[reply]

Yes, I can see your point, it's valid. How about we append

The logit function is an important part of logistic regression: for more information, please see that article.

Would that take care of your objection? -- hike395 03:33, 28 April 2006 (UTC)[reply]

I am reading 'Gatrell, A.C. (2002) Geographies of Health: an Introduction, Oxford: Blackwell.' today and it discusses the 'logistic regression model' in a health geography context of cases and controls. It appears to be mentioned in a lot of academic literature, so why is Wikipedia trying to call it something else? Supposed 19:22, 9 May 2006 (UTC)[reply]

This article would be more useful if an example could be given of how logistic regression is used in statistical analysis. For example, it would be great if someone could use actual data to describe how logistic regression makes X concept more clear. Zminer 01:58, 15 May 2006 (UTC)[reply]

I know it is a long time since this issue was discussed, but if I may, can I say that I am very happy that this page was not merged with the logistic function. I specifically searched for 'logit' before I found out that it was the inverse of the logistic function, as I needed a basic knowledge for my PhD viva. I'm pleased to say that I passed and can attribute some of my useful revision to WP. I have since contributed to the page myself. This is a nice example of what Wikipedia is about - accessible, useful knowledge that we can all build upon. Thanks guys. Davwillev 15:54, 25 July 2007 (UTC)[reply]

A happy middle ground might be to include some more background material on logistic regression. What about why it is used, when it is used? It would not help me at all to have it merged with another (to me) obscure statistical term, rather it would help me to develop the page so I can understand it! SallaCT 12:47, 18 August 2007 (UTC)[reply]

Actually, this is a sad middle ground because the text you are interested in isn't present. The fact that two highly related terms aren't merged undoubtedly contributed to that. Pdbailey 22:24, 19 August 2007 (UTC)[reply]

decision

no merge was performed, it's been quite a while since this was open. Pdbailey 03:56, 21 August 2007 (UTC)[reply]

Mistake?

What does i, = 1, ..., n mean? What does the comma after i stand for? -- Neoforma 12:54, 13 July 2006 (UTC)[reply]

Was just about to answer the wrong question. Yup, that's a mistake. — cBuckley (Talk • Contribs) 17:42, 13 July 2006 (UTC)[reply]

binomial distributed errors?

Since when does the logit model have binomial distributed errors? This must be standard (with mean 0 and s=1) logistic distributed.

I improved the wording of this part to be more accurate. Have a look. Baccyak4H (talk) 17:42, 22 November 2006 (UTC)[reply]

Along a similar line, the article read that the dependent variable was Bernoulli distributed. I updated this to binomially distributed because the binomial is a generalization of the Bernoulli to more than one trial. Perhaps this further clarifies things. Pdbailey 01:39, 5 March 2007 (UTC)[reply]

I agree in principle that binomial is correct and more general than Bernoulli, so in some sense preferred. However, the article refers to the Y_i equalling 1, which means the context here treats any binomial Y as several Bernoulli Y. So I would leave the description as Bernoulli, unless the math notation were rewritten to reflect binomial. And come to think of it, how would one even do that? Baccyak4H (Yak!) 17:54, 11 June 2007 (UTC)[reply]

Baccyak4H, (1) I think we should be as general as possible; for now, let's just note that the example is worked for a specific case of the binomial distribution. (2) If you don't know how, I'd suggest that you read Generalized Linear Models by McCullagh and Nelder (1989), table 2.1 on page 30. This shows how the binomial fits in the exponential family form, which allows for the fitting technique used in the book. BTW, I don't think I just did an RV on this page, but if I did, I'm sorry and you can undo it pending the conclusion of this conversation. Pdbailey 02:23, 12 June 2007 (UTC)[reply]

You did mention elsewhere the page could use a rewrite :). Without even looking at M&N, just thinking about GLMs made me realize one could reformulate in terms of expectations of the binomial, as in E(Y_i)/n_i, rather than probabilities. The article still needs work though.

I would note that the binomial/Bernoulli debate has gone back and forth in the article history. In a perfect world, it would stay binomial, but in that world the text would be consistent with binomial too. Let's see what we can do. Baccyak4H (Yak!) 13:14, 12 June 2007 (UTC)[reply]

rewrite tag

This article is a series of barely strung together thoughts, most of which are half there. Most of them probably should be, or already are, done better in another page. As an example, the concept of a link function and its interpretation is covered much better in Generalized_linear_model. The applications section comes second and doesn't ever explain what "lift" is, but it appears to be the effect of the link function. Why is it surprising that a link has an effect? Why include this? Why have more than one link to GLM? I could go on. Pdbailey 18:29, 29 May 2007 (UTC)[reply]

We both have put some good work in, and it does read better. The big thing in my eyes is that the example is not of logistic regression but rather just of a calculation of odds. It could use a better example. Baccyak4H (Yak!) 17:42, 14 June 2007 (UTC)[reply]

Remove Jarrow Turnbull model?

Is there a reason to include the Jarrow Turnbull Model section in this page? Is there a reason that the logistic regression has to be used for this model and not just a binomial regression in general? Would anyone object to removing this section and moving it to a "see also". Pdbailey 15:29, 13 June 2007 (UTC)[reply]

I am not familiar with that model in general. Its article reads even poorer than this one does, so it is little help for me. If it is really sometimes done in ways which are not strictly logistic regression (e.g., other links), but are all analysis of binomial (Bernoulli) defaults, then go ahead and move it. I am going to have a look at the overview section... Baccyak4H (Yak!) 15:46, 13 June 2007 (UTC)[reply]

Move it where? Pdbailey 16:54, 13 June 2007 (UTC)[reply]

Sorry, meant remove to "see also", which you did. It's starting to look a lot better... Baccyak4H (Yak!) 17:04, 13 June 2007 (UTC)[reply]

While it's mentioned that the beta-coefficients can be obtained via maximum likelihood estimation (ostensibly by taking the log-likelihood function and then taking derivatives with respect to the coefficients), how about actually writing up a simple example for obtaining the coefficients and values for p? Fully-solved examples are remarkably helpful to us neophytes.

sympathy for the novice?

This page is utterly incomprehensible for the novice who just wants a basic idea of what logistic regression analysis *does*. The rigorous math is fine but before diving into it it would be nice to give a more comprehensible introduction and maybe a real world example that might illuminate the topic a bit.

The point above is extremely relevant. Most people do not have a firm understanding of Applied Mathematics or Statistics in general. Quite a surprise that none of the contributing authors has ventured into making their knowledge understandable for the lay person. The ability to teach or communicate concepts to others is a distinction between an expert and an apprentice. Johnbushiii 18:41, 20 August 2007 (UTC)[reply]

I think they must be long gone, and this page, like so many of the GLM related pages, has almost no editors. Any change requires a huge amount of thought to get things going in the right direction and to hang together. It might be just beyond Wikipedia to support these articles. Pdbailey 03:36, 21 August 2007 (UTC)[reply]

I agree, although I'd point out that it is hard to find third party references to it (second party is easy, journals and the like, but that's different). Perhaps Ed Tufte's proposal of rethinking the O-ring data leading up to the Challenger disaster might be a good start. Let me look it up. Baccyak4H (Yak!) 18:02, 13 September 2007 (UTC)[reply]

Hmm, no, that example was not logistic regression (although it could be if I did some OR). I hope to take another look or two to improve the article. Baccyak4H (Yak!) 18:18, 13 September 2007 (UTC)[reply]

Wikipedia's statistics articles are usually excellent. This is the poorest one I've seen. It sounds like it was written by a student who had just learned the concept formally, and didn't really understand it yet. 131.107.0.73 23:03, 15 November 2007 (UTC)[reply]

Generally I've found that statistics articles don't say very much (although a few of them do) and are consequently incomprehensible, in contrast to math articles generally, which explicitly define the concepts they're about and consequently are comprehensible (except when they're on a topic in some area of math in which you don't know the basic definitions). Michael Hardy 23:21, 15 November 2007 (UTC)[reply]

This is my suggestion for a re-written introduction: 1. Regression models are a group of statistical methods to describe the relationship between multiple risk factors and an outcome. 2. Logistic regression is a type of regression model that is used when the outcome is binary or dichotomous (that is, the outcome can only take one of two possible values, like lived/died or failed/succeeded).

This clearly explains what logistic regression is commonly used for, and tells the reader briefly when it is used. The current introduction simply does not provide enough context for the lay reader. We could also add a section at the end for links to chapters describing other regression models like linear regression. --Gak (talk) 02:02, 16 December 2007 (UTC)[reply]

Logistic regression for the layman

Here follows my proposed explanation for the layman. I will post this on 6 Feb if there are no revisions or objections.

Figure 1. The logistic function, with z on the horizontal axis and f(z) on the vertical axis.

An explanation of logistic regression begins with an explanation of the logistic function:

f(z)={\frac {1}{1+e^{-z}}}

A graph of the function is shown in figure 1. The "input" is z and the "output" is f(z). The logistic function is useful because it can take as an input any value from negative infinity to positive infinity, whereas the output is confined to values between 0 and 1. The variable z represents the exposure to some set of risk factors, while f(z) represents the probability of a particular outcome, given that set of risk factors. The variable z is a measure of the total contribution of all the risk factors used in the model.

The variable z is usually defined as

z=\beta _{0}+\beta _{1}x_{1}+\beta _{2}x_{2}+\beta _{3}x_{3}+\cdots +\beta _{k}x_{k},

where $\beta _{0}$ is called the "intercept" and $\beta _{1}$ , $\beta _{2}$ , $\beta _{3}$ , and so on, are called the "regression coefficients" of $x_{1}$ , $x_{2}$ , $x_{3}$ respectively. The intercept is the value of z when the value of all the other risk factors is zero (i.e., the value of z in someone with no risk factors). Each of the regression coefficients describes the size of the contribution of that risk factor. A positive regression coefficient means that that risk factor increases the probability of the outcome, while a negative regression coefficient means that that risk factor decreases the probability of that outcome; a large regression coefficient means that that risk factor strongly influences the probability of that outcome; while a near-zero regression coefficient means that that risk factor has little influence on the probability of that outcome.

Logistic regression is a useful way of describing the relationship between one or more risk factors (e.g., age, sex, etc.) and an outcome such as death (which only takes two possible values: dead or not dead).

The application of a logistic regression may be illustrated using a fictitious example of death from heart disease. This simplified model uses only three risk factors (age, sex and cholesterol) to predict the 10-year risk of death from heart disease. This is the model that we fit:

\beta _{0}=-5.0{\text{ (the intercept)}}

\beta _{1}=+2.0

\beta _{2}=-1.0

\beta _{3}=+1.2

x_{1}={\text{ age in decades, less 5.0}}

x_{2}={\text{ sex, where 0 is male and 1 is female}}

x_{3}={\text{ cholesterol level, in mmol/dl less 5.0}}

Which means the model is

{\text{Risk of death}}={\frac {1}{1+e^{-z}}}{\text{, where }}z=-5.0+2.0x_{1}-1.0x_{2}+1.2x_{3}

In this model, increasing age is associated with an increasing risk of death from heart disease (z goes up by 2.0 for every 10 years over the age of 50), female sex is associated with a decreased risk of death from heart disease (z goes down by 1.0 if the patient is female) and increasing cholesterol is associated with an increasing risk of death (z goes up by 1.2 for each 1 mmol/dl increase in cholesterol above 5.0).

We wish to use this model to predict Mr Petrelli's risk of death from heart disease: he is 50 years old and his cholesterol level is 7.0 mmol/dl. Mr Petrelli's risk of death is therefore ${\frac {1}{1+e^{-z}}}{\text{, where }}z=-5.0+(+2.0)(5.0-5.0)+(-1.0)0+(+1.2)(7.0-5.0)=-2.6.$

Which means that by this model, Mr Petrelli's risk of dying from heart disease in the next 10 years is 0.07 (or 7%). --Gak (talk) 06:49, 1 February 2008 (UTC)[reply]
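The arithmetic in the example can be reproduced in a few lines. This is a minimal sketch: the coefficients are those listed above, while the function name `risk_of_death` and the assumption that age enters as decades over 50 (as the substitution (5.0 − 5.0) in the calculation implies) are choices made here for illustration.

```python
import math

# Coefficients from the fictitious heart-disease example above.
BETA_0 = -5.0     # intercept
BETA_AGE = 2.0    # per decade of age over 50 (assumed centring, see lead-in)
BETA_SEX = -1.0   # 0 = male, 1 = female
BETA_CHOL = 1.2   # per unit of cholesterol above 5.0

def risk_of_death(age_years, is_female, cholesterol):
    """10-year risk of death under the fictitious model (illustrative only)."""
    x1 = age_years / 10.0 - 5.0          # age in decades, less 5.0
    x2 = 1.0 if is_female else 0.0       # sex indicator
    x3 = cholesterol - 5.0               # cholesterol level, less 5.0
    z = BETA_0 + BETA_AGE * x1 + BETA_SEX * x2 + BETA_CHOL * x3
    return 1.0 / (1.0 + math.exp(-z))    # logistic function of z

# Mr Petrelli: 50 years old, male, cholesterol 7.0, so z = -2.6.
petrelli = risk_of_death(age_years=50, is_female=False, cholesterol=7.0)
print(round(petrelli, 2))  # 0.07
```

Changing a single argument shows the direction of each coefficient's effect; for instance, setting `is_female=True` lowers z by 1.0 and therefore lowers the predicted risk.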

Old example section removed because there is already an example given in the new layman's section.

The old example section is reproduced here:

Let p(x) be the probability of success when the value of the predictor variable is x. Then let

p(x)={\frac {1}{1+e^{-(B_{0}+B_{1}x)}}}={\frac {e^{B_{0}+B_{1}x}}{1+e^{B_{0}+B_{1}x}}}.

Algebraic manipulation shows that

{\frac {p(x)}{1-p(x)}}=e^{B_{0}+B_{1}x},

where ${\frac {p(x)}{1-p(x)}}$ is the odds in favor of success. If we take, say p(50) = 2/3, then

{\frac {p(50)}{1-p(50)}}={\frac {\frac {2}{3}}{1-{\frac {2}{3}}}}=2.

So when x = 50, a success is twice as likely as a failure. Or, it can be simply said that the odds are 2 to 1.

--Gak (talk) 01:12, 12 February 2008 (UTC)[reply]
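For readers following the archived example, the "algebraic manipulation" step can be spelled out. This is a sketch that writes η = B_0 + B_1 x as shorthand; the symbol η is introduced here and does not appear in the original section.

1-p(x)=1-{\frac {e^{\eta }}{1+e^{\eta }}}={\frac {1}{1+e^{\eta }}},\qquad {\frac {p(x)}{1-p(x)}}={\frac {e^{\eta }}{1+e^{\eta }}}\cdot \left(1+e^{\eta }\right)=e^{\eta }=e^{B_{0}+B_{1}x}.

Taking logarithms gives \log {\frac {p(x)}{1-p(x)}}=B_{0}+B_{1}x, so with p(50) = 2/3 the odds of 2 correspond to B_{0}+50B_{1}=\log 2\approx 0.693.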

One request: It seems this section was recently taken out. As a student trying to grasp this statistical technique, I found this section to be one of the best lay explanations I had read in statistics. The example offered an intuitive way to help grasp the material. The outline for setting up the model for the variable z and the example that followed was well done. While the formal mathematical definition should always be included, I guarantee that most of the people that visited this page in the past got what they needed in the lay explanation section. It should somehow be included again in the main page with as close to the wording above as possible. Cgall (talk) 17:22, 24 September 2008 (UTC)cgall[reply]

For the layman???

You MUST be joking. I don't think I am a dolt. However I am not a mathematician nor a statistician; I am a professional translator (also a linguist and also a contributor to Wikipedia but in language-related articles and such). I looked up this article today because I NEED to know, in a very basic LAYMAN's sort of way, what logistic regression is, what it is about, and ideally (for my purposes) an intelligible explanation of how it works which provides a model of the language that ought to be used when explaining this to someone. That would help me to get my language right in my translation, where I need to translate just such an explanation, in one short paragraph, that forms part of a 170-page report written for a readership that is not expected to know anything about statistics. While that is just what I need from this article, it is also roughly what I expect to find in such an article and roughly what I believe would be expected and found useful by many other Wikipedia users. This fails totally to provide any of that. It is useless to me. Wikipedia has helped me out time and time again which is why I consider it one of my most valuable tools for work. But it wouldn't be if all articles were like this one. I'm sorry to be so negative, but you really do need to get your act together. --A R King (talk) 17:23, 2 March 2008 (UTC)[reply]

To be a little bit less negative and try to help some of you guys out there come down to earth, I thought it might be useful if I gave you a snippet of the article I'm working on in the English translation I've done (which may be improvable), so you can see how many light-years separate one kind of discourse from another:

Logistic regression analysis is a technique for identifying the variables that best predict a given event or situation according to a model or equation produced by the analysis itself. In the present case, we used this analysis to find the variables that best determine the occurrence or non-occurrence of certain levels of Basque language use among pupils. This kind of analysis has one strict condition: the variable that is to be predicted must be dichotomic, that is, there can only be two possible values, such as A-or-B, yes-or-no, etc....

--A R King (talk) 17:32, 2 March 2008 (UTC)[reply]

Your translation sounds fine apart from one word: "dichotomic" should be "dichotomous". I think your problem with this section could be eased simply by renaming it. "The layman" probably wouldn't want to follow this level of maths, even though there isn't anything there beyond secondary school level. The level of explanation you're after should be present in the lead section of the article. Perhaps this could be improved, but the first sentence of this article is clearer and more informative than the first sentence of your passage (for which I'm blaming its author, not its translator). Qwfp (talk) 19:38, 2 March 2008 (UTC)[reply]

Rewriting for readability

I'm glad to see there's been previous discussion of the readability of this article, and some suggestions. I think there are improvements to make. I like the general approach of explaining what logistic regression is useful for, first.

Also I have problems with the current exposition, which starts off in the first sentence with "logistic regression is a model used for prediction of the probability of occurrence of an event by fitting data to a logistic curve." I think that is not useful, there is no way to explain simply how fitting a logit model is like fitting data to a logistic curve. To speak of curve-fitting as done here would be appropriate if you are literally doing curve-fitting that can be visualized. For example curve-fitting is an exercise of finding parameters of a specific curve that best fits to specified data, in the same way that an ordinary least squares regression is the straight line closest-fitting (by measure of sum of squared deviations) to data that can be plotted. Logistic regression, instead, is a maximum likelihood technique, and there are no obvious curves in plots of data to be fit by a logit model, and it would be very hard to convey how logit regression is curve-fitting.

I have added a small public domain dataset to the FitzPatrick 1932 article and may use that in demonstrating logit regression here and/or in a bankruptcy prediction article which I am developing. As this develops, comments would be welcomed. doncram (talk) 17:56, 5 September 2008 (UTC)[reply]

How to estimate parameters?

There's nothing in this article about how to actually do logistic regression, i.e., estimate the parameters, except one sentence: "The unknown parameters β_j are usually estimated by maximum likelihood". This is a pretty huge omission. Surely we ought to add a section on how to do this. It would probably describe minimizing the cross-entropy function derived from the likelihood function, as in, e.g., section 6.7 of [Christopher M. Bishop, "Neural Networks for Pattern Recognition"]. RVS (talk) 19:44, 26 January 2009 (UTC)[reply]

All generalized linear models are fit in (approximately) the same way. There is one speedup available for logit, but agreed that we should mention that this is where this information is. PDBailey (talk) 00:21, 27 January 2009 (UTC)[reply]

The generalized linear models page only says "The unknown parameters, β, are typically estimated with maximum likelihood, maximum quasi-likelihood, or Bayesian techniques." RVS (talk) 20:00, 30 January 2009 (UTC)[reply]

The method is described in detail in Chapter 2 of McCullagh & Nelder if you would like to add it, I think it would be a great thing to add since GLM are basically similar because of the unified fitting technique. PDBailey (talk) 00:19, 31 January 2009 (UTC)[reply]

It could be mentioned that parameter estimation is relatively easy by Newton-Raphson or any other search method, because the log-likelihood function is globally concave. There's a footnote source on the global concavity that I could dig up. So there is no possibility of getting trapped in a local maximum. I think the Newton-Raphson approach is different from the method described by McCullagh & Nelder, though I'm not sure what that is, and I am curious what the speedup available for logit, as opposed to other GLM models, is, by the way. Probably it is the McCullagh & Nelder approach that is implemented in Splus software, which fails to converge for certain datasets, in my experience. doncram (talk) 03:03, 31 January 2009 (UTC)[reply]

Doncram, the method is equivalent to Newton-Raphson with a method of approximating the Jacobian. It is globally concave only if X is full rank, this tends to be the problem with non-convergent solutions in R (and S plus?). PDBailey (talk) 15:34, 31 January 2009 (UTC)[reply]

I recently found code for a Matlab/Octave function that seems to work well for estimating the parameters of a multiple logistic regression. It uses Newton-Raphson iteration and takes about fifteen lines of code to do the job. It is in the public domain as it was written at one of the National Labs. I also have a write-up of the methods used including the actual maximum likelihood computation. It uses a gradient and the Hessian to compute the successive approximations of the logistic regression weights. The discussions of the logit I have found in the literature mostly do not suggest how to proceed to estimate the parameters, and I think it would be a service to clarify how that actually works. I have the write-up in open office. Do you guys know how to transform that to a Wiki page? Pseudopigraphia (talk) 02:57, 11 May 2009 (UTC)[reply]
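To make the gradient-and-Hessian iteration discussed in this thread concrete, here is a minimal Python sketch of Newton-Raphson for a logistic regression with an intercept and one predictor. It is not the Matlab/Octave function mentioned above, which is not reproduced on this page; the function name `fit_logistic_newton`, the toy data, and the stopping tolerance are all assumptions made for illustration. With only two parameters the 2x2 system can be solved by hand, so no linear algebra library is needed.

```python
import math

def fit_logistic_newton(xs, ys, iterations=25, tol=1e-10):
    """Fit p(x) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson.

    Illustrative sketch: one predictor plus an intercept only.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # negative Hessian, X^T W X
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)    # Bernoulli variance at the current fit
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            # Happens when the design is not full rank or the weights vanish.
            raise ValueError("singular Hessian: check rank or separation")
        # Newton step: solve (X^T W X) d = gradient for the 2x2 case.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Toy data (made up here); the classes overlap, so the MLE is finite.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic_newton(xs, ys)
print(b0, b1)
```

At the maximum the gradient is zero, so the fitted probabilities sum to the number of observed successes. If the two classes were perfectly separated by x, the coefficients would grow without bound and the iteration would not settle, which matches the convergence failures mentioned above.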

As long as it is public domain, you could just copy / write it into a temporary working page as a subpage of this Talk page, say to Talk:Logistic regression/Estimation, and we could try working on it there. I think it might be unusual to include code written in one programming language in a general article about a statistical method, but I for one would be interested in seeing if we could do that well. doncram (talk) 05:07, 11 May 2009 (UTC)[reply]

There are many other wiki articles that have examples in particular programming languages, so the precedent is well established. Python is used frequently as is Matlab/Octave. Both Octave and Python are freely available open source language projects that run on all major operating systems, so I would not be hesitant to publish algorithms in them. I will post to a test page when I figure out the formatting. Pseudopigraphia (talk) 19:51, 11 May 2009 (UTC)[reply]

I notice that as of 2009/6/24, the generalized linear models page now has a section on "Fitting", so I'd say this is pretty well covered now. I have nothing against adding more detail on the algorithm, though. RVS (talk) 22:18, 1 September 2009 (UTC)[reply]

Connection to Support Vector Machines and Adaboost

Support Vector Machines and Adaboost are only slight variations on the idea of logistic regression. They fit nicely into the logistic regression framework, and this is a very enlightening/easy way to view them. However, I don't understand this point of view well yet myself. I think this would be good to include in this article, given the importance of SVMs and Adaboost.Singularitarian (talk) 09:03, 2 February 2010 (UTC)[reply]
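One common way to frame the connection raised here is through loss functions: with labels coded as −1/+1 and the margin m = y·f(x), logistic regression minimizes the logistic loss log(1 + e^(−m)), a soft-margin SVM minimizes the hinge loss max(0, 1 − m), and AdaBoost can be viewed as minimizing the exponential loss e^(−m). The sketch below only tabulates these three losses side by side; the function names and the margin values are chosen here for illustration, and it says nothing about how each method is actually fitted.

```python
import math

def logistic_loss(m):
    """Negative log-likelihood of logistic regression at margin m."""
    return math.log1p(math.exp(-m))

def hinge_loss(m):
    """Loss minimized by a soft-margin support vector machine."""
    return max(0.0, 1.0 - m)

def exponential_loss(m):
    """Loss that AdaBoost can be viewed as minimizing."""
    return math.exp(-m)

# Margins from badly misclassified (negative) to confidently correct.
for m in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"m={m:+.1f}  logistic={logistic_loss(m):.3f}  "
          f"hinge={hinge_loss(m):.3f}  exponential={exponential_loss(m):.3f}")
```

All three are convex and decrease as the margin grows; they differ mainly in how heavily they penalize large negative margins, with the exponential loss growing fastest.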
