Talk:Least-squares support vector machine: Difference between revisions


Revision as of 14:53, 8 July 2016

Statistics Start‑class

This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-class on Wikipedia's content assessment scale.
This article has not yet received a rating on the importance scale.

LS-SVM is really just Kernel Ridge Regression (a.k.a. Kernel Regularized Least Squares)

See, for example, Rifkin et al., "Regularized Least-Squares Classification", NATO-CSS 2003. Shouldn't this be mentioned? Also, the "support vector" in the name is misleading, since there is no concept of a support vector -- this algorithm doesn't induce sparsity, like the SVM does. Jotaf (talk) 19:24, 5 November 2012 (UTC)

Indeed, ordinary LS-SVM lacks the notion of support vectors altogether, but it was derived from the SVM optimization problem, so the name stuck. thisbugisonfire (talk) 14:53, 8 July 2016 (UTC)
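The two points above (the link to kernel ridge regression and the lack of sparsity) can be checked numerically. The following is only a minimal sketch under stated assumptions: it uses the function-estimation form of the LS-SVM linear system on ±1 labels, an RBF kernel, and synthetic data; the names `rbf_kernel`, `lssvm_fit`, `gamma` and `sigma` are illustrative and do not come from the article or from any library.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of A and the rows of B
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * sigma**2))

def lssvm_fit(K, y, gamma):
    # Solve the bordered system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

# Synthetic two-class data (assumption: any labelled data set would do)
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.where(X[:, 0] + 0.5 * rng.normal(size=40) > 0, 1.0, -1.0)

gamma = 10.0
K = rbf_kernel(X, X)
b, alpha = lssvm_fit(K, y, gamma)

# Kernel ridge regression with ridge 1/gamma, fitted to the bias-shifted targets
krr_alpha = np.linalg.solve(K + np.eye(len(y)) / gamma, y - b)

# Count coefficients that are not numerically zero
n_nonzero = int(np.sum(np.abs(alpha) > 1e-12))
print(np.allclose(alpha, krr_alpha), n_nonzero, len(y))
```

In this sketch the LS-SVM coefficients coincide with a kernel ridge regression solution once the bias term is subtracted from the targets, and every training point receives a nonzero coefficient, which is the sense in which there are no "support vectors".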

?

A few remarks:

The Least Squares Support Vector Machine link in atelier-projet is wrong (the hyphens are missing).
The kernel matrix link is wrong: should it be kernel_(matrix)?

Illustrations do not seem to go with this discussion

I believe the pictures fail to illustrate the difference between LS-SVM and SVM. Here is why: first, SVM gives the same result as LS-SVM when the data are separable, that is, when both:

$\xi _{i}=0,i=1,\ldots ,N$

and

$e_{c,i}=0,i=1,\ldots ,N$

Then the objective functions of the L1 (SVM) and L2 (LS-SVM) versions become the same. However, the illustrations appear to show the SVM separating the two classes perfectly. If the same kernel that perfectly separates the example (no classification error) is used for both methods, there should be no difference. Also, the LS-SVM appears to misclassify half of the points.
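For reference, the two primal problems being compared can be sketched as follows. The symbols $w$, $b$, $\varphi$, $y_{i}$, $C$ and $\gamma$ are not defined in this thread and are assumed here in their usual roles (weight vector, bias, feature map, labels in $\{-1,+1\}$, and the two regularization constants).

$\min _{w,b,\xi }\ {\tfrac {1}{2}}w^{T}w+C\sum _{i=1}^{N}\xi _{i}\quad {\text{s.t.}}\quad y_{i}\left(w^{T}\varphi (x_{i})+b\right)\geq 1-\xi _{i},\quad \xi _{i}\geq 0,\quad i=1,\ldots ,N$

$\min _{w,b,e_{c}}\ {\tfrac {1}{2}}w^{T}w+{\tfrac {\gamma }{2}}\sum _{i=1}^{N}e_{c,i}^{2}\quad {\text{s.t.}}\quad y_{i}\left(w^{T}\varphi (x_{i})+b\right)=1-e_{c,i},\quad i=1,\ldots ,N$

With every $\xi _{i}=0$ and every $e_{c,i}=0$, both objectives reduce to ${\tfrac {1}{2}}w^{T}w$. Note that in this form the constraints still differ: the SVM uses an inequality, while LS-SVM uses an equality.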

To demonstrate the difference between the two methods, a linear kernel should be used with data that are not linearly separable. Then give two examples: one with large outliers, where LS-SVM gives relatively more weight to those outliers than SVM does, and one where the outliers are at an algebraic distance of 1.0. SVM and LS-SVM ought to produce nearly identical results when the only misclassified samples have an error of 1.0, since $1^{2}=1$.
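The weighting argument can be made concrete by comparing the per-sample penalty terms directly. This is a rough sketch, not a full classifier: it ignores the regularization constants and the optimization itself, and the function names and the example margins $y_{i}f(x_{i})$ are chosen purely for illustration.

```python
import numpy as np

def hinge_slack(margin):
    # SVM slack xi = max(0, 1 - y*f(x)); it enters the objective linearly
    return np.maximum(0.0, 1.0 - margin)

def lssvm_sq_error(margin):
    # LS-SVM error e = 1 - y*f(x); it enters the objective squared
    return (1.0 - margin) ** 2

# Illustrative margins y*f(x): an error of 1, an error of 5 (large outlier),
# and a point lying well on the correct side of the margin
margins = np.array([0.0, -4.0, 3.0])

svm_terms = hinge_slack(margins)        # linear penalties
lssvm_terms = lssvm_sq_error(margins)   # squared penalties
print(svm_terms, lssvm_terms)
```

For an error of exactly 1 the two penalties agree, while for an error of 5 the squared term is five times larger than the linear one. The third margin shows a further difference in this sketch: the squared term also penalizes points that lie beyond the margin on the correct side, which the hinge term does not.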

@@ Line 5: / Line 5: @@
 See, for example, Rifkin et al., "Regularized Least-Squares Classification", NATO-CSS 2003. Shouldn't this be mentioned? Also, the "support vector" in the name is misleading, since there is no concept of a support vector -- this algorithm doesn't induce sparsity, like the SVM does.
 [[User:Jotaf|Jotaf]] ([[User talk:Jotaf|talk]]) 19:24, 5 November 2012 (UTC)
+Indeed, ordinary LS-SVM lacks the notion of support vectors altogether, but it was derived from the SVM optimization problem, so the name stuck.
+[[User:Thisbugisonfire|thisbugisonfire]] ([[User talk:Thisbugisonfire|talk]]) 14:53, 8 July 2016 (UTC)
 == ? ==