Scikit-learn: Difference between revisions

scikit-learn
Original author(s)	David Cournapeau
Initial release	June 2007; 17 years ago
Stable release	0.15.2 / September 4, 2014; 9 years ago
Repository	github.com/scikit-learn/scikit-learn
Written in	Python, Cython, C and C++
Operating system	Linux, Mac OS X, Microsoft Windows
Type	Library for machine learning
License	BSD License
Website	scikit-learn.org

Revision as of 15:54, 23 March 2015

scikit-learn (formerly scikits.learn) is an open source machine learning library for the Python programming language.^[2] It features various classification, regression and clustering algorithms including support vector machines, logistic regression, naive Bayes, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
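As a rough illustration of the NumPy interoperability described above, the following minimal sketch fits one classifier and one clustering model on plain NumPy arrays. The toy data and variable names are invented for this example, and the calls reflect recent scikit-learn releases rather than necessarily the 0.15.2 release listed in the infobox.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Made-up toy data (an assumption for illustration): two well-separated
# groups of 2-D points stored in an ordinary NumPy array.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Classification: fit on NumPy arrays; predict returns a NumPy array.
clf = LogisticRegression()
clf.fit(X, y)
predicted = clf.predict(np.array([[0.1, 0.1], [5.0, 5.0]]))

# Clustering: the same fit-style interface, but no labels are passed in.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = km.fit_predict(X)

print(predicted)
print(cluster_ids)
```

Both estimators accept and return NumPy arrays, so their results can be passed directly to other NumPy or SciPy routines.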

Overview

The scikit-learn project started as scikits.learn, a Google Summer of Code project by David Cournapeau. Its name stems from the notion that it is a "SciKit" (SciPy Toolkit), a separately developed and distributed third-party extension to SciPy.^[3] The original codebase was later extensively rewritten by other developers. Of the various scikits, scikit-learn and scikit-image were described as "well-maintained and popular" as of November 2012.^[4]

As of 2015, scikit-learn is under active development and is sponsored by INRIA and occasionally Google (through the Google Summer of Code).^[5] Among its users are Evernote, which uses the library to distinguish recipes from other user posts through a naive Bayes classifier,^[6] and Mendeley, which builds recommender systems from scikit-learn's SGD regression algorithm.^[7]
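To give a sense of what a naive Bayes text classifier of this kind can look like in scikit-learn, here is a minimal sketch. It is not Evernote's actual pipeline: the example notes, the labels, and the choice of word counts with a multinomial model are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical training notes and labels, invented for this sketch.
notes = [
    "mix flour sugar and butter then bake",
    "boil pasta and add tomato sauce",
    "meeting notes for the project deadline",
    "flight booking confirmation and hotel address",
]
labels = ["recipe", "recipe", "other", "other"]

# Turn each note into word counts, then fit a multinomial naive Bayes model.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(notes, labels)

# Classify two unseen notes.
guesses = model.predict(["bake flour sugar butter", "project meeting deadline"])
print(list(guesses))
```

A production system would need far more training data and careful evaluation; the sketch only shows how the pieces fit together.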

The scikit-learn API has been adopted by wise.io, who offer a proprietary implementation of random forests called wiseRF.^[8]^[9] wise.io's business partner Continuum IO claimed data throughput of up to 7.5 times that of scikit-learn's implementation;^[10] since then, the scikit-learn developers claim to have optimized their implementation to be competitive with wise.io's, except in terms of memory use.^[11]

Implementation

scikit-learn is largely written in Python, with some core algorithms written in Cython for performance. Support vector machines are implemented by a Cython wrapper around LIBSVM; logistic regression and linear support vector machines are implemented by a similar wrapper around LIBLINEAR.
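From the user's side, the wrapped libraries are hidden behind ordinary estimator classes. The sketch below, on made-up toy data, fits the three kinds of model mentioned above. It assumes a recent scikit-learn release, where `LogisticRegression` offers several solvers and LIBLINEAR has to be selected explicitly with `solver="liblinear"`; the behaviour of the 0.15.2 release may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC, LinearSVC

# Made-up toy data (an assumption for illustration): two separable groups.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])
X_new = np.array([[0.1, 0.1], [5.0, 5.0]])

# Each estimator exposes the same fit/predict interface regardless of
# which underlying library performs the optimization.
estimators = {
    "SVC (LIBSVM)": SVC(kernel="rbf"),
    "LinearSVC (LIBLINEAR)": LinearSVC(),
    "LogisticRegression (LIBLINEAR solver)": LogisticRegression(solver="liblinear"),
}

predictions = {}
for name, estimator in estimators.items():
    predictions[name] = estimator.fit(X, y).predict(X_new).tolist()
    print(name, predictions[name])
```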

References

[1] Andreas Müller. "scikit-learn 0.15.2". Python Package Index.
[2] Fabian Pedregosa; Gaël Varoquaux; Alexandre Gramfort; Vincent Michel; Bertrand Thirion; Olivier Grisel; Mathieu Blondel; Peter Prettenhofer; Ron Weiss; Vincent Dubourg; Jake Vanderplas; Alexandre Passos; David Cournapeau (2011). "Scikit-learn: Machine Learning in Python". Journal of Machine Learning Research. 12: 2825–2830.
[3] Dreijer, Janto. "scikit-learn".
[4] Eli Bressert (2012). SciPy and NumPy: an overview for developers. O'Reilly. p. 43.
[5] "About Us". http://scikit-learn.org. Retrieved 23 March 2015.
[6] Mark Ayzenshtat (22 January 2013). "Stay classified". Evernote Techblog. Retrieved 4 May 2013.
[7] Mark Levy (2013). Efficient Top-N Recommendation by Linear Regression. ACM RecSys Large Scale Recommender System workshop.
[8] "wiserf". wise.io. Retrieved 22 January 2014.
[9] API design for machine learning software: experiences from the scikit-learn project. ECML PKDD Workshop on Languages for Machine Learning. 2013.
[10] Joseph W. Richards (27 November 2012). "wiseRF Use Cases and Benchmarks". Continuum IO. Retrieved 22 January 2014.
[11] Gaël Varoquaux (8 August 2013). "Scikit-learn 0.14 release: features and benchmarks". Retrieved 22 January 2014.


@@ Line 63: / Line 63: @@
 }}</ref>
-{{As of|2013}}, scikit-learn is under active development and is sponsored by [[INRIA]] and occasionally [[Google]] (through the Google Summer of Code).<ref>{{cite web|title=About Us|url=http://scikit-learn.org/0.13/about.html#funding|publisher=http://scikit-learn.org|accessdate=3 May 2013}}</ref>
+{{As of|2015}}, scikit-learn is under active development and is sponsored by [[INRIA]] and occasionally [[Google]] (through the Google Summer of Code).<ref>{{cite web|title=About Us|url=http://scikit-learn.org/0.13/about.html#funding|publisher=http://scikit-learn.org|accessdate=23 March 2015}}</ref>
 Among its users are [[Evernote]], which uses the library to distinguish recipes from other user posts through a naive Bayes classifier,<ref>{{cite web|title=Stay classified|author=Mark Ayzenshtat|date=22 January 2013|accessdate=4 May 2013|url=http://blog.evernote.com/tech/2013/01/22/stay-classified/|website=Evernote Techblog}}</ref>
 and [[Mendeley]], which builds [[recommender system]]s from scikit-learn's [[Stochastic gradient descent|SGD]] regression algorithm.<ref>{{cite conference |title=Efficient Top-N Recommendation by Linear Regression |url=http://www.slideshare.net/MarkLevy/efficient-slides |author=Mark Levy |year=2013 |conference=ACM RecSys Large Scale Recommender System workshop}}</ref>
