Timeline of machine learning: Difference between revisions

From Wikipedia, the free encyclopedia
FrescoBot (talk | contribs)
m Bot: link syntax and minor changes
m Partial reference cleanup. Cleaned up using AutoEd, General formatting by script
Line 16: Line 16:
| 1980s || Rediscovery of [[backpropagation]] causes a resurgence in machine learning research.
| 1980s || Rediscovery of [[backpropagation]] causes a resurgence in machine learning research.
|-
|-
| 1990s || Work on machine learning shifts from a knowledge-driven approach to a data-driven approach. Scientists begin creating programs for computers to analyze large amounts of data and draw conclusions{{snd}} or “learn”{{snd}} from the results.<ref>{{cite web|last1=Marr|first1=Marr|title=A Short History of Machine Learning - Every Manager Should Read|url=https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref> [[Support vector machines]] (SVMs) and [[recurrent neural networks]] (RNNs) become popular.
| 1990s || Work on machine learning shifts from a knowledge-driven approach to a data-driven approach. Scientists begin creating programs for computers to analyze large amounts of data and draw conclusions{{snd}} or "learn"{{snd}} from the results.<ref name="Marr">{{cite news|last1=Marr|first1=Bernard|title=A Short History of Machine Learning Every Manager Should Read|url=https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|work=Forbes|accessdate=28 Sep 2016}}</ref> [[Support vector machines]] (SVMs) and [[recurrent neural networks]] (RNNs) become popular.
|-
|-
| 2000s || [[Kernel methods]] grow in popularity,<ref>Hofmann, Thomas, Bernhard Schölkopf, and Alexander J. Smola. "Kernel methods in machine learning." The annals of statistics (2008): 1171–1220.</ref> and competitive machine learning becomes more widespread.<ref>{{cite web |first1=James |last1=Bennett |first2=Stan |last2=Lanning |title=The netflix prize |journal=Proceedings of KDD Cup and Workshop 2007 |date=2007 |url=https://www.cs.uic.edu/~liub/KDD-cup-2007/NetflixPrize-description.pdf}}</ref>
| 2000s || [[Kernel methods]] grow in popularity,<ref>{{cite journal |last1=Hofmann |first1=Thomas |first2=Bernhard |last2=Schölkopf |first3=Alexander J. |last3=Smola |title=Kernel methods in machine learning |journal=The Annals of Statistics |volume=36 |issue=3 |year=2008 |pages=1171–1220 |jstor=25464664}}</ref> and competitive machine learning becomes more widespread.<ref>{{cite web |first1=James |last1=Bennett |first2=Stan |last2=Lanning |title=The netflix prize |journal=Proceedings of KDD Cup and Workshop 2007 |date=2007 |url=https://www.cs.uic.edu/~liub/KDD-cup-2007/NetflixPrize-description.pdf}}</ref>
|-
|-
| 2010s || [[Deep learning]] becomes feasible, which leads to machine learning becoming integral to many widely used software services and applications.
| 2010s || [[Deep learning]] becomes feasible, which leads to machine learning becoming integral to many widely used software services and applications.
Line 32: Line 32:
! Year !! Event type !! Caption !! Event
! Year !! Event type !! Caption !! Event
|-
|-
| 1763 || Discovery || The Underpinnings of [[Bayes' theorem|Bayes' Theorem]] || [[Thomas Bayes]]'s work ''[[An Essay towards solving a Problem in the Doctrine of Chances]]'' is published two years after his death, having been amended and edited by a friend of Bayes, [[Richard Price]].<ref>{{cite journal|last1=Bayes|first1=Thomas|title=An Essay towards solving a Problem in the Doctrine of Chances|journal=Philosophical Transactions|date=1 January 1763|volume=53|pages=370–418|doi=10.1098/rstl.1763.0053|url=http://rstl.royalsocietypublishing.org/content/53/370.full.pdf|accessdate=15 June 2016}}</ref> The essay presents work that underpins [[Bayes' theorem]].
| 1763 || Discovery || The Underpinnings of [[Bayes' theorem|Bayes' Theorem]] || [[Thomas Bayes]]'s work ''[[An Essay towards solving a Problem in the Doctrine of Chances]]'' is published two years after his death, having been amended and edited by a friend of Bayes, [[Richard Price]].<ref>{{cite journal|last1=Bayes|first1=Thomas|title=An Essay towards solving a Problem in the Doctrine of Chances|journal=Philosophical Transactions|date=1 January 1763|volume=53|pages=370–418|doi=10.1098/rstl.1763.0053|url=http://rstl.royalsocietypublishing.org/content/53/370.full.pdf|accessdate=15 June 2016|jstor=105741}}</ref> The essay presents work that underpins [[Bayes' theorem]].
|-
|-
| 1805 || Discovery || Least Squares || [[Adrien-Marie Legendre]] describes the "méthode des moindres carrés", known in English as the [[least squares]] method.<ref>{{cite book|last1=Legendre|first1=Adrien-Marie|title=Nouvelles méthodes pour la détermination des orbites des comètes|date=1805|publisher=Firmin Didot|location=Paris|page=viii|url=https://books.google.com/books/about/Nouvelles_m%C3%A9thodes_pour_la_d%C3%A9terminati.html?id=FRcOAAAAQAAJ&redir_esc=y|accessdate=13 June 2016|language=French}}</ref> The least squares method is used widely in [[data fitting]].
| 1805 || Discovery || Least Squares || [[Adrien-Marie Legendre]] describes the "méthode des moindres carrés", known in English as the [[least squares]] method.<ref>{{cite book|last1=Legendre|first1=Adrien-Marie|title=Nouvelles méthodes pour la détermination des orbites des comètes|date=1805|publisher=Firmin Didot|location=Paris|page=viii|url=https://books.google.com/books/about/Nouvelles_m%C3%A9thodes_pour_la_d%C3%A9terminati.html?id=FRcOAAAAQAAJ&redir_esc=y|accessdate=13 June 2016|language=French}}</ref> The least squares method is used widely in [[data fitting]].
Line 38: Line 38:
| 1812 || || [[Bayes' theorem|Bayes' Theorem]] || [[Pierre-Simon Laplace]] publishes ''Théorie Analytique des Probabilités'', in which he expands upon the work of Bayes and defines what is now known as [[Bayes' Theorem]].<ref>{{cite web|last1=O'Connor|first1=J J|last2=Robertson|first2=E F|title=Pierre-Simon Laplace|url=http://www-history.mcs.st-and.ac.uk/Biographies/Laplace.html|publisher=School of Mathematics and Statistics, University of St Andrews, Scotland|accessdate=15 June 2016}}</ref>
| 1812 || || [[Bayes' theorem|Bayes' Theorem]] || [[Pierre-Simon Laplace]] publishes ''Théorie Analytique des Probabilités'', in which he expands upon the work of Bayes and defines what is now known as [[Bayes' Theorem]].<ref>{{cite web|last1=O'Connor|first1=J J|last2=Robertson|first2=E F|title=Pierre-Simon Laplace|url=http://www-history.mcs.st-and.ac.uk/Biographies/Laplace.html|publisher=School of Mathematics and Statistics, University of St Andrews, Scotland|accessdate=15 June 2016}}</ref>
|-
|-
| 1913 || Discovery || Markov Chains || [[Andrey Markov]] first describes techniques he used to analyse a poem. The techniques later become known as [[Markov chains]].<ref>{{cite journal|last1=Hayes|first1=Brian|title=First Links in the Markov Chain|url=http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain/|accessdate=15 June 2016|work=American Scientist|issue=March–April 2013|publisher=Sigma Xi, The Scientific Research Society|page=92|doi=10.1511/2013.101.1|quote=Delving into the text of Alexander Pushkin’s novel in verse Eugene Onegin, Markov spent hours sifting through patterns of vowels and consonants. On January 23, 1913, he summarized his findings in an address to the Imperial Academy of Sciences in St. Petersburg. His analysis did not alter the understanding or appreciation of Pushkin’s poem, but the technique he developed—now known as a Markov chain—extended the theory of probability in a new direction.|volume=101}}</ref>
| 1913 || Discovery || Markov Chains || [[Andrey Markov]] first describes techniques he used to analyse a poem. The techniques later become known as [[Markov chains]].<ref>{{cite journal|last1=Hayes|first1=Brian|title=First Links in the Markov Chain|url=http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain/|accessdate=15 June 2016|work=American Scientist|issue=March–April 2013|publisher=Sigma Xi, The Scientific Research Society|page=92|doi=10.1511/2013.101.1|quote=Delving into the text of Alexander Pushkin's novel in verse Eugene Onegin, Markov spent hours sifting through patterns of vowels and consonants. On January 23, 1913, he summarized his findings in an address to the Imperial Academy of Sciences in St. Petersburg. His analysis did not alter the understanding or appreciation of Pushkin's poem, but the technique he developed—now known as a Markov chain—extended the theory of probability in a new direction.|volume=101}}</ref>
|-
|-
| 1950 || || Turing's Learning Machine || [[Alan Turing]] proposes a 'learning machine' that could learn and become artificially intelligent. Turing's specific proposal foreshadows [[genetic algorithms]].<ref>{{cite journal|last1=Turing|first1=Alan|title=COMPUTING MACHINERY AND INTELLIGENCE|journal=MIND|date=October 1950|volume=59|issue=236|pages=433–460|doi=10.1093/mind/LIX.236.433|url=http://mind.oxfordjournals.org/content/LIX/236/433|accessdate=8 June 2016}}</ref>
| 1950 || || Turing's Learning Machine || [[Alan Turing]] proposes a 'learning machine' that could learn and become artificially intelligent. Turing's specific proposal foreshadows [[genetic algorithms]].<ref>{{cite journal|last1=Turing|first1=Alan|title=Computing Machinery and Intelligence|journal=Mind|date=October 1950|volume=59|issue=236|pages=433–460|doi=10.1093/mind/LIX.236.433|url=http://mind.oxfordjournals.org/content/LIX/236/433|accessdate=8 June 2016}}</ref>
|-
|-
| 1951 || || First Neural Network Machine || [[Marvin Minsky]] and Dean Edmonds build the [[Stochastic neural analog reinforcement calculator|SNARC]], the first neural network machine able to learn.<ref>{{Harvnb|Crevier|1993|pp=34–35}} and {{Harvnb|Russell|Norvig|2003|p=17}}</ref>
| 1951 || || First Neural Network Machine || [[Marvin Minsky]] and Dean Edmonds build the [[Stochastic neural analog reinforcement calculator|SNARC]], the first neural network machine able to learn.<ref>{{Harvnb|Crevier|1993|pp=34–35}} and {{Harvnb|Russell|Norvig|2003|p=17}}</ref>
Line 50: Line 50:
| 1963 || Achievement || Machines Playing Tic-Tac-Toe || [[Donald Michie]] creates a 'machine' consisting of 304 match boxes and beads, which uses [[reinforcement learning]] to play [[Tic-tac-toe]] (also known as noughts and crosses).<ref>{{cite web|last1=Child|first1=Oliver|title=Menace: the Machine Educable Noughts And Crosses Engine Read|url=http://chalkdustmagazine.com/features/menace-machine-educable-noughts-crosses-engine/#more-3326|website=Chalkdust Magazine |accessdate=16 Jan 2018}}</ref>
| 1963 || Achievement || Machines Playing Tic-Tac-Toe || [[Donald Michie]] creates a 'machine' consisting of 304 match boxes and beads, which uses [[reinforcement learning]] to play [[Tic-tac-toe]] (also known as noughts and crosses).<ref>{{cite web|last1=Child|first1=Oliver|title=Menace: the Machine Educable Noughts And Crosses Engine Read|url=http://chalkdustmagazine.com/features/menace-machine-educable-noughts-crosses-engine/#more-3326|website=Chalkdust Magazine |accessdate=16 Jan 2018}}</ref>
|-
|-
| 1967 || || Nearest Neighbor || The nearest neighbor algorithm is created, marking the start of basic pattern recognition. The algorithm is used to map routes.<ref>{{cite web|last1=Marr|first1=Marr|title=A Short History of Machine Learning - Every Manager Should Read|url=https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref>
| 1967 || || Nearest Neighbor || The nearest neighbor algorithm is created, marking the start of basic pattern recognition. The algorithm is used to map routes.<ref name="Marr" />
|-
|-
| 1969 || || Limitations of Neural Networks || [[Marvin Minsky]] and [[Seymour Papert]] publish their book ''[[Perceptrons (book)|Perceptrons]]'', describing some of the limitations of perceptrons and neural networks. The interpretation that the book shows that neural networks are fundamentally limited is seen as a hindrance for research into neural networks.<ref>{{cite web|last1=Cohen|first1=Harvey|title=The Perceptron|url=http://harveycohen.net/image/perceptron.html|accessdate=5 June 2016}}</ref><ref>{{cite web|last1=Colner|first1=Robert|title=A brief history of machine learning|url=http://www.slideshare.net/bobcolner/a-brief-history-of-machine-learning|website=SlideShare|accessdate=5 June 2016}}</ref>
| 1969 || || Limitations of Neural Networks || [[Marvin Minsky]] and [[Seymour Papert]] publish their book ''[[Perceptrons (book)|Perceptrons]]'', describing some of the limitations of perceptrons and neural networks. The interpretation that the book shows that neural networks are fundamentally limited is seen as a hindrance for research into neural networks.<ref>{{cite web|last1=Cohen|first1=Harvey|title=The Perceptron|url=http://harveycohen.net/image/perceptron.html|accessdate=5 June 2016}}</ref><ref>{{cite web|last1=Colner|first1=Robert|title=A brief history of machine learning|url=http://www.slideshare.net/bobcolner/a-brief-history-of-machine-learning|website=SlideShare|accessdate=5 June 2016}}</ref>
|-
|-
| 1970 || || Automatic Differentiation (Backpropagation) || [[Seppo Linnainmaa]] publishes the general method for automatic differentiation (AD) of discrete connected networks of nested differentiable functions.<ref name="lin1970">[[Seppo Linnainmaa]] (1970). The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master's Thesis (in Finnish), Univ. Helsinki, 6–7.</ref><ref name="lin1976">[[Seppo Linnainmaa]] (1976). Taylor expansion of the accumulated rounding error. BIT Numerical Mathematics, 16(2), 146–160.</ref> This corresponds to the modern version of backpropagation, but is not yet named as such.<ref name="grie2012">Griewank, Andreas (2012). Who Invented the Reverse Mode of Differentiation?. Optimization Stories, Documenta Mathematica, Extra Volume ISMP (2012), 389–400.</ref><ref name="grie2008">Griewank, Andreas and Walther, A.. Principles and Techniques of Algorithmic Differentiation, Second Edition. SIAM, 2008.</ref><ref name="schmidhuber2015">[[Jürgen Schmidhuber|Schmidhuber, Jürgen]] (2015). Deep learning in neural networks: An overview. Neural Networks 61 (2015): 85–117. [https://arxiv.org/abs/1404.7828 ArXiv]</ref><ref name="scholarpedia2015">[[Jürgen Schmidhuber|Schmidhuber, Jürgen]] (2015). Deep Learning. Scholarpedia, 10(11):32832. [http://www.scholarpedia.org/article/Deep_Learning#Backpropagation Section on Backpropagation]</ref>
| 1970 || || Automatic Differentiation (Backpropagation) || [[Seppo Linnainmaa]] publishes the general method for automatic differentiation (AD) of discrete connected networks of nested differentiable functions.<ref name="lin1970">[[Seppo Linnainmaa]] (1970). "The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors." Master's Thesis (in Finnish), Univ. Helsinki, 6–7.</ref><ref name="lin1976">{{cite journal |first=Seppo |last=Linnainmaa |authorlink=Seppo Linnainmaa |year=1976 |title=Taylor expansion of the accumulated rounding error |journal=BIT Numerical Mathematics |volume=16 |issue=2 |pages=146–160 |doi=10.1007/BF01931367}}</ref> This corresponds to the modern version of backpropagation, but is not yet named as such.<ref name="grie2012">{{cite journal |last=Griewank |first=Andreas |year=2012 |title=Who Invented the Reverse Mode of Differentiation? |journal=Documenta Mathematica, Extra Volume ISMP |pages=389–400}}</ref><ref name="grie2008">Griewank, Andreas and Walther, A. ''Principles and Techniques of Algorithmic Differentiation, Second Edition''. SIAM, 2008.</ref><ref name="schmidhuber2015">{{cite journal |authorlink=Jürgen Schmidhuber |last=Schmidhuber |first=Jürgen |year=2015 |title=Deep learning in neural networks: An overview |journal=Neural Networks |volume=61 |pages=85–117 |arxiv=1404.7828}}</ref><ref name="scholarpedia2015">[[Jürgen Schmidhuber|Schmidhuber, Jürgen]] (2015). Deep Learning. Scholarpedia, 10(11):32832. [http://www.scholarpedia.org/article/Deep_Learning#Backpropagation Section on Backpropagation]</ref>
|-
|-
|1972
|1972
|Discovery
|Discovery
|Term frequency–inverse document frequency (TF-IDF)
|Term frequency–inverse document frequency (TF-IDF)
|[[Karen Spärck Jones]] publishes the concept of [[Tf–idf|TF-IDF]], a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus<ref>{{Cite journal|date=1973-11-01|title=Index term weighting|url=https://www.sciencedirect.com/science/article/pii/0020027173900430|journal=Information Storage and Retrieval|language=en|volume=9|issue=11|pages=619–633|doi=10.1016/0020-0271(73)90043-0|issn=0020-0271}}</ref>. 83% of text-based recommender systems in the domain of digital libraries use tf-idf<ref>{{Cite journal|last=Beel|first=Joeran|last2=Gipp|first2=Bela|last3=Langer|first3=Stefan|last4=Breitinger|first4=Corinna|date=2016-11-01|title=Research-paper recommender systems: a literature survey|url=https://link.springer.com/article/10.1007/s00799-015-0156-0|journal=International Journal on Digital Libraries|language=en|volume=17|issue=4|pages=305–338|doi=10.1007/s00799-015-0156-0|issn=1432-5012}}</ref>.
|[[Karen Spärck Jones]] publishes the concept of [[Tf–idf|TF-IDF]], a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus.<ref>{{Cite journal | last1 = Spärck Jones | first1 = K. | authorlink1 = Karen Spärck Jones| title = Index term weighting | doi = 10.1016/0020-0271(73)90043-0 | journal = Information Storage and Retrieval | volume = 9 | issue = 11 | pages = 619–633 | year = 1973 |url=https://www.sciencedirect.com/science/article/pii/0020027173900430}}</ref> 83% of text-based recommender systems in the domain of digital libraries use tf-idf.<ref>{{Cite journal|last=Beel|first=Joeran|last2=Gipp|first2=Bela|last3=Langer|first3=Stefan|last4=Breitinger|first4=Corinna|date=2016-11-01|title=Research-paper recommender systems: a literature survey|url=https://link.springer.com/article/10.1007/s00799-015-0156-0|journal=International Journal on Digital Libraries|language=en|volume=17|issue=4|pages=305–338|doi=10.1007/s00799-015-0156-0|issn=1432-5012}}</ref>
|-
|-
| 1979 || || Stanford Cart || Students at Stanford University develop a cart that can navigate and avoid obstacles in a room.<ref>{{cite web|last1=Marr|first1=Marr|title=A Short History of Machine Learning - Every Manager Should Read|url=https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref>
| 1979 || || Stanford Cart || Students at Stanford University develop a cart that can navigate and avoid obstacles in a room.<ref name="Marr" />
|-
|-
| 1980 || Discovery || Neocognitron || [[Kunihiko Fukushima]] first publishes his work on the [[neocognitron]], a type of [[artificial neural network]] (ANN).<ref>{{cite journal|last1=Fukushima|first1=Kunihiko|title=Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position|journal=Biological Cybernetics|date=1980|volume=36|pages=193–202|url=http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf|accessdate=5 June 2016|doi=10.1007/bf00344251|pmid=7370364}}</ref> The [[neocognitron]] later inspires [[convolutional neural network]]s (CNNs).<ref>{{cite web|last1=Le Cun|first1=Yann|title=Deep Learning|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.297.6176&rep=rep1&type=pdf|accessdate=5 June 2016}}</ref>
| 1980 || Discovery || Neocognitron || [[Kunihiko Fukushima]] first publishes his work on the [[neocognitron]], a type of [[artificial neural network]] (ANN).<ref>{{cite journal|last1=Fukushima|first1=Kunihiko|title=Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position|journal=Biological Cybernetics|date=1980|volume=36|pages=193–202|url=http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf|accessdate=5 June 2016|doi=10.1007/bf00344251|pmid=7370364}}</ref> The [[neocognitron]] later inspires [[convolutional neural network]]s (CNNs).<ref>{{cite journal|last1=Le Cun|first1=Yann|title=Deep Learning|citeseerx=10.1.1.297.6176}}</ref>
|-
|-
| 1981 || || Explanation Based Learning || Gerald Dejong introduces Explanation Based Learning, where a computer algorithm analyses data, creates a general rule it can follow, and discards unimportant data.<ref>{{cite web|last1=Marr|first1=Marr|title=A Short History of Machine Learning - Every Manager Should Read|url=https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref>
| 1981 || || Explanation Based Learning || Gerald Dejong introduces Explanation Based Learning, where a computer algorithm analyses data, creates a general rule it can follow, and discards unimportant data.<ref name="Marr" />
|-
|-
| 1982 || Discovery || Recurrent Neural Network || [[John Hopfield]] popularizes [[Hopfield networks]], a type of [[recurrent neural network]] that can serve as [[content-addressable memory]] systems.<ref>{{cite journal|last1=Hopfield|first1=John|title=Neural networks and physical systems with emergent collective computational abilities|journal=Proceedings of the National Academy of Sciences of the United States of America|date=April 1982|volume=79|pages=2554–2558|url=http://www.pnas.org/content/79/8/2554.full.pdf|accessdate=8 June 2016|doi=10.1073/pnas.79.8.2554|pmid=6953413|pmc=346238}}</ref>
| 1982 || Discovery || Recurrent Neural Network || [[John Hopfield]] popularizes [[Hopfield networks]], a type of [[recurrent neural network]] that can serve as [[content-addressable memory]] systems.<ref>{{cite journal|last1=Hopfield|first1=John|title=Neural networks and physical systems with emergent collective computational abilities|journal=Proceedings of the National Academy of Sciences of the United States of America|date=April 1982|volume=79|pages=2554–2558|url=http://www.pnas.org/content/79/8/2554.full.pdf|accessdate=8 June 2016|doi=10.1073/pnas.79.8.2554|pmid=6953413|pmc=346238}}</ref>
|-
|-
| 1985 || || NetTalk || Terry Sejnowski develops a program that learns to pronounce words the same way a baby does.<ref>{{cite web|last1=Marr|first1=Marr|title=A Short History of Machine Learning - Every Manager Should Read|url=https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref>
| 1985 || || NetTalk || Terry Sejnowski develops a program that learns to pronounce words the same way a baby does.<ref name="Marr" />
|-
|-
| 1986 || Discovery || Backpropagation || The process of [[backpropagation]] is described by [[David Rumelhart]], [[Geoff Hinton]] and [[Ronald J. Williams]].<ref>{{cite journal|last1=Rumelhart|first1=David|last2=Hinton|first2=Geoffrey|last3=Williams|first3=Ronald|title=Learning representations by back-propagating errors|journal=Nature|date=9 October 1986|volume=323|pages=533–536|url=http://elderlab.yorku.ca/~elder/teaching/cosc6390psyc6225/readings/hinton%201986.pdf|accessdate=5 June 2016|doi=10.1038/323533a0}}</ref>
| 1986 || Discovery || Backpropagation || The process of [[backpropagation]] is described by [[David Rumelhart]], [[Geoff Hinton]] and [[Ronald J. Williams]].<ref>{{cite journal|last1=Rumelhart|first1=David|last2=Hinton|first2=Geoffrey|last3=Williams|first3=Ronald|title=Learning representations by back-propagating errors|journal=Nature|date=9 October 1986|volume=323|pages=533–536|url=http://elderlab.yorku.ca/~elder/teaching/cosc6390psyc6225/readings/hinton%201986.pdf|accessdate=5 June 2016|doi=10.1038/323533a0}}</ref>
Line 83: Line 83:
| 1995 || Discovery || Support Vector Machines || [[Corinna Cortes]] and [[Vladimir Vapnik]] publish their work on [[support vector machines]].<ref name="bhml">{{cite web|last1=Golge|first1=Eren|title=BRIEF HISTORY OF MACHINE LEARNING|url=http://www.erogol.com/brief-history-machine-learning/|website=A Blog From a Human-engineer-being|accessdate=5 June 2016}}</ref><ref>{{cite journal|last1=Cortes|first1=Corinna|last2=Vapnik|first2=Vladimir|title=Support-vector networks|journal=Machine Learning|date=September 1995|volume=20|issue=3|pages=273–297|doi=10.1007/BF00994018|url=http://download.springer.com/static/pdf/467/art%253A10.1007%252FBF00994018.pdf?originUrl=http%3A%2F%2Flink.springer.com%2Farticle%2F10.1007%2FBF00994018&token2=exp=1465109699~acl=%2Fstatic%2Fpdf%2F467%2Fart%25253A10.1007%25252FBF00994018.pdf%3ForiginUrl%3Dhttp%253A%252F%252Flink.springer.com%252Farticle%252F10.1007%252FBF00994018*~hmac=133f5211871b237411d6dcc05047fc16cdc99abc25ab4e74be863808ea53bfd7|accessdate=5 June 2016|publisher=Kluwer Academic Publishers|issn=0885-6125}}</ref>
| 1995 || Discovery || Support Vector Machines || [[Corinna Cortes]] and [[Vladimir Vapnik]] publish their work on [[support vector machines]].<ref name="bhml">{{cite web|last1=Golge|first1=Eren|title=BRIEF HISTORY OF MACHINE LEARNING|url=http://www.erogol.com/brief-history-machine-learning/|website=A Blog From a Human-engineer-being|accessdate=5 June 2016}}</ref><ref>{{cite journal|last1=Cortes|first1=Corinna|last2=Vapnik|first2=Vladimir|title=Support-vector networks|journal=Machine Learning|date=September 1995|volume=20|issue=3|pages=273–297|doi=10.1007/BF00994018|url=http://download.springer.com/static/pdf/467/art%253A10.1007%252FBF00994018.pdf?originUrl=http%3A%2F%2Flink.springer.com%2Farticle%2F10.1007%2FBF00994018&token2=exp=1465109699~acl=%2Fstatic%2Fpdf%2F467%2Fart%25253A10.1007%25252FBF00994018.pdf%3ForiginUrl%3Dhttp%253A%252F%252Flink.springer.com%252Farticle%252F10.1007%252FBF00994018*~hmac=133f5211871b237411d6dcc05047fc16cdc99abc25ab4e74be863808ea53bfd7|accessdate=5 June 2016|publisher=Kluwer Academic Publishers|issn=0885-6125}}</ref>
|-
|-
| 1997 || Achievement || IBM Deep Blue Beats Kasparov || IBM’s [[Deep_Blue_(chess_computer)|Deep Blue]] beats the world champion at chess.<ref>{{cite web|last1=Marr|first1=Marr|title=A Short History of Machine Learning - Every Manager Should Read|url=https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2a1a75f9323f|website=Forbes|accessdate=28 Sep 2016}}</ref>
| 1997 || Achievement || IBM Deep Blue Beats Kasparov || IBM's [[Deep Blue (chess computer)|Deep Blue]] beats the world champion at chess.<ref name="Marr" />
|-
|-
| 1997 || Discovery || LSTM || [[Sepp Hochreiter]] and [[Jürgen Schmidhuber]] invent [[long short-term memory]] (LSTM) recurrent neural networks,<ref>{{cite journal|last1=Hochreiter|first1=Sepp|last2=Schmidhuber|first2=Jürgen|title=LONG SHORT-TERM MEMORY|journal=Neural Computation|date=1997|volume=9|issue=8|pages=1735–1780|url=http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf|doi=10.1162/neco.1997.9.8.1735|pmid=9377276}}</ref> greatly improving the efficiency and practicality of recurrent neural networks.
| 1997 || Discovery || LSTM || [[Sepp Hochreiter]] and [[Jürgen Schmidhuber]] invent [[long short-term memory]] (LSTM) recurrent neural networks,<ref>{{cite journal|last1=Hochreiter|first1=Sepp|last2=Schmidhuber|first2=Jürgen|title=LONG SHORT-TERM MEMORY|journal=Neural Computation|date=1997|volume=9|issue=8|pages=1735–1780|url=http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf|doi=10.1162/neco.1997.9.8.1735|pmid=9377276}}</ref> greatly improving the efficiency and practicality of recurrent neural networks.
Line 96: Line 96:
|Achievement
|Achievement
|ImageNet
|ImageNet
|[[ImageNet]] is created. ImageNet is a large visual database envisioned by [[Fei-Fei Li]] from Stanford University, who realized that the best machine learning algorithms wouldn't work well if the data didn't reflect the real world<ref>{{Cite web|url=https://qz.com/1034972/the-data-that-changed-the-direction-of-ai-research-and-possibly-the-world/|title=ImageNet: the data that spawned the current AI boom — Quartz|last=Gershgorn|first=Dave|website=qz.com|language=en-US|access-date=2018-03-30}}</ref>. For many, ImageNet was the catalyst for the AI boom<ref>{{Cite news|url=https://www.nytimes.com/2016/07/19/technology/reasons-to-believe-the-ai-boom-is-real.html|title=Reasons to Believe the A.I. Boom Is Real|last=Hardy|first=Quentin|date=2016-07-18|work=The New York Times|access-date=2018-03-30|language=en-US|issn=0362-4331}}</ref> of the 21st century.
|[[ImageNet]] is created. ImageNet is a large visual database envisioned by [[Fei-Fei Li]] from Stanford University, who realized that the best machine learning algorithms wouldn't work well if the data didn't reflect the real world.<ref>{{Cite web|url=https://qz.com/1034972/the-data-that-changed-the-direction-of-ai-research-and-possibly-the-world/|title=ImageNet: the data that spawned the current AI boom — Quartz|last=Gershgorn|first=Dave|website=qz.com|language=en-US|access-date=2018-03-30}}</ref> For many, ImageNet was the catalyst for the AI boom<ref>{{Cite news|url=https://www.nytimes.com/2016/07/19/technology/reasons-to-believe-the-ai-boom-is-real.html|title=Reasons to Believe the A.I. Boom Is Real|last=Hardy|first=Quentin|date=2016-07-18|work=The New York Times|access-date=2018-03-30|language=en-US|issn=0362-4331}}</ref> of the 21st century.
|-
|-
| 2010 || || Kaggle Competition || [[Kaggle]], a website that serves as a platform for machine learning competitions, is launched.<ref>{{cite web|title=About|url=https://www.kaggle.com/about|website=Kaggle|publisher=Kaggle Inc|accessdate=16 June 2016}}</ref>
| 2010 || || Kaggle Competition || [[Kaggle]], a website that serves as a platform for machine learning competitions, is launched.<ref>{{cite web|title=About|url=https://www.kaggle.com/about|website=Kaggle|publisher=Kaggle Inc|accessdate=16 June 2016}}</ref>
|-
|-
| 2011 || Achievement || Beating Humans in Jeopardy || Using a combination of machine learning, [[natural language processing]] and information retrieval techniques, [[IBM]]'s [[Watson (computer)|Watson]] beats two human champions in a [[Jeopardy!]] competition.<ref>{{cite news|last1=Markoff|first1=John|title=Computer Wins on ‘Jeopardy!: Trivial, It’s Not|url=https://www.nytimes.com/2011/02/17/science/17jeopardy-watson.html?pagewanted=all&_r=0|accessdate=5 June 2016|work=New York Times|date=17 February 2011|page=A1}}</ref>
| 2011 || Achievement || Beating Humans in Jeopardy || Using a combination of machine learning, [[natural language processing]] and information retrieval techniques, [[IBM]]'s [[Watson (computer)|Watson]] beats two human champions in a [[Jeopardy!]] competition.<ref>{{cite news|last1=Markoff|first1=John|title=Computer Wins on 'Jeopardy!': Trivial, It's Not|url=https://www.nytimes.com/2011/02/17/science/17jeopardy-watson.html?pagewanted=all&_r=0|accessdate=5 June 2016|work=New York Times|date=17 February 2011|page=A1}}</ref>
|-
|-
| 2012 || Achievement || Recognizing Cats on YouTube || The [[Google Brain]] team, led by [[Andrew Ng]] and [[Jeff Dean (computer scientist) | Jeff Dean]], create a neural network that learns to recognize cats by watching unlabeled images taken from frames of [[YouTube]] videos.<ref>{{cite journal|last1=Le|first1=Quoc|last2=Ranzato|first2=Marc’Aurelio|last3=Monga|first3=Rajat|last4=Devin|first4=Matthieu|last5=Chen|first5=Kai|last6=Corrado|first6=Greg|last7=Dean|first7=Jeff|last8=Ng|first8=Andrew|title=Building High-level Features Using Large Scale Unsupervised Learning|journal=CoRR|date=12 July 2012|arxiv=1112.6209}}</ref><ref>{{cite news|last1=Markoff|first1=John|title=How Many Computers to Identify a Cat? 16,000|url=https://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html|accessdate=5 June 2016|work=New York Times|date=26 June 2012|page=B1}}</ref>
| 2012 || Achievement || Recognizing Cats on YouTube || The [[Google Brain]] team, led by [[Andrew Ng]] and [[Jeff Dean (computer scientist)|Jeff Dean]], creates a neural network that learns to recognize cats by watching unlabeled images taken from frames of [[YouTube]] videos.<ref>{{cite journal|last1=Le|first1=Quoc|last2=Ranzato|first2=Marc'Aurelio|last3=Monga|first3=Rajat|last4=Devin|first4=Matthieu|last5=Chen|first5=Kai|last6=Corrado|first6=Greg|last7=Dean|first7=Jeff|last8=Ng|first8=Andrew|title=Building High-level Features Using Large Scale Unsupervised Learning|journal=CoRR|date=12 July 2012|arxiv=1112.6209}}</ref><ref>{{cite news|last1=Markoff|first1=John|title=How Many Computers to Identify a Cat? 16,000|url=https://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html|accessdate=5 June 2016|work=New York Times|date=26 June 2012|page=B1}}</ref>
|-
| 2014 || || Leap in Face Recognition || [[Facebook]] researchers publish their work on [[DeepFace]], a system that uses neural networks to identify faces with 97.35% accuracy. The results are an improvement of more than 27% over previous systems and rival human performance.<ref>{{cite journal|last1=Taigman|first1=Yaniv|last2=Yang|first2=Ming|last3=Ranzato|first3=Marc'Aurelio|last4=Wolf|first4=Lior|title=DeepFace: Closing the Gap to Human-Level Performance in Face Verification|journal=Conference on Computer Vision and Pattern Recognition|date=24 June 2014|url=https://research.facebook.com/publications/deepface-closing-the-gap-to-human-level-performance-in-face-verification/|accessdate=8 June 2016}}</ref>
|-
| 2014 || || Sibyl || Researchers from [[Google]] detail their work on Sibyl,<ref>{{cite web |last1=Canini|first1=Kevin|last2=Chandra|first2=Tushar|last3=Ie|first3=Eugene|last4=McFadden|first4=Jim|last5=Goldman|first5=Ken|last6=Gunter|first6=Mike|last7=Harmsen|first7=Jeremiah|last8=LeFevre|first8=Kristen|last9=Lepikhin|first9=Dmitry|last10=Llinares|first10=Tomas Lloret|last11=Mukherjee|first11=Indraneel|last12=Pereira|first12=Fernando|last13=Redstone|first13=Josh|last14=Shaked|first14=Tal|last15=Singer|first15=Yoram|title=Sibyl: A system for large scale supervised machine learning|url=https://users.soe.ucsc.edu/~niejiazhong/slides/chandra.pdf|website=Jack Baskin School of Engineering|publisher=UC Santa Cruz|accessdate=8 June 2016}}</ref> a proprietary platform for massively parallel machine learning used internally by Google to make predictions about user behavior and provide recommendations.<ref>{{cite news|last1=Woodie|first1=Alex|title=Inside Sibyl, Google's Massively Parallel Machine Learning Platform|url=http://www.datanami.com/2014/07/17/inside-sibyl-googles-massively-parallel-machine-learning-platform/|accessdate=8 June 2016|work=Datanami|publisher=Tabor Communications|date=17 July 2014}}</ref>
|-
| 2016 || Achievement || Beating Humans in Go || Google's [[AlphaGo]] program becomes the first [[Computer Go]] program to beat an unhandicapped professional human player<ref>{{cite web|title=Google achieves AI 'breakthrough' by beating Go champion|url=http://www.bbc.com/news/technology-35420579|website=BBC News|publisher=BBC|accessdate=5 June 2016|date=27 January 2016}}</ref> using a combination of machine learning and tree search techniques.<ref>{{cite web|title=AlphaGo|url=https://www.deepmind.com/alpha-go.html|website=Google DeepMind|publisher=Google Inc|accessdate=5 June 2016}}</ref> It is later improved as [[AlphaGo Zero]] and then, in 2017, generalized to chess and other two-player games as [[AlphaZero]].

Revision as of 17:35, 23 May 2018

This page is a timeline of machine learning. Major discoveries, achievements, milestones and other major events are included.

Overview

{| class="wikitable"
! Decade !! Summary
|-
| <1950s || Statistical methods are discovered and refined.
|-
| 1950s || Pioneering machine learning research is conducted using simple algorithms.
|-
| 1960s || Bayesian methods are introduced for probabilistic inference in machine learning.[1]
|-
| 1970s || 'AI Winter' caused by pessimism about machine learning effectiveness.
|-
| 1980s || Rediscovery of backpropagation causes a resurgence in machine learning research.
|-
| 1990s || Work on machine learning shifts from a knowledge-driven approach to a data-driven approach. Scientists begin creating programs for computers to analyze large amounts of data and draw conclusions – or "learn" – from the results.[2] Support vector machines (SVMs) and recurrent neural networks (RNNs) become popular.
|-
| 2000s || Kernel methods grow in popularity,[3] and competitive machine learning becomes more widespread.[4]
|-
| 2010s || Deep learning becomes feasible, which leads to machine learning becoming integral to many widely used software services and applications.
|}

Timeline

A simple neural network with two input units and one output unit


{| class="wikitable"
! Year !! Event type !! Caption !! Event
|-
| 1763 || Discovery || The Underpinnings of Bayes' Theorem || Thomas Bayes's work An Essay towards solving a Problem in the Doctrine of Chances is published two years after his death, having been amended and edited by a friend of Bayes, Richard Price.[5] The essay presents work which underpins Bayes' theorem.
|-
| 1805 || Discovery || Least Squares || Adrien-Marie Legendre describes the "méthode des moindres carrés", known in English as the least squares method.[6] The least squares method is used widely in data fitting.
|-
| 1812 || || Bayes' Theorem || Pierre-Simon Laplace publishes Théorie Analytique des Probabilités, in which he expands upon the work of Bayes and defines what is now known as Bayes' Theorem.[7]
|-
| 1913 || Discovery || Markov Chains || Andrey Markov first describes techniques he used to analyse a poem. The techniques later become known as Markov chains.[8]
|-
| 1950 || || Turing's Learning Machine || Alan Turing proposes a 'learning machine' that could learn and become artificially intelligent. Turing's specific proposal foreshadows genetic algorithms.[9]
|-
| 1951 || || First Neural Network Machine || Marvin Minsky and Dean Edmonds build the SNARC, the first neural network machine able to learn.[10]
|-
| 1952 || || Machines Playing Checkers || Arthur Samuel joins IBM's Poughkeepsie Laboratory and begins working on some of the very first machine learning programs, first creating programs that play checkers.[11]
|-
| 1957 || Discovery || Perceptron || Frank Rosenblatt invents the perceptron while working at the Cornell Aeronautical Laboratory.[12] The invention of the perceptron generates a great deal of excitement and is widely covered in the media.[13]
|-
| 1963 || Achievement || Machines Playing Tic-Tac-Toe || Donald Michie creates a 'machine' consisting of 304 match boxes and beads, which uses reinforcement learning to play Tic-tac-toe (also known as noughts and crosses).[14]
|-
| 1967 || || Nearest Neighbor || The nearest neighbor algorithm is created, marking the start of basic pattern recognition. The algorithm is used to map routes.[2]
|-
| 1969 || || Limitations of Neural Networks || Marvin Minsky and Seymour Papert publish their book Perceptrons, describing some of the limitations of perceptrons and neural networks. The interpretation that the book shows neural networks to be fundamentally limited is seen as hindering research into neural networks.[15][16]
|-
| 1970 || || Automatic Differentiation (Backpropagation) || Seppo Linnainmaa publishes the general method for automatic differentiation (AD) of discrete connected networks of nested differentiable functions.[17][18] This corresponds to the modern version of backpropagation, but is not yet named as such.[19][20][21][22]
|-
| 1972 || Discovery || Term frequency–inverse document frequency (TF-IDF) || Karen Spärck Jones publishes the concept of TF-IDF, a numerical statistic intended to reflect how important a word is to a document in a collection or corpus.[23] 83% of text-based recommender systems in the domain of digital libraries use TF-IDF.[24]
|-
| 1979 || || Stanford Cart || Students at Stanford University develop a cart that can navigate and avoid obstacles in a room.[2]
|-
| 1980 || Discovery || Neocognitron || Kunihiko Fukushima first publishes his work on the neocognitron, a type of artificial neural network (ANN).[25] The neocognitron later inspires convolutional neural networks (CNNs).[26]
|-
| 1981 || || Explanation Based Learning || Gerald DeJong introduces Explanation Based Learning, in which a computer algorithm analyses data and creates a general rule that it can follow, discarding unimportant data.[2]
|-
| 1982 || Discovery || Recurrent Neural Network || John Hopfield popularizes Hopfield networks, a type of recurrent neural network that can serve as content-addressable memory systems.[27]
|-
| 1985 || || NetTalk || Terry Sejnowski develops NetTalk, a program that learns to pronounce words the same way a baby does.[2]
|-
| 1986 || Discovery || Backpropagation || The process of backpropagation is described by David Rumelhart, Geoff Hinton and Ronald J. Williams.[28]
|-
| 1989 || Discovery || Reinforcement Learning || Christopher Watkins develops Q-learning, which greatly improves the practicality and feasibility of reinforcement learning.[29]
|-
| 1989 || Commercialization || Commercialization of Machine Learning on Personal Computers || Axcelis, Inc. releases Evolver, the first software package to commercialize the use of genetic algorithms on personal computers.[30]
|-
| 1992 || Achievement || Machines Playing Backgammon || Gerald Tesauro develops TD-Gammon, a computer backgammon program that uses an artificial neural network trained using temporal-difference learning (hence the 'TD' in the name). TD-Gammon is able to rival, but not consistently surpass, the abilities of top human backgammon players.[31]
|-
| 1995 || Discovery || Random Forest Algorithm || Tin Kam Ho publishes a paper describing random decision forests.[32]
|-
| 1995 || Discovery || Support Vector Machines || Corinna Cortes and Vladimir Vapnik publish their work on support vector machines.[33][34]
|-
| 1997 || Achievement || IBM Deep Blue Beats Kasparov || IBM's Deep Blue beats the world champion at chess.[2]
|-
| 1997 || Discovery || LSTM || Sepp Hochreiter and Jürgen Schmidhuber invent long short-term memory (LSTM) recurrent neural networks,[35] greatly improving the efficiency and practicality of recurrent neural networks.
|-
| 1998 || || MNIST database || A team led by Yann LeCun releases the MNIST database, a dataset comprising a mix of handwritten digits from American Census Bureau employees and American high school students.[36] The MNIST database has since become a benchmark for evaluating handwriting recognition.
|-
| 2002 || || Torch Machine Learning Library || Torch, a software library for machine learning, is first released.[37]
|-
| 2006 || || The Netflix Prize || The Netflix Prize competition is launched by Netflix. The aim of the competition was to use machine learning to beat the accuracy of Netflix's own recommendation software, in predicting a user's rating for a film given their ratings for previous films, by at least 10%.[38] The prize was won in 2009.
|-
| 2009 || Achievement || ImageNet || ImageNet is created. ImageNet is a large visual database envisioned by Fei-Fei Li from Stanford University, who realized that the best machine learning algorithms wouldn't work well if the data didn't reflect the real world.[39] For many, ImageNet was the catalyst for the AI boom[40] of the 21st century.
|-
| 2010 || || Kaggle Competition || Kaggle, a website that serves as a platform for machine learning competitions, is launched.[41]
|-
| 2011 || Achievement || Beating Humans in Jeopardy || Using a combination of machine learning, natural language processing and information retrieval techniques, IBM's Watson beats two human champions in a Jeopardy! competition.[42]
|-
| 2012 || Achievement || Recognizing Cats on YouTube || The Google Brain team, led by Andrew Ng and Jeff Dean, creates a neural network that learns to recognize cats by watching unlabeled images taken from frames of YouTube videos.[43][44]
|-
| 2014 || || Leap in Face Recognition || Facebook researchers publish their work on DeepFace, a system that uses neural networks to identify faces with 97.35% accuracy. The results are an improvement of more than 27% over previous systems and rival human performance.[45]
|-
| 2014 || || Sibyl || Researchers from Google detail their work on Sibyl,[46] a proprietary platform for massively parallel machine learning used internally by Google to make predictions about user behavior and provide recommendations.[47]
|-
| 2016 || Achievement || Beating Humans in Go || Google's AlphaGo program becomes the first Computer Go program to beat an unhandicapped professional human player[48] using a combination of machine learning and tree search techniques.[49] It is later improved as AlphaGo Zero and then, in 2017, generalized to chess and other two-player games as AlphaZero.
|}

See also

References

  1. ^ Solomonoff, Ray J. "A formal theory of inductive inference. Part II." Information and control 7.2 (1964): 224–254.
  2. ^ a b c d e f Marr, Bernard. "A Short History of Machine Learning – Every Manager Should Read". Forbes. Retrieved 28 Sep 2016.
  3. ^ Hofmann, Thomas; Schölkopf, Bernhard; Smola, Alexander J. (2008). "Kernel methods in machine learning". The Annals of Statistics. 36 (3): 1171–1220. JSTOR 25464664.
  4. ^ Bennett, James; Lanning, Stan (2007). "The netflix prize" (PDF). Proceedings of KDD Cup and Workshop 2007.
  5. ^ Bayes, Thomas (1 January 1763). "An Essay towards solving a Problem in the Doctrine of Chances" (PDF). Philosophical Transactions. 53: 370–418. doi:10.1098/rstl.1763.0053. JSTOR 105741. Retrieved 15 June 2016.
  6. ^ Legendre, Adrien-Marie (1805). Nouvelles méthodes pour la détermination des orbites des comètes (in French). Paris: Firmin Didot. p. viii. Retrieved 13 June 2016.
  7. ^ O'Connor, J J; Robertson, E F. "Pierre-Simon Laplace". School of Mathematics and Statistics, University of St Andrews, Scotland. Retrieved 15 June 2016.
  8. ^ Hayes, Brian. "First Links in the Markov Chain". American Scientist. 101 (March–April 2013). Sigma Xi, The Scientific Research Society: 92. doi:10.1511/2013.101.1. Retrieved 15 June 2016. Delving into the text of Alexander Pushkin's novel in verse Eugene Onegin, Markov spent hours sifting through patterns of vowels and consonants. On January 23, 1913, he summarized his findings in an address to the Imperial Academy of Sciences in St. Petersburg. His analysis did not alter the understanding or appreciation of Pushkin's poem, but the technique he developed—now known as a Markov chain—extended the theory of probability in a new direction.
  9. ^ Turing, Alan (October 1950). "Computing Machinery and Intelligence". Mind. 59 (236): 433–460. doi:10.1093/mind/LIX.236.433. Retrieved 8 June 2016.
  10. ^ Crevier 1993, pp. 34–35 and Russell & Norvig 2003, p. 17
  11. ^ McCarthy, John; Feigenbaum, Ed. "Arthur Samuel: Pioneer in Machine Learning". AI Magazine. No. 3. Association for the Advancement of Artificial Intelligence. p. 10. Retrieved 5 June 2016.
  12. ^ Rosenblatt, Frank (1958). "The perceptron: A probabilistic model for information storage and organization in the brain" (PDF). Psychological Review. 65 (6): 386–408. doi:10.1037/h0042519.
  13. ^ Mason, Harding; Stewart, D; Gill, Brendan (6 December 1958). "Rival". The New Yorker. Retrieved 5 June 2016.
  14. ^ Child, Oliver. "Menace: the Machine Educable Noughts And Crosses Engine". Chalkdust Magazine. Retrieved 16 Jan 2018.
  15. ^ Cohen, Harvey. "The Perceptron". Retrieved 5 June 2016.
  16. ^ Colner, Robert. "A brief history of machine learning". SlideShare. Retrieved 5 June 2016.
  17. ^ Seppo Linnainmaa (1970). "The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors." Master's Thesis (in Finnish), Univ. Helsinki, 6–7.
  18. ^ Linnainmaa, Seppo (1976). "Taylor expansion of the accumulated rounding error". BIT Numerical Mathematics. 16 (2): 146–160. doi:10.1007/BF01931367.
  19. ^ Griewank, Andreas (2012). "Who Invented the Reverse Mode of Differentiation?". Documenta Mathematica, Extra Volume ISMP: 389–400.
  20. ^ Griewank, Andreas and Walther, A. Principles and Techniques of Algorithmic Differentiation, Second Edition. SIAM, 2008.
  21. ^ Schmidhuber, Jürgen (2015). "Deep learning in neural networks: An overview". Neural Networks. 61: 85–117. arXiv:1404.7828.
  22. ^ Schmidhuber, Jürgen (2015). Deep Learning. Scholarpedia, 10(11):32832. Section on Backpropagation
  23. ^ Spärck Jones, K. (1973). "Index term weighting". Information Storage and Retrieval. 9 (11): 619–633. doi:10.1016/0020-0271(73)90043-0.
  24. ^ Beel, Joeran; Gipp, Bela; Langer, Stefan; Breitinger, Corinna (2016-11-01). "Research-paper recommender systems: a literature survey". International Journal on Digital Libraries. 17 (4): 305–338. doi:10.1007/s00799-015-0156-0. ISSN 1432-5012.
  25. ^ Fukushima, Kunihiko (1980). "Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position" (PDF). Biological Cybernetics. 36: 193–202. doi:10.1007/bf00344251. PMID 7370364. Retrieved 5 June 2016.
  26. ^ Le Cun, Yann. "Deep Learning". CiteSeerX 10.1.1.297.6176.
  27. ^ Hopfield, John (April 1982). "Neural networks and physical systems with emergent collective computational abilities" (PDF). Proceedings of the National Academy of Sciences of the United States of America. 79: 2554–2558. doi:10.1073/pnas.79.8.2554. PMC 346238. PMID 6953413. Retrieved 8 June 2016.
  28. ^ Rumelhart, David; Hinton, Geoffrey; Williams, Ronald (9 October 1986). "Learning representations by back-propagating errors" (PDF). Nature. 323: 533–536. doi:10.1038/323533a0. Retrieved 5 June 2016.
  29. ^ Watkins, Christopher (1 May 1989). "Learning from Delayed Rewards" (PDF).
  30. ^ Markoff, John (29 August 1990). "BUSINESS TECHNOLOGY; What's the Best Answer? It's Survival of the Fittest". New York Times. Retrieved 8 June 2016.
  31. ^ Tesauro, Gerald (March 1995). "Temporal Difference Learning and TD-Gammon". Communications of the ACM. 38 (3). doi:10.1145/203330.203343.
  32. ^ Ho, Tin Kam (August 1995). "Random Decision Forests" (PDF). Proceedings of the Third International Conference on Document Analysis and Recognition. 1. Montreal, Quebec: IEEE: 278–282. doi:10.1109/ICDAR.1995.598994. ISBN 0-8186-7128-9. Retrieved 5 June 2016.
  33. ^ Golge, Eren. "BRIEF HISTORY OF MACHINE LEARNING". A Blog From a Human-engineer-being. Retrieved 5 June 2016.
  34. ^ Cortes, Corinna; Vapnik, Vladimir (September 1995). "Support-vector networks" (PDF). Machine Learning. 20 (3). Kluwer Academic Publishers: 273–297. doi:10.1007/BF00994018. ISSN 0885-6125. Retrieved 5 June 2016.
  35. ^ Hochreiter, Sepp; Schmidhuber, Jürgen (1997). "LONG SHORT-TERM MEMORY" (PDF). Neural Computation. 9 (8): 1735–1780. doi:10.1162/neco.1997.9.8.1735. PMID 9377276.
  36. ^ LeCun, Yann; Cortes, Corinna; Burges, Christopher. "THE MNIST DATABASE of handwritten digits". Retrieved 16 June 2016.
  37. ^ Collobert, Ronan; Bengio, Samy; Mariethoz, Johnny (30 October 2002). "Torch: a modular machine learning software library" (PDF). Retrieved 5 June 2016.
  38. ^ "The Netflix Prize Rules". Netflix Prize. Netflix. Retrieved 16 June 2016.
  39. ^ Gershgorn, Dave. "ImageNet: the data that spawned the current AI boom — Quartz". qz.com. Retrieved 2018-03-30.
  40. ^ Hardy, Quentin (2016-07-18). "Reasons to Believe the A.I. Boom Is Real". The New York Times. ISSN 0362-4331. Retrieved 2018-03-30.
  41. ^ "About". Kaggle. Kaggle Inc. Retrieved 16 June 2016.
  42. ^ Markoff, John (17 February 2011). "Computer Wins on 'Jeopardy!': Trivial, It's Not". New York Times. p. A1. Retrieved 5 June 2016.
  43. ^ Le, Quoc; Ranzato, Marc'Aurelio; Monga, Rajat; Devin, Matthieu; Chen, Kai; Corrado, Greg; Dean, Jeff; Ng, Andrew (12 July 2012). "Building High-level Features Using Large Scale Unsupervised Learning". CoRR. arXiv:1112.6209.
  44. ^ Markoff, John (26 June 2012). "How Many Computers to Identify a Cat? 16,000". New York Times. p. B1. Retrieved 5 June 2016.
  45. ^ Taigman, Yaniv; Yang, Ming; Ranzato, Marc'Aurelio; Wolf, Lior (24 June 2014). "DeepFace: Closing the Gap to Human-Level Performance in Face Verification". Conference on Computer Vision and Pattern Recognition. Retrieved 8 June 2016.
  46. ^ Canini, Kevin; Chandra, Tushar; Ie, Eugene; McFadden, Jim; Goldman, Ken; Gunter, Mike; Harmsen, Jeremiah; LeFevre, Kristen; Lepikhin, Dmitry; Llinares, Tomas Lloret; Mukherjee, Indraneel; Pereira, Fernando; Redstone, Josh; Shaked, Tal; Singer, Yoram. "Sibyl: A system for large scale supervised machine learning" (PDF). Jack Baskin School of Engineering. UC Santa Cruz. Retrieved 8 June 2016.
  47. ^ Woodie, Alex (17 July 2014). "Inside Sibyl, Google's Massively Parallel Machine Learning Platform". Datanami. Tabor Communications. Retrieved 8 June 2016.
  48. ^ "Google achieves AI 'breakthrough' by beating Go champion". BBC News. BBC. 27 January 2016. Retrieved 5 June 2016.
  49. ^ "AlphaGo". Google DeepMind. Google Inc. Retrieved 5 June 2016.