Jürgen Schmidhuber


Jürgen Schmidhuber
[Photo: Jürgen Schmidhuber speaking at the AI for Good Global Summit in 2017]
Born: 17 January 1963[1]
Alma mater: Technische Universität München
Known for: Artificial intelligence, deep learning, artificial neural networks, recurrent neural networks, Gödel machine, artificial curiosity, meta-learning
Fields: Artificial intelligence
Institutions: Dalle Molle Institute for Artificial Intelligence Research

Jürgen Schmidhuber (born 17 January 1963)[1] is a computer scientist best known for his work on artificial intelligence, deep learning and artificial neural networks. He is a co-director of the Dalle Molle Institute for Artificial Intelligence Research in Manno, in the district of Lugano, in Ticino, southern Switzerland.[2] He is sometimes called the "father of (modern) AI".[3][4][5][6]

Schmidhuber did his undergraduate studies at the Technische Universität München in Munich, Germany.[1] He taught there from 2004 until 2009, when he became a professor of artificial intelligence at the Università della Svizzera italiana in Lugano, Switzerland.[7]


With his students Sepp Hochreiter, Felix Gers, Fred Cummins, Alex Graves, and others, Schmidhuber published increasingly sophisticated versions of a type of recurrent neural network called the long short-term memory (LSTM). The first results were reported in Hochreiter's diploma thesis (1991), which analyzed and overcame the vanishing gradient problem.[8] The name LSTM was introduced in a technical report (1995) that led to the most-cited LSTM publication (1997).[9]

The standard LSTM architecture, used in almost all current applications, was introduced in 2000.[10] Today's "vanilla LSTM", trained with backpropagation through time, was published in 2005,[11][12] and its connectionist temporal classification (CTC) training algorithm[13] followed in 2006. CTC enabled end-to-end speech recognition with LSTM. In 2015, Google used CTC-trained LSTM in a new implementation of speech recognition in its smartphone software.[2] Google also used LSTM for the smart assistant Allo[14] and for Google Translate.[15][16] Apple used LSTM for the "QuickType" function on the iPhone[17][18] and for Siri.[19] Amazon used LSTM for Amazon Alexa.[20] In 2017, Facebook performed some 4.5 billion automatic translations every day using LSTM networks.[21] Bloomberg Businessweek wrote: "These powers make LSTM arguably the most commercial AI achievement, used for everything from predicting diseases to composing music."[22]
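The "vanilla LSTM" mentioned above combines input, forget, and output gates around an additive cell state; the forget gate is the addition credited to the 2000 paper. The following is a minimal single-step sketch in NumPy, illustrative only (the weight shapes and names are this sketch's own conventions, not taken from any cited paper):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell: forget, input, and output gates
    plus a candidate update. W, U, b stack the four gates' parameters."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b      # pre-activations for all four gates
    f = sigmoid(z[0:n])             # forget gate (Gers et al., 2000)
    i = sigmoid(z[n:2 * n])         # input gate
    o = sigmoid(z[2 * n:3 * n])     # output gate
    g = np.tanh(z[3 * n:4 * n])     # candidate cell state
    c = f * c_prev + i * g          # additive, gated memory update
    h = o * np.tanh(c)              # hidden state passed onward
    return h, c

# Tiny example: 3-dim inputs, 2-dim hidden state, random weights.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 2
W = rng.standard_normal((4 * n_hid, n_in))
U = rng.standard_normal((4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.standard_normal((5, n_in)):    # run five time steps
    h, c = lstm_step(x, h, c, W, U, b)
```

The additive form of the cell update (`c = f * c_prev + i * g`) is what lets gradients flow across many time steps, which is how the architecture addresses the vanishing gradient problem analyzed in Hochreiter's thesis.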

In 2011, Schmidhuber's team at IDSIA, with his postdoc Dan Ciresan, also achieved dramatic speedups of convolutional neural networks (CNNs) on graphics processing units (GPUs). An earlier GPU-based CNN by Chellapilla et al. (2006) was four times faster than an equivalent CPU implementation.[23] The deep CNN of Dan Ciresan et al. (2011) at IDSIA was 60 times faster[24] and achieved the first superhuman performance in a computer vision contest in August 2011.[25] Between 15 May 2011 and 10 September 2012, their fast and deep CNNs won no fewer than four image competitions.[26][27] They also significantly improved on the best performance in the literature on multiple image databases.[28] The approach has become central to the field of computer vision.[27] It builds on CNN designs introduced much earlier by Yann LeCun et al. (1989),[29] who applied the backpropagation algorithm to a variant of Kunihiko Fukushima's original CNN architecture, the neocognitron,[30] later modified by J. Weng's max-pooling method.[31][27]
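The max-pooling operation mentioned above downsamples a feature map by keeping only the largest activation in each local window, giving a degree of translation invariance. A minimal NumPy sketch (illustrative only, not the cited GPU implementation):

```python
import numpy as np

def max_pool2d(feature_map, k=2):
    """Downsample a 2-D feature map by taking the maximum over
    non-overlapping k x k windows (edges not divisible by k are trimmed)."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % k, :w - w % k]
    return trimmed.reshape(h // k, k, w // k, k).max(axis=(1, 3))

# A 4x4 map pooled with 2x2 windows keeps one value per window.
fm = np.array([[1, 2, 0, 1],
               [3, 4, 1, 0],
               [0, 1, 5, 6],
               [2, 0, 7, 8]], dtype=float)
pooled = max_pool2d(fm)   # -> [[4., 1.], [2., 8.]]
```

In a CNN, such pooling layers alternate with convolutional layers, halving the spatial resolution at each stage while keeping the strongest feature responses.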

In 2014, Schmidhuber formed a company, Nnaisense, to work on commercial applications of artificial intelligence in fields such as finance, heavy industry and self-driving cars. Sepp Hochreiter, Jaan Tallinn, and Marcus Hutter are advisers to the company.[2] Sales were under US$11 million in 2016; however, Schmidhuber says the current emphasis is on research rather than revenue. Nnaisense raised its first round of capital funding in January 2017. Schmidhuber's overall goal is to create an all-purpose AI by training a single AI in sequence on a variety of narrow tasks; skeptics note that companies such as Arago GmbH and IBM have applied AI to many different projects for years without showing any signs of artificial general intelligence.[32]


According to The Guardian,[33] Schmidhuber complained in a "scathing 2015 article" that fellow deep learning researchers Geoffrey Hinton, Yann LeCun and Yoshua Bengio "heavily cite each other" but "fail to credit the pioneers of the field", allegedly understating the contributions of Schmidhuber and other early machine learning pioneers, including Alexey Grigorevich Ivakhnenko, who published the first deep learning networks as early as 1965. LeCun denies the charge, stating instead that Schmidhuber "keeps claiming credit he doesn't deserve".[2][33]


Schmidhuber received the Helmholtz Award of the International Neural Network Society in 2013,[34] and the Neural Networks Pioneer Award of the IEEE Computational Intelligence Society in 2016.[35] He is a member of the European Academy of Sciences and Arts.[36][7]


  1. CV
  2. John Markoff (27 November 2016). "When A.I. Matures, It May Call Jürgen Schmidhuber 'Dad'". The New York Times. Accessed April 2017.
  3. Wong, Andrew (16 May 2018). "The 'father of A.I.' urges humans not to fear the technology". CNBC. Retrieved 27 February 2019.
  4. Blunden, Mark (8 June 2018). "Humans will learn to confide in their robot friends, says AI expert. The father of modern AI believes robots could keep lonely people company". The Evening Standard. Retrieved 27 February 2019.
  5. Micklethwaite, Jamie (17 February 2018). "The day robots become smarter than humans will arrive on THIS DATE". Daily Star. Retrieved 27 February 2019.
  6. "The 'father of A.I.' urges humans not to fear the technology". South China Morning Post. 16 May 2018. Retrieved 27 February 2019.
  7. Dave O'Leary (3 October 2016). "The Present and Future of AI and Deep Learning Featuring Professor Jürgen Schmidhuber". IT World Canada. Accessed April 2017.
  8. Hochreiter, S. (1991). Untersuchungen zu dynamischen neuronalen Netzen [Studies on dynamic neural networks] (diploma thesis). Technical University Munich, Institute of Computer Science (advisor: Jürgen Schmidhuber).
  9. Sepp Hochreiter; Jürgen Schmidhuber (1997). "Long short-term memory". Neural Computation. 9 (8): 1735–1780. doi:10.1162/neco.1997.9.8.1735. PMID 9377276. S2CID 1915014.
  10. Felix A. Gers; Jürgen Schmidhuber; Fred Cummins (2000). "Learning to Forget: Continual Prediction with LSTM". Neural Computation. 12 (10): 2451–2471. doi:10.1162/089976600300015015. PMID 11032042. S2CID 11598600.
  11. Graves, A.; Schmidhuber, J. (2005). "Framewise phoneme classification with bidirectional LSTM and other neural network architectures". Neural Networks. 18 (5–6): 602–610. doi:10.1016/j.neunet.2005.06.042. PMID 16112549.
  12. Klaus Greff; Rupesh Kumar Srivastava; Jan Koutník; Bas R. Steunebrink; Jürgen Schmidhuber (2015). "LSTM: A Search Space Odyssey". IEEE Transactions on Neural Networks and Learning Systems. 28 (10): 2222–2232. arXiv:1503.04069. doi:10.1109/TNNLS.2016.2582924. PMID 27411231. S2CID 3356463.
  13. Graves, Alex; Fernández, Santiago; Gomez, Faustino (2006). "Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks". Proceedings of the International Conference on Machine Learning, ICML 2006: 369–376.
  14. Khaitan, Pranav (18 May 2016). "Chat Smarter with Allo". Research Blog. Retrieved 27 June 2017.
  15. Wu, Yonghui; Schuster, Mike; Chen, Zhifeng; Le, Quoc V.; Norouzi, Mohammad; Macherey, Wolfgang; Krikun, Maxim; Cao, Yuan; Gao, Qin (26 September 2016). "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation". arXiv:1609.08144 [cs.CL].
  16. Metz, Cade (27 September 2016). "An Infusion of AI Makes Google Translate More Powerful Than Ever". Wired. Retrieved 27 June 2017.
  17. Efrati, Amir (13 June 2016). "Apple's Machines Can Learn Too". The Information. Retrieved 27 June 2017.
  18. Ranger, Steve (14 June 2016). "iPhone, AI and big data: Here's how Apple plans to protect your privacy". ZDNet. Retrieved 27 June 2017.
  19. Smith, Chris (13 June 2016). "iOS 10: Siri now works in third-party apps, comes with extra AI features". BGR. Retrieved 27 June 2017.
  20. Vogels, Werner (30 November 2016). "Bringing the Magic of Amazon AI and Alexa to Apps on AWS". All Things Distributed. Retrieved 27 June 2017.
  21. Ong, Thuy (4 August 2017). "Facebook's translations are now powered completely by AI". Retrieved 15 February 2019.
  22. Vance, Ashlee (15 May 2018). Bloomberg Businessweek. Retrieved 16 January 2019. Quote: "These powers make LSTM arguably the most commercial AI achievement, used for everything from predicting diseases to composing music."
  23. Kumar Chellapilla; Sid Puri; Patrice Simard (2006). "High Performance Convolutional Neural Networks for Document Processing". In Lorette, Guy (ed.). Tenth International Workshop on Frontiers in Handwriting Recognition. Suvisoft.
  24. Ciresan, Dan; Ueli Meier; Jonathan Masci; Luca M. Gambardella; Jürgen Schmidhuber (2011). "Flexible, High Performance Convolutional Neural Networks for Image Classification". Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, Volume Two. 2: 1237–1242. Retrieved 17 November 2013.
  25. "IJCNN 2011 Competition result table". Official IJCNN 2011 Competition. 2010. Retrieved 14 January 2019.
  26. Schmidhuber, Jürgen (17 March 2017). "History of computer vision contests won by deep CNNs on GPU". Retrieved 14 January 2019.
  27. Schmidhuber, Jürgen (2015). "Deep Learning". Scholarpedia. 10 (11): 1527–54. doi:10.1162/neco.2006.18.7.1527. PMID 16764513. S2CID 2309950.
  28. Ciresan, Dan; Meier, Ueli; Schmidhuber, Jürgen (June 2012). "Multi-column deep neural networks for image classification". 2012 IEEE Conference on Computer Vision and Pattern Recognition. New York, NY: Institute of Electrical and Electronics Engineers (IEEE). pp. 3642–3649. arXiv:1202.2745. doi:10.1109/CVPR.2012.6248110. ISBN 978-1-4673-1226-4. OCLC 812295155. S2CID 2161592.
  29. LeCun, Y.; Boser, B.; Denker, J. S.; Henderson, D.; Howard, R. E.; Hubbard, W.; Jackel, L. D. "Backpropagation Applied to Handwritten Zip Code Recognition". AT&T Bell Laboratories.
  30. Fukushima, Kunihiko (1980). "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position". Biological Cybernetics. 36 (4): 193–202. doi:10.1007/bf00344251. PMID 7370364. S2CID 206775608.
  31. Weng, J.; Ahuja, N.; Huang, T. S. (1993). "Learning recognition and segmentation of 3-D objects from 2-D images". Proc. 4th International Conf. Computer Vision: 121–128.
  32. "AI Pioneer Wants to Build the Renaissance Machine of the Future". Bloomberg.com. 16 January 2017. Retrieved 23 February 2018.
  33. Oltermann, Philip (18 April 2017). "Jürgen Schmidhuber on the robot future: 'They will pay as much attention to us as we do to ants'". The Guardian. Retrieved 23 February 2018.
  34. "INNS Awards Recipients". International Neural Network Society. Accessed December 2016.
  35. "Recipients: Neural Networks Pioneer Award". Piscataway, NJ: IEEE Computational Intelligence Society. Accessed January 2019.
  36. "Members". European Academy of Sciences and Arts. Accessed December 2016.