Convolutional neural network: Difference between revisions

Revision as of 01:40, 27 December 2013

In computer science, a convolutional neural network is a type of feed-forward neural network in which the individual neurons are tiled so that they respond to overlapping regions of the visual field.^[1] Convolutional networks were inspired by biological processes^[2] and are variations of multilayer perceptrons that are designed to use minimal amounts of preprocessing.^[3] They are widely used for image recognition.^[4]

Architecture

When used for image recognition, convolutional neural networks consist of multiple layers of small neuron collections, each of which looks at a small portion of the input image. The outputs of these collections are then tiled so that they overlap, giving a better representation of the original image; this is repeated at every such layer. Because of this tiling, the networks tolerate translation of the input image.^[4] Most convolutional networks include local pooling or max pooling layers, which simplify and combine the outputs of neighboring neurons; in effect, if the outputs are viewed as an image, a pooling layer reduces its resolution.^[5] The networks are built from various combinations of convolutional layers and fully connected layers, with a pointwise nonlinearity applied at the end of or after each layer.^[6]
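The layer types described above can be illustrated with a minimal pure-Python sketch. It is not taken from any of the cited papers: the function names, the 2×2 edge-detecting kernel, the choice of a rectifier as the pointwise nonlinearity, and the 5×5 toy images are all assumptions made for illustration. The sketch applies one shared kernel to overlapping patches, then the nonlinearity, then max pooling, and shows that shifting the input by one pixel can leave the pooled output unchanged.

```python
def conv2d_valid(image, kernel):
    """Apply one shared kernel to every overlapping patch (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Pointwise nonlinearity (a rectifier is assumed here): negatives become zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block, lowering resolution."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def layer(image, kernel):
    """One convolution -> nonlinearity -> pooling stage."""
    return max_pool(relu(conv2d_valid(image, kernel)))


# Hypothetical kernel that responds to a dark-to-bright vertical edge.
EDGE_KERNEL = [[-1, 1],
               [-1, 1]]

# Two toy 5x5 images: the same vertical edge, shifted by one pixel.
edge_at_col_2 = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_at_col_1 = [[0, 1, 1, 1, 1] for _ in range(5)]

pooled_a = layer(edge_at_col_2, EDGE_KERNEL)
pooled_b = layer(edge_at_col_1, EDGE_KERNEL)
```

In this toy case the convolution output moves with the edge, while the 2×2 pooling step maps both positions to the same block, so `pooled_a` and `pooled_b` are equal. A larger shift would move the response into a different pooling block, so the tolerance is only local.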

Some time-delay neural networks also use an architecture very similar to that of convolutional neural networks, especially those designed for image recognition or classification tasks, since the "tiling" of the neuron outputs can easily be carried out in timed stages in a manner useful for the analysis of images.^[7]
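The similarity can be sketched in one dimension: a time-delay unit applies the same weights to every window of consecutive time steps, which is the temporal counterpart of tiling a kernel across an image. The example below is an illustration only; the function name, the weights, and the input signals are hypothetical and are not drawn from the cited work.

```python
def temporal_conv(signal, weights):
    """Apply one shared weight vector to every window of consecutive time steps."""
    n = len(weights)
    return [
        sum(signal[t + k] * weights[k] for k in range(n))
        for t in range(len(signal) - n + 1)
    ]


# Hypothetical weights and a short pulse, presented at two different times.
WEIGHTS = [1, 2, 1]
pulse_early = [0, 0, 1, 2, 1, 0, 0]
pulse_late = [0, 0, 0, 1, 2, 1, 0]

response_early = temporal_conv(pulse_early, WEIGHTS)  # [1, 4, 6, 4, 1]
response_late = temporal_conv(pulse_late, WEIGHTS)    # [0, 1, 4, 6, 4]
```

Delaying the input by one step delays the response by one step without changing its shape, because the weights are shared across time.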

History

Convolutional neural networks were introduced in a 1980 paper by Kunihiko Fukushima.^[6]^[8] Their design was later improved in 1998 by Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner,^[9] generalized in 2003 by Sven Behnke,^[10] and simplified by Patrice Simard, David Steinkraus, and John C. Platt in the same year.^[11] In 2011, Dan Ciresan et al. refined them and implemented them on a GPU, with impressive performance results.^[12] In 2012, Dan Ciresan et al. significantly improved upon the best performance in the literature for multiple image databases, including the MNIST database, the NORB database, the HWDB1.0 dataset (Chinese characters), and the CIFAR10 dataset (a subset of the 80 Million Tiny Images dataset).^[6]

Usage

Convolutional neural networks are often used in image recognition systems. When applied to hand tracking in a video stream and to gesture recognition, they performed almost perfectly.^[13] They have outperformed humans by a factor of two on a traffic sign recognition benchmark and have achieved an error rate of 0.23 percent on the MNIST database, the lowest reported on that database at the time.^[6] Another paper on using convolutional neural networks for image classification reported that the learning process was "surprisingly fast"; the same paper achieved the best published results at the time on the MNIST database and the NORB database.^[12]

They have also been reported to have lower error rates than both deep neural networks and regular neural networks on large-vocabulary speech recognition tasks.^[14]

When applied to facial recognition, they contributed to a large decrease in error rate.^[15] Another paper reported a 97.6 percent recognition rate on "5,600 still images of more than 10 subjects".^[2] Convolutional neural networks have been used to assess video quality in an objective way after being manually trained; the resulting system had a very low root mean square error.^[7]

References

[1] "Convolutional Neural Networks (LeNet) - DeepLearning 0.1 documentation". DeepLearning 0.1. LISA Lab. Retrieved 31 August 2013.
[2] Matsugu, Masakazu; Mori, Katsuhiko; Mitari, Yusuke; Kaneda, Yuji (2003). "Subject independent facial expression recognition with robust face detection using a convolutional neural network" (PDF). Neural Networks. 16 (5): 555–559. doi:10.1016/S0893-6080(03)00115-1. Retrieved 17 November 2013.
[3] LeCun, Yann. "LeNet-5, convolutional neural networks". Retrieved 16 November 2013.
[4] Korekado, Keisuke; Morie, Takashi; Nomura, Osamu; Ando, Hiroshi; Nakano, Teppei; Matsugu, Masakazu; Iwata, Atsushi (2003). "A Convolutional Neural Network VLSI for Image Recognition Using Merged/Mixed Analog-Digital Architecture". Knowledge-Based Intelligent Information and Engineering Systems: 169–176. Retrieved 16 November 2013.
[5] Krizhevsky, Alex. "ImageNet Classification with Deep Convolutional Neural Networks" (PDF). Retrieved 17 November 2013.
[6] Ciresan, Dan; Meier, Ueli; Schmidhuber, Jürgen (2012). "Multi-column deep neural networks for image classification". 2012 IEEE Conference on Computer Vision and Pattern Recognition. New York, NY: Institute of Electrical and Electronics Engineers (IEEE): 3642–3649. arXiv:1202.2745v1. doi:10.1109/CVPR.2012.6248110. ISBN 9781467312264. OCLC 812295155. Retrieved 9 December 2013.
[7] Le Callet, Patrick; Viard-Gaudin, Christian; Barba, Dominique (2006). "A Convolutional Neural Network Approach for Objective Video Quality Assessment" (PDF). IEEE Transactions on Neural Networks. 17 (5): 1316–1327. doi:10.1109/TNN.2006.879766. PMID 17001990. Retrieved 17 November 2013.
[8] Fukushima, Kunihiko (1980). "Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position" (PDF). Biological Cybernetics. 36 (4): 193–202. doi:10.1007/BF00344251. PMID 7370364. Retrieved 16 November 2013.
[9] LeCun, Yann; Bottou, Léon; Bengio, Yoshua; Haffner, Patrick (1998). "Gradient-based learning applied to document recognition" (PDF). Proceedings of the IEEE. 86 (11): 2278–2324. doi:10.1109/5.726791. Retrieved 16 November 2013.
[10] Behnke, Sven (2003). Hierarchical Neural Networks for Image Interpretation. Lecture Notes in Computer Science. Vol. 2766. Springer.
[11] Simard, Patrice; Steinkraus, David; Platt, John C. (2003). "Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis". ICDAR. 3: 958–962.
[12] Ciresan, Dan; Meier, Ueli; Masci, Jonathan; Gambardella, Luca M.; Schmidhuber, Jürgen (2011). "Flexible, High Performance Convolutional Neural Networks for Image Classification" (PDF). Proceedings of the Twenty-Second international joint conference on Artificial Intelligence-Volume Volume Two. 2: 1237–1242. Retrieved 17 November 2013.
[13] Nowlan, Steven J.; Platt, John C. (1995). "A convolutional neural network hand tracker". Advances in Neural Information Processing Systems: 901–908. Retrieved 16 November 2013.
[14] Sainath, Tara N.; Mohamed, Abdel-rahman; Kingsbury, Brian; Ramabhadran, Bhuvana (2013). "Deep convolutional neural networks for LVCSR" (PDF). International Conference on Acoustics, Speech, and Signal Processing: 8614–8618. Retrieved 31 August 2013.
[15] Lawrence, Steve; Giles, C. Lee; Tsoi, Ah Chung; Back, Andrew D. (1997). "Face Recognition: A Convolutional Neural Network Approach". Neural Networks, IEEE Transactions on. 8 (1): 98–113. doi:10.1109/72.554195. Retrieved 16 November 2013.

External links

http://yann.lecun.com/exdb/lenet/ - A demonstration of a convolutional network created for character recognition.

@@ Line 1: / Line 1: @@
-In computer science, a '''convolutional neural network''' is a type of [[Feedforward neural network|feed-forward]] [[neural network]] where the individual neurons are tiled in such a way that they respond to overlapping regions in the visual field.<ref name="deeplearning">{{cite web|title=Convolutional Neural Networks (LeNet) - DeepLearning 0.1 documentation|url=http://deeplearning.net/tutorial/lenet.html|work=DeepLearning 0.1|publisher=LISA Lab|accessdate=31 August 2013}}</ref> Convolutional networks were inspired by biological processes<ref name="robust face detection">{{cite journal|last=Matusugu|first=Masakazu|coauthors=Katsuhiko Mori, Yusuke Mitari, and Yuji Kaneda|title=Subject independent facial expression recognition with robust face detection using a convolutional neural network|journal=Neural Networks|year=2003|volume=16|issue=5|pages=555-559|url=http://www.iro.umontreal.ca/~pift6080/H09/documents/papers/sparse/matsugo_etal_face_expression_conv_nnet.pdf|accessdate=17 November 2013}}</ref> and are variations of [[multilayer perceptron|multilayer perceptrons]] which are designed to use minimal amounts of [[preprocessing]].<ref>{{cite web|last=LeCun|first=Yann|title=LeNet-5, convolutional neural networks|url=http://yann.lecun.com/exdb/lenet/|accessdate=16 November 2013}}</ref> They are widely used models for image-recognition.<ref name="vlsi">{{cite journal|last1=Korekado|first1=Keisuke|last2=Morie|first2=Takashi|last3=Nomura|first3=Osamu|last4=Ando|first4=Hiroshi|last5=Nakano|first5=Teppei|last6=Matsugu|first6=Masakazu|last7=Iwata|first7=Atsushi|title=A Convolutional Neural Network VLSI for Image Recognition Using Merged/Mixed Analog-Digital Architecture|journal=Knowledge-Based Intelligent Information and Engineering Systems|year=2003|pages=169-176|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.125.3812&rep=rep1&type=pdf|accessdate=16 November 2013}}</ref>
+In computer science, a '''convolutional neural network''' is a type of [[Feedforward neural network|feed-forward]] [[neural network]] where the individual neurons are tiled in such a way that they respond to overlapping regions in the visual field.<ref name="deeplearning">{{cite web|title=Convolutional Neural Networks (LeNet) - DeepLearning 0.1 documentation|url=http://deeplearning.net/tutorial/lenet.html|work=DeepLearning 0.1|publisher=LISA Lab|accessdate=31 August 2013}}</ref> Convolutional networks were inspired by biological processes<ref name="robust face detection">{{cite journal|last=Matusugu|first=Masakazu|coauthors=Katsuhiko Mori, Yusuke Mitari, and Yuji Kaneda|title=Subject independent facial expression recognition with robust face detection using a convolutional neural network|journal=Neural Networks|year=2003|volume=16|issue=5|pages=555–559|url=http://www.iro.umontreal.ca/~pift6080/H09/documents/papers/sparse/matsugo_etal_face_expression_conv_nnet.pdf|accessdate=17 November 2013|doi=10.1016/S0893-6080(03)00115-1}}</ref> and are variations of [[multilayer perceptron|multilayer perceptrons]] which are designed to use minimal amounts of [[preprocessing]].<ref>{{cite web|last=LeCun|first=Yann|title=LeNet-5, convolutional neural networks|url=http://yann.lecun.com/exdb/lenet/|accessdate=16 November 2013}}</ref> They are widely used models for image-recognition.<ref name="vlsi">{{cite journal|last1=Korekado|first1=Keisuke|last2=Morie|first2=Takashi|last3=Nomura|first3=Osamu|last4=Ando|first4=Hiroshi|last5=Nakano|first5=Teppei|last6=Matsugu|first6=Masakazu|last7=Iwata|first7=Atsushi|title=A Convolutional Neural Network VLSI for Image Recognition Using Merged/Mixed Analog-Digital Architecture|journal=Knowledge-Based Intelligent Information and Engineering Systems|year=2003|pages=169–176|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.125.3812&rep=rep1&type=pdf|accessdate=16 November 2013}}</ref>
 == Architecture ==
-When used for image recognition, convolutional neural networks consist of multiple layers of small neuron collections which look at small portions of the input image. The results of these collections are then tiled so that they overlap to obtain a better representation of the original image; this is repeated for every such layer. Because of this, they are able to tolerate [[Translation (geometry)|translation]] of the input image.<ref name="vlsi" /> Most convolutional networks include local pooling or max pooling layers, which simplify and combine the outputs of neighboring neurons; essentially, if the outputs are integrated into an image, pooling layers reduce its resolution.<ref>{{cite web|last=Krizhevsky|first=Alex|title=ImageNet Classification with Deep Convolutional Neural Networks|url=http://www.image-net.org/challenges/LSVRC/2012/supervision.pdf|accessdate=17 November 2013}}</ref> They also consist of various combinations of [[Convolution|convolutional]] layers and fully connected layers, with [[pointwise nonlinearity]] applied at the end of or after each layer.<ref name="mcdns">{{cite journal |last1=Ciresan |first1=Dan |first2=Ueli |last2=Meier |first3=Jürgen |last3=Schmidhuber |title=Multi-column deep neural networks for image classification |journal=2012 [[IEEE Conference on Computer Vision and Pattern Recognition]] |year=2012 |month=June |pages=3642-3649 |url=http://ieeexplore.ieee.org.ezproxy.hpu.edu/xpl/articleDetails.jsp?arnumber=6248110 |doi=10.1109/CVPR.2012.6248110 |arxiv=1202.2745v1 |accessdate=2013-12-09 |isbn=9781467312264 |oclc=812295155 |publisher=[[Institute of Electrical and Electronics Engineers]] (IEEE) |location=New York, NY}}</ref>
+When used for image recognition, convolutional neural networks consist of multiple layers of small neuron collections which look at small portions of the input image. The results of these collections are then tiled so that they overlap to obtain a better representation of the original image; this is repeated for every such layer. Because of this, they are able to tolerate [[Translation (geometry)|translation]] of the input image.<ref name="vlsi" /> Most convolutional networks include local pooling or max pooling layers, which simplify and combine the outputs of neighboring neurons; essentially, if the outputs are integrated into an image, pooling layers reduce its resolution.<ref>{{cite web|last=Krizhevsky|first=Alex|title=ImageNet Classification with Deep Convolutional Neural Networks|url=http://www.image-net.org/challenges/LSVRC/2012/supervision.pdf|accessdate=17 November 2013}}</ref> They also consist of various combinations of [[Convolution|convolutional]] layers and fully connected layers, with [[pointwise nonlinearity]] applied at the end of or after each layer.<ref name="mcdns">{{cite journal |last1=Ciresan |first1=Dan |first2=Ueli |last2=Meier |first3=Jürgen |last3=Schmidhuber |title=Multi-column deep neural networks for image classification |journal=2012 [[IEEE Conference on Computer Vision and Pattern Recognition]] |year=2012 |month=June |pages=3642–3649 |url=http://ieeexplore.ieee.org.ezproxy.hpu.edu/xpl/articleDetails.jsp?arnumber=6248110 |doi=10.1109/CVPR.2012.6248110 |arxiv=1202.2745v1 |accessdate=2013-12-09 |isbn=9781467312264 |oclc=812295155 |publisher=[[Institute of Electrical and Electronics Engineers]] (IEEE) |location=New York, NY}}</ref>
-Some time-delay neural networks also use a very similar architecture to convolutional neural networks, especially those for image recognition and/or classification tasks, since the "tiling" of the neuron outputs can easily be carried out in timed stages in a manner useful for analysis of images.<ref name="video quality">{{cite journal|last=Le Callet|first=Patrick|coauthors=Christian Viard-Gaudin, and Dominique Barba|title=A Convolutional Neural Network Approach for Objective Video Quality Assessment|journal=IEEE Transactions on Neural Networks|year=2006|volume=17|issue=5|pages=1316-1327|url=http://hal.univ-nantes.fr/docs/00/28/74/26/PDF/A_convolutional_neural_network_approach_for_objective_video_quality_assessment_completefinal_manuscript.pdf|accessdate=17 November 2013}}</ref>
+Some time-delay neural networks also use a very similar architecture to convolutional neural networks, especially those for image recognition and/or classification tasks, since the "tiling" of the neuron outputs can easily be carried out in timed stages in a manner useful for analysis of images.<ref name="video quality">{{cite journal|last=Le Callet|first=Patrick|coauthors=Christian Viard-Gaudin, and Dominique Barba|title=A Convolutional Neural Network Approach for Objective Video Quality Assessment|journal=IEEE Transactions on Neural Networks|year=2006|volume=17|issue=5|pages=1316–1327|url=http://hal.univ-nantes.fr/docs/00/28/74/26/PDF/A_convolutional_neural_network_approach_for_objective_video_quality_assessment_completefinal_manuscript.pdf|accessdate=17 November 2013|doi=10.1109/TNN.2006.879766|pmid=17001990}}</ref>
 == History ==
-Convolutional neural networks were introduced in a 1980 paper by Kunihiko Fukushima.<ref name=mcdns /><ref name="intro">{{cite journal|last=Fukushima|first=Kunihiko|title=Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position|journal=Biological Cybernetics|year=1980|volume=36|issue=4|pages=193-202|url=http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf|accessdate=16 November 2013}}</ref> Their design was later improved in 1998 by Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner,<ref>{{cite journal|last=LeCun|first=Yann|coauthors=Léon Bottou, Yoshua Bengio, and Patrick Haffner|title=Gradient-based learning applied to document recognition|journal=Proceedings of the IEEE|year=1998|volume=86|issue=11|pages=2278-2324|url=http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf|accessdate=16 November 2013}}</ref> generalized in 2003 by Sven Behnke,<ref>S. Behnke. Hierarchical Neural Networks for Image Interpretation, volume 2766 of Lecture Notes in Computer Science. Springer, 2003.</ref> and simplified by Patrice Simard, David Steinkraus, and John C. Platt in the same year.<ref>Simard, Patrice, David Steinkraus, and John C. Platt. "Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis." In ICDAR, vol. 3, pp. 958-962. 2003.</ref> In 2011, they were refined by Dan Ciresan et al. and were implemented on a GPU with impressive performance results.<ref name="flexible">{{cite journal|last=Ciresan|first=Dan|coauthors=Ueli Meier, Jonathan Masci, Luca M. Gambardella, Jurgen Schmidhuber|title=Flexible, High Performance Convolutional Neural Networks for Image Classiﬁcation|journal=Proceedings of the Twenty-Second international joint conference on Artificial Intelligence-Volume Volume Two|year=2011|volume=2|pages=1237-1242|url=http://www.idsia.ch/~juergen/ijcai2011.pdf|accessdate=17 November 2013}}</ref> In 2012, Dan Ciresan et al. significantly improved upon the best performance in the literature for multiple image databases, including the [[MNIST database]], the NORB database, the HWDB1.0 dataset (Chinese characters), and the CIFAR10 dataset (a subset of the 80 million tiny objects database).<ref name="mcdns" />
+Convolutional neural networks were introduced in a 1980 paper by Kunihiko Fukushima.<ref name=mcdns /><ref name="intro">{{cite journal|last=Fukushima|first=Kunihiko|title=Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position|journal=Biological Cybernetics|year=1980|volume=36|issue=4|pages=193–202|url=http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf|accessdate=16 November 2013|doi=10.1007/BF00344251|pmid=7370364}}</ref> Their design was later improved in 1998 by Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner,<ref>{{cite journal|last=LeCun|first=Yann|coauthors=Léon Bottou, Yoshua Bengio, and Patrick Haffner|title=Gradient-based learning applied to document recognition|journal=Proceedings of the IEEE|year=1998|volume=86|issue=11|pages=2278–2324|url=http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf|accessdate=16 November 2013|doi=10.1109/5.726791}}</ref> generalized in 2003 by Sven Behnke,<ref>S. Behnke. Hierarchical Neural Networks for Image Interpretation, volume 2766 of Lecture Notes in Computer Science. Springer, 2003.</ref> and simplified by Patrice Simard, David Steinkraus, and John C. Platt in the same year.<ref>Simard, Patrice, David Steinkraus, and John C. Platt. "Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis." In ICDAR, vol. 3, pp. 958-962. 2003.</ref> In 2011, they were refined by Dan Ciresan et al. and were implemented on a GPU with impressive performance results.<ref name="flexible">{{cite journal|last=Ciresan|first=Dan|coauthors=Ueli Meier, Jonathan Masci, Luca M. Gambardella, Jurgen Schmidhuber|title=Flexible, High Performance Convolutional Neural Networks for Image Classiﬁcation|journal=Proceedings of the Twenty-Second international joint conference on Artificial Intelligence-Volume Volume Two|year=2011|volume=2|pages=1237–1242|url=http://www.idsia.ch/~juergen/ijcai2011.pdf|accessdate=17 November 2013}}</ref> In 2012, Dan Ciresan et al. significantly improved upon the best performance in the literature for multiple image databases, including the [[MNIST database]], the NORB database, the HWDB1.0 dataset (Chinese characters), and the CIFAR10 dataset (a subset of the 80 million tiny objects database).<ref name="mcdns" />
 == Usage ==
-Convolutional neural networks are often used in image recognition systems. When applied to hand tracking in a video stream and gesture recognition, they had almost perfect performance.<ref>{{cite journal|last=Nowlan|first=Steven J.|coauthors=John C. Platt|title=A convolutional neural network hand tracker|journal=Advances in Neural Information Processing Systems|year=1995|pages=901-908|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.9668&rep=rep1&type=pdf|accessdate=16 November 2013}}</ref> They have achieved performance double that of humans on the problem of recognizing [[traffic sign]]s and an error rate of 0.23 percent on the MNIST database, which is the lowest ever achieved on the database to date.<ref name="mcdns" /> Another paper on using convolutional neural networks for image classification reported that the learning process of convolutional neural networks was "surprisingly fast"; in the same paper, the best published results at the time were achieved in the MNIST database and the NORB database.<ref name="flexible" />
+Convolutional neural networks are often used in image recognition systems. When applied to hand tracking in a video stream and gesture recognition, they had almost perfect performance.<ref>{{cite journal|last=Nowlan|first=Steven J.|coauthors=John C. Platt|title=A convolutional neural network hand tracker|journal=Advances in Neural Information Processing Systems|year=1995|pages=901–908|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.9668&rep=rep1&type=pdf|accessdate=16 November 2013}}</ref> They have achieved performance double that of humans on the problem of recognizing [[traffic sign]]s and an error rate of 0.23 percent on the MNIST database, which is the lowest ever achieved on the database to date.<ref name="mcdns" /> Another paper on using convolutional neural networks for image classification reported that the learning process of convolutional neural networks was "surprisingly fast"; in the same paper, the best published results at the time were achieved in the MNIST database and the NORB database.<ref name="flexible" />
 They have also been confirmed to have lower error rates than both deep neural networks and regular neural networks for large-vocabulary voice recognition tasks.<ref>{{cite journal|last=Sainath|first=Tara N.|coauthors=Abdel-rahman Mohamed, Brian Kingsbury, Bhuvana Ramabhadran|title=Deep convolutional neural networks for LVCSR|journal=International Conference on Acoustics, Speech, and Signal Processing|year=2013|pages=8614–8618|url=http://iipl.tk/paper/ICASSP2013/pdfs/0008614.pdf|accessdate=31 August 2013}}</ref>
-When applied to facial recognition, they were able to contribute to a large decrease in error rate.<ref>{{cite journal|last=Lawrence|first=Steve|coauthors=C. Lee Giles, Ah Chung Tsoi, and Andrew D. Back|title=Face Recognition: A Convolutional Neural Network Approach|journal=Neural Networks, IEEE Transactions on|year=1997|volume=8|issue=1|pages=98-113|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.5813&rep=rep1&type=pdf|accessdate=16 November 2013}}</ref> In another paper, they were able to achieve a 97.6 percent recognition rate on "5,600 still images of more than 10 subjects".<ref name="robust face detection" /> Convolutional neural networks have been used to assess video quality in an objective way after being manually trained; the resulting system had a very low root mean square error.<ref name="video quality"/>
+When applied to facial recognition, they were able to contribute to a large decrease in error rate.<ref>{{cite journal|last=Lawrence|first=Steve|coauthors=C. Lee Giles, Ah Chung Tsoi, and Andrew D. Back|title=Face Recognition: A Convolutional Neural Network Approach|journal=Neural Networks, IEEE Transactions on|year=1997|volume=8|issue=1|pages=98–113|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.5813&rep=rep1&type=pdf|accessdate=16 November 2013|doi=10.1109/72.554195}}</ref> In another paper, they were able to achieve a 97.6 percent recognition rate on "5,600 still images of more than 10 subjects".<ref name="robust face detection" /> Convolutional neural networks have been used to assess video quality in an objective way after being manually trained; the resulting system had a very low root mean square error.<ref name="video quality"/>
 == See also ==