Deep belief network

From Wikipedia, the free encyclopedia
Jump to: navigation, search
Schematic overview of a deep belief net. Arrows represent directed connections in the graphical model that the net represents.

In machine learning, a deep belief network (DBN) is a generative graphical model, or alternatively a type of deep neural network, composed of multiple layers of latent variables ("hidden units"), with connections between the layers but not between units within each layer.[1]

When trained on a set of examples in an unsupervised way, a DBN can learn to probabilistically reconstruct its inputs. The layers then act as feature detectors on inputs.[1] After this learning step, a DBN can be further trained in a supervised way to perform classification.[2]

DBNs can be viewed as a composition of simple, unsupervised networks such as restricted Boltzmann machines (RBMs)[1] or autoencoders,[3] where each sub-network's hidden layer serves as the visible layer for the next. This also leads to a fast, layer-by-layer unsupervised training procedure, where contrastive divergence is applied to each sub-network in turn, starting from the "lowest" pair of layers (the lowest visible layer being a training set).

The observation, due to Yee-Whye Teh,[2] that DBNs can be trained greedily, one layer at a time, led to one of the first effective deep learning algorithms.[4]:6

Training algorithm[edit]

The training algorithm for DBNs proceeds as follows.[2] Let X be a matrix of inputs, regarded as a set of feature vectors.

  1. Train a restricted Boltzmann machine on X to obtain its weight matrix, W. Use this as the weight matrix between the lower two layers of the network.
  2. Transform X by the RBM to produce new data X', either by sampling or by computing the mean activation of the hidden units.
  3. Repeat this procedure with XX' for the next pair of layers, until the top two layers of the network are reached.
  4. Fine-tune all the parameters of this deep architecture with respect to a proxy for the DBN log- likelihood, or with respect to a supervised training criterion (after adding extra learning machinery to convert the learned representation into supervised predictions, e.g. a linear classifier).

See also[edit]

References[edit]

  1. ^ a b c Hinton, G. (2009). "Deep belief networks". Scholarpedia. 4 (5): 5947. doi:10.4249/scholarpedia.5947. 
  2. ^ a b c Hinton, G. E.; Osindero, S.; Teh, Y. W. (2006). "A Fast Learning Algorithm for Deep Belief Nets" (PDF). Neural Computation. 18 (7): 1527–1554. PMID 16764513. doi:10.1162/neco.2006.18.7.1527. 
  3. ^ Yoshua Bengio; Pascal Lamblin; Dan Popovici; Hugh Larochelle (2007). Greedy Layer-Wise Training of Deep Networks (PDF). NIPS. 
  4. ^ Bengio, Y. (2009). "Learning Deep Architectures for AI" (PDF). Foundations and Trends in Machine Learning. 2. doi:10.1561/2200000006. 

External links[edit]