Meta learning (computer science)


Meta learning[1][2] is a subfield of machine learning in which automatic learning algorithms are applied to metadata about machine learning experiments. As of 2017 the term had not found a standard interpretation. The main goal, however, is to use such metadata to understand how automatic learning can become flexible in solving learning problems, and thereby to improve the performance of existing learning algorithms or to learn (induce) the learning algorithm itself; hence the alternative term learning to learn.[1]

Flexibility is important because each learning algorithm is based on a set of assumptions about the data, known as its inductive bias.[3] This means that it will only learn well if the bias matches the learning problem. A learning algorithm may perform very well in one domain, but not in the next. This poses strong restrictions on the use of machine learning or data mining techniques, since the relationship between the learning problem (often some kind of database) and the effectiveness of different learning algorithms is not yet understood.

By using different kinds of metadata, such as properties of the learning problem, algorithm properties (such as performance measures), or patterns previously derived from the data, it is possible to learn, select, alter or combine different learning algorithms to effectively solve a given learning problem. Critiques of meta learning approaches bear a strong resemblance to critiques of metaheuristics, a possibly related problem. A good analogy to meta learning, and the inspiration for Jürgen Schmidhuber's early work (1987)[1] and Yoshua Bengio et al.'s work (1991),[4] considers that genetic evolution learns the learning procedure encoded in genes and executed in each individual's brain. In an open-ended hierarchical meta learning system[1] using genetic programming, better evolutionary methods can be learned by meta evolution, which itself can be improved by meta meta evolution, etc.[1]


A proposed definition[5] for a meta learning system combines three requirements:

  • The system must include a learning subsystem.
  • Experience is gained by exploiting meta knowledge extracted
    • in a previous learning episode on a single dataset, or
    • from different domains.
  • Learning bias must be chosen dynamically.

Here, bias refers to the assumptions that influence the choice of explanatory hypotheses,[6] not to the notion of bias represented in the bias-variance dilemma. Meta learning is concerned with two aspects of learning bias.

  • Declarative bias specifies the representation of the space of hypotheses, and affects the size of the search space (e.g., represent hypotheses using linear functions only).
  • Procedural bias imposes constraints on the ordering of the inductive hypotheses (e.g., preferring smaller hypotheses).
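
The two aspects of bias can be illustrated with a toy learner. The following is only a minimal sketch, not taken from the cited sources: the candidate hypotheses, their "size" scores and the names LINEAR_SPACE, QUADRATIC_SPACE and learn are all assumptions made up for illustration. The choice of hypothesis space stands in for the declarative bias, and the smallest-first search order stands in for the procedural bias.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A hypothesis is (description, size, function). The "size" value is a
# hypothetical complexity score invented for this illustration.
Hypothesis = Tuple[str, int, Callable[[float], float]]

# Declarative bias: which hypotheses can be represented at all.
LINEAR_SPACE: List[Hypothesis] = [
    ("y = 2x + 1", 2, lambda x: 2 * x + 1),
    ("y = x", 1, lambda x: x),
    ("y = 2x", 1, lambda x: 2 * x),
]
QUADRATIC_SPACE: List[Hypothesis] = LINEAR_SPACE + [
    ("y = x^2 + x", 3, lambda x: x * x + x),
    ("y = x^2", 2, lambda x: x * x),
]


def is_consistent(h: Hypothesis, data: Sequence[Tuple[float, float]]) -> bool:
    """Return True if the hypothesis reproduces every (x, y) example."""
    return all(abs(h[2](x) - y) < 1e-9 for x, y in data)


def learn(space: List[Hypothesis],
          data: Sequence[Tuple[float, float]]) -> Optional[str]:
    """Return the first consistent hypothesis, trying smaller ones first."""
    # Procedural bias: the search order prefers smaller hypotheses.
    for h in sorted(space, key=lambda h: h[1]):
        if is_consistent(h, data):
            return h[0]
    return None  # nothing in the chosen space fits the data


if __name__ == "__main__":
    squares = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)]
    print(learn(LINEAR_SPACE, squares))     # bias does not match the problem
    print(learn(QUADRATIC_SPACE, squares))  # a richer space contains a fit
```

With the linear-only space the learner finds nothing consistent with the squared data, which is the sense in which an algorithm only learns well when its bias matches the problem. When several hypotheses fit, the ordering decides which one is returned.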


Some approaches which have been viewed as instances of meta learning:

  • Recurrent neural networks (RNNs) are universal computers. In 1993, Jürgen Schmidhuber showed how "self-referential" RNNs can in principle learn by backpropagation to run their own weight change algorithm, which may be quite different from backpropagation.[7] In 2001, Sepp Hochreiter, A. S. Younger and P. R. Conwell built a successful supervised meta learner based on long short-term memory (LSTM) RNNs. It learned through backpropagation a learning algorithm for quadratic functions that is much faster than backpropagation.[8][2] Researchers at DeepMind (Marcin Andrychowicz et al.) extended this approach to optimization in 2017.[9]
  • In the 1990s, Meta Reinforcement Learning or Meta RL was achieved in Schmidhuber's research group through self-modifying policies written in a universal programming language that contains special instructions for changing the policy itself. There is a single lifelong trial. The goal of the RL agent is to maximize reward. It learns to accelerate reward intake by continually improving its own learning algorithm which is part of the "self-referential" policy.[10][11]
  • An extreme type of Meta Reinforcement Learning is embodied by the Gödel machine, a theoretical construct that can inspect and modify any part of its own software, which also contains a general theorem prover. It can achieve recursive self-improvement in a provably optimal way.[12][2]
  • Model-Agnostic Meta-Learning (MAML) was introduced in 2017 by Chelsea Finn et al.[13] Given a sequence of tasks, the parameters of a given model are trained such that a few iterations of gradient descent with a small amount of training data from a new task will lead to good generalization performance on that task. MAML "trains the model to be easy to fine-tune."[13] MAML was successfully applied to few-shot image classification benchmarks and to policy gradient-based reinforcement learning.[13]
  • Discovering meta-knowledge works by inducing knowledge (e.g. rules) that expresses how each learning method will perform on different learning problems. The metadata is formed by characteristics of the data (general, statistical, information-theoretic, ...) in the learning problem, and characteristics of the learning algorithm (type, parameter settings, performance measures, ...). Another learning algorithm then learns how the data characteristics relate to the algorithm characteristics. Given a new learning problem, the data characteristics are measured, and the performance of different learning algorithms is predicted. Hence, one can predict the algorithms best suited for the new problem.
  • Stacked generalisation works by combining multiple (different) learning algorithms. The metadata is formed by the predictions of those different algorithms. Another learning algorithm learns from this metadata to predict which combinations of algorithms give generally good results. Given a new learning problem, the predictions of the selected set of algorithms are combined, for example by (weighted) voting, to provide the final prediction. Since each algorithm is deemed to work on a subset of problems, the combination is expected to be more flexible and able to make good predictions.
  • Boosting is related to stacked generalisation, but uses the same algorithm multiple times, with the examples in the training data receiving different weights on each run. This yields different predictions, each focused on correctly predicting a subset of the data, and combining those predictions leads to better (but more expensive) results.
  • Dynamic bias selection works by altering the inductive bias of a learning algorithm to match the given problem. This is done by altering key aspects of the learning algorithm, such as the hypothesis representation, heuristic formulae, or parameters. Many different approaches exist.
  • Inductive transfer studies how the learning process can be improved over time. Metadata consists of knowledge about previous learning episodes and is used to efficiently develop an effective hypothesis for a new task. A related approach is called learning to learn, in which the goal is to use acquired knowledge from one domain to help learning in other domains.
  • Other approaches using metadata to improve automatic learning are learning classifier systems, case-based reasoning and constraint satisfaction.
  • Initial theoretical work has explored using applied behavior analysis as a foundation for agent-mediated meta learning about the performance of human learners, and for adjusting the instructional course of an artificial agent.[14]
  • AutoML is a further example, such as Google Brain's "AI building AI" project, which according to Google briefly exceeded existing ImageNet benchmarks in 2017.[15][16]
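
The meta-knowledge discovery approach in the list above can be sketched as a small algorithm recommender. This is a hedged illustration only: the three meta-features, the META_DB records and the algorithm names are invented for the example and are not results from any experiment, and a 1-nearest-neighbour lookup is assumed as the "other learning algorithm" that relates data characteristics to algorithm performance.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

MetaFeatures = Tuple[float, float, float]


def meta_features(X: Sequence[Sequence[float]],
                  y: Sequence[int]) -> MetaFeatures:
    """Measure simple characteristics of a classification dataset.

    Returns (log10 of example count, log10 of feature count, class entropy
    in bits). The choice of these three characteristics is an assumption.
    """
    n_examples = len(X)
    n_features = len(X[0])
    class_counts = Counter(y)
    entropy = -sum((c / n_examples) * math.log2(c / n_examples)
                   for c in class_counts.values())
    return (math.log10(n_examples), math.log10(n_features), entropy)


# Hypothetical metadata from earlier learning problems: the measured data
# characteristics and the algorithm that performed best. Values are made up.
META_DB: List[Tuple[MetaFeatures, str]] = [
    ((2.0, 1.0, 1.0), "decision_tree"),
    ((4.0, 2.0, 1.0), "linear_model"),
    ((3.0, 0.5, 2.0), "k_nn"),
]


def recommend(query: MetaFeatures,
              meta_db: List[Tuple[MetaFeatures, str]] = META_DB) -> str:
    """Predict the best-suited algorithm via the nearest past problem."""
    nearest = min(meta_db, key=lambda record: math.dist(record[0], query))
    return nearest[1]


if __name__ == "__main__":
    # A new problem: 100 examples, 10 features, two balanced classes.
    X_new = [[0.0] * 10 for _ in range(100)]
    y_new = [0] * 50 + [1] * 50
    print(recommend(meta_features(X_new, y_new)))
```

In practice the meta-level learner could be any classifier or regressor; nearest-neighbour lookup is used here only because it keeps the relation between data characteristics and recommended algorithm easy to inspect.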
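
Stacked generalisation, as described in the list above, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the base models are fixed threshold rules standing in for already-trained learners, and the meta-level learner is a plain lookup table over their joint predictions. The names threshold_rule, fit_meta and stacked_predict are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

Model = Callable[[Sequence[float]], int]


def threshold_rule(feature: int, threshold: float) -> Model:
    """Stand-in for a trained base learner: a one-feature threshold."""
    return lambda x: int(x[feature] > threshold)


def fit_meta(base_models: List[Model],
             X_holdout: Sequence[Sequence[float]],
             y_holdout: Sequence[int]) -> Dict[Tuple[int, ...], int]:
    """Learn how to combine base predictions from held-out examples.

    The metadata is the tuple of base-model predictions; the meta learner
    stores the majority true label seen for each such tuple.
    """
    votes: Dict[Tuple[int, ...], Counter] = defaultdict(Counter)
    for x, y in zip(X_holdout, y_holdout):
        key = tuple(model(x) for model in base_models)
        votes[key][y] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


def stacked_predict(base_models: List[Model],
                    table: Dict[Tuple[int, ...], int],
                    x: Sequence[float],
                    default: int = 0) -> int:
    """Combine the base predictions for x using the learned table."""
    key = tuple(model(x) for model in base_models)
    return table.get(key, default)  # fall back if the pattern is unseen


if __name__ == "__main__":
    base = [threshold_rule(0, 0.5), threshold_rule(1, 0.5)]
    # Toy target: positive only when both features are large.
    X_hold = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    y_hold = [0, 0, 0, 1]
    table = fit_meta(base, X_hold, y_hold)
    print(stacked_predict(base, table, (0.9, 0.8)))
```

Neither threshold rule alone matches the toy target, but the meta level learns that the label is positive only when both base models agree on a positive prediction. A real stacking setup would train the base models on one split and fit the meta learner on predictions for a separate held-out split, as assumed here.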


  1. Schmidhuber, Jürgen (1987). "Evolutionary principles in self-referential learning, or on learning how to learn: the meta-meta-... hook" (PDF). Diploma Thesis, Tech. Univ. Munich.
  2. Schaul, Tom; Schmidhuber, Jürgen (2010). "Metalearning". Scholarpedia. 5 (6): 4650. doi:10.4249/scholarpedia.4650.
  3. P. E. Utgoff (1986). "Shift of bias for inductive concept learning". In R. Michalski, J. Carbonell, & T. Mitchell: Machine Learning: 163–190.
  4. Bengio, Yoshua; Bengio, Samy; Cloutier, Jocelyn (1991). Learning to learn a synaptic rule (PDF). IJCNN'91.
  5. Lemke, Christiane; Budka, Marcin; Gabrys, Bogdan (2013-07-20). "Metalearning: a survey of trends and technologies". Artificial Intelligence Review. 44 (1): 117–130. doi:10.1007/s10462-013-9406-y. ISSN 0269-2821. PMC 4459543. PMID 26069389.
  6. Brazdil, Pavel; Carrier, Christophe Giraud; Soares, Carlos; Vilalta, Ricardo (2009). Metalearning. Cognitive Technologies. Springer. doi:10.1007/978-3-540-73263-1. ISBN 978-3-540-73262-4.
  7. Schmidhuber, Jürgen (1993). "A self-referential weight matrix". Proceedings of ICANN'93, Amsterdam: 446–451.
  8. Hochreiter, Sepp; Younger, A. S.; Conwell, P. R. (2001). "Learning to Learn Using Gradient Descent". Proceedings of ICANN'01: 87–94.
  9. Andrychowicz, Marcin; Denil, Misha; Gomez, Sergio; Hoffmann, Matthew; Pfau, David; Schaul, Tom; Shillingford, Brendan; de Freitas, Nando (2017). "Learning to learn by gradient descent by gradient descent". Proceedings of ICML'17, Sydney, Australia.
  10. Schmidhuber, Jürgen (1994). "On learning how to learn learning strategies". Technical Report FKI-198-94, Tech. Univ. Munich.
  11. Schmidhuber, Jürgen; Zhao, J.; Wiering, M. (1997). "Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement". Machine Learning. 28: 105–130.
  12. Schmidhuber, Jürgen (2006). "Gödel machines: Fully Self-Referential Optimal Universal Self-Improvers". In B. Goertzel & C. Pennachin, eds.: Artificial General Intelligence: 199–226.
  13. Finn, Chelsea; Abbeel, Pieter; Levine, Sergey (2017). "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks". arXiv:1703.03400 [cs.LG].
  14. Begoli, Edmon (May 2014). Procedural-Reasoning Architecture for Applied Behavior Analysis-based Instructions. Knoxville, Tennessee, USA: University of Tennessee, Knoxville. pp. 44–79. Retrieved 14 October 2017.
  15. "Robots Are Now 'Creating New Robots,' Tech Reporter Says". 2018. Retrieved 29 March 2018.
  16. "AutoML for large scale image classification and object detection". Google Research Blog. November 2017. Retrieved 29 March 2018.

External links