Learning classifier system

From Wikipedia, the free encyclopedia
Revision as of 21:45, 18 October 2016

Visualization of LCS rules learning to approximate a 3D function. Each blue ellipse represents an individual rule covering part of the solution space. (Adapted from images taken from XCSF[1] with permission from Martin Butz)

Learning classifier systems, or LCS, are a family of rule-based machine learning methods that combine a discovery component (typically a genetic algorithm) with a learning component (performing either supervised learning, reinforcement learning, or unsupervised learning).[2] Learning classifier systems seek to identify a set of context-dependent rules that collectively store and apply knowledge in a piecewise manner in order to make predictions (e.g. behavior modeling,[3] classification,[4][5] data mining,[5][6][7] regression,[8] function approximation,[9] or game strategy). This approach allows complex solution spaces to be broken up into smaller, simpler parts.
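The piecewise, rule-based idea can be sketched in a few lines (an illustrative toy example, not taken from any particular LCS implementation; the ternary '#' "don't care" notation follows common LCS convention):

```python
# A single LCS rule ("classifier") pairs a condition with an action. The '#'
# wildcard lets one rule generalize over many inputs, so the population as a
# whole covers the problem space piecewise.

def matches(condition: str, state: str) -> bool:
    """True if every non-wildcard bit of the condition equals the state bit."""
    return all(c == '#' or c == s for c, s in zip(condition, state))

# A tiny rule population: (condition, action) pairs over 4-bit binary states.
population = [
    ("1##0", 1),   # covers states 1000, 1010, 1100, 1110
    ("00##", 0),
    ("####", 1),   # maximally general: matches every state
]

# The "match set" [M] for an input state holds every rule whose condition matches.
state = "1010"
match_set = [(cond, act) for cond, act in population if matches(cond, state)]
print(match_set)  # [('1##0', 1), ('####', 1)]
```

Generalization comes from the wildcards: the fewer specified bits a condition has, the larger the part of the solution space that single rule covers.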

The founding concepts behind learning classifier systems came from attempts to model complex adaptive systems, using rule-based agents to form an artificial cognitive system (i.e. artificial intelligence).

History

Early Years

John Henry Holland was best known for his work popularizing genetic algorithms (GA), through his ground-breaking book "Adaptation in Natural and Artificial Systems"[10] in 1975 and his formalization of Holland's schema theorem. In 1976, Holland conceptualized an extension of the GA concept to what he called a "cognitive system"[11], and provided the first detailed description of what would become known as the first learning classifier system in the paper "Cognitive Systems based on Adaptive Algorithms"[12]. This first system, named Cognitive System One (CS-1), was conceived as a modeling tool, designed to model a real system (i.e. environment) with unknown underlying dynamics using a population of human-readable rules. The goal was for a set of rules to perform online machine learning to adapt to the environment based on infrequent payoff/reward (i.e. reinforcement learning) and to apply these rules to generate behavior that matched the real system. This early, ambitious implementation was later regarded as overly complex, yielding inconsistent results[2][13].

Beginning in 1980, Kenneth De Jong and his student Stephen Smith took a different approach to rule-based machine learning with LS-1, where learning was viewed as an offline optimization process rather than an online adaptation process[14][15][16]. This new approach was more similar to a standard genetic algorithm but evolved independent sets of rules. Since that time, LCS methods inspired by the online learning framework introduced by Holland at the University of Michigan have been referred to as Michigan-style LCS, and those inspired by Smith and De Jong at the University of Pittsburgh have been referred to as Pittsburgh-style LCS[2][13]. In 1986, Holland developed what would be considered the standard Michigan-style LCS for the next decade[17].

Other important concepts that emerged in the early days of LCS research included (1) the formalization of a bucket brigade algorithm (BBA) for credit assignment/learning[18], (2) selection of parent rules from a common 'environmental niche' (i.e. the match set [M]) rather than from the whole population [P][19], (3) covering, first introduced as a create operator[20], (4) the formalization of an action set [A][20], (5) a simplified algorithm architecture[20], (6) strength-based fitness[17], (7) consideration of single-step, or supervised learning problems[21] and the introduction of the correct set [C][22], (8) accuracy-based fitness[23], (9) the combination of fuzzy logic with LCS[24] (which later spawned a lineage of fuzzy LCS algorithms), (10) encouraging long action chains and default hierarchies for improving performance on multi-step problems[25][26][27], (11) examining latent learning (which later inspired a new branch of anticipatory classifier systems (ACS)), and (12) the introduction of the first Q-learning-like credit assignment technique[28]. While not all of these concepts are applied in modern LCS algorithms, each was a landmark in the development of the LCS paradigm.

The Revolution

Interest in learning classifier systems was reinvigorated in the mid-1990s largely due to two events: the development of the Q-Learning algorithm[29] for reinforcement learning, and the introduction of significantly simplified Michigan-style LCS architectures by Stewart Wilson[30][31]. Wilson's Zeroth-level Classifier System (ZCS)[30] focused on increasing algorithmic understandability based on Holland's standard LCS implementation[17]. This was done, in part, by removing rule-bidding and the internal message list (both essential to the original BBA credit assignment) and replacing them with a hybrid BBA/Q-Learning strategy. ZCS demonstrated that a much simpler LCS architecture could perform as well as the original, more complex implementations. However, ZCS still suffered from performance drawbacks, including the proliferation of over-general classifiers.
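The flavor of a hybrid BBA/Q-Learning credit-assignment step can be sketched as follows (a highly simplified, illustrative sketch with assumed parameter values, not Wilson's exact ZCS update, which also involves taxes and bid deductions omitted here):

```python
# Simplified ZCS-style credit assignment: the rules that acted on the previous
# step share a discounted "bucket" of payoff, combining immediate reward with
# a fraction of the current action set's total strength (the Q-learning-like part).

BETA = 0.2    # learning rate (assumed value)
GAMMA = 0.71  # discount factor (assumed value)

def update_strengths(prev_action_set, reward, curr_action_set):
    """Distribute reward plus discounted downstream strength among prior actors."""
    bucket = reward + GAMMA * sum(rule["strength"] for rule in curr_action_set)
    share = BETA * bucket / len(prev_action_set)
    for rule in prev_action_set:
        # Widrow-Hoff style: move each strength toward its share of the payoff.
        rule["strength"] += share - BETA * rule["strength"]

prev = [{"strength": 10.0}, {"strength": 20.0}]   # rules that acted last step
curr = [{"strength": 5.0}]                        # rules acting now
update_strengths(prev, reward=100.0, curr_action_set=curr)
```

Because payoff is shared backward through successive action sets, long chains of cooperating rules can be reinforced without the explicit bidding and message passing of the original BBA.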

In 1995, Wilson published his landmark paper, "Classifier Fitness Based on Accuracy", wherein he introduced the eXtended Classifier System (XCS). XCS took the simplified architecture of ZCS and added an accuracy-based fitness, a niche GA (acting in the action set [A]), an explicit generalization mechanism called subsumption, and an adaptation of the Q-Learning credit assignment. XCS was popularized by its ability to reach optimal performance while evolving accurate and maximally general classifiers, as well as by its impressive problem flexibility (able to perform both reinforcement learning and supervised learning). XCS later became the best known and most studied LCS algorithm and defined a new family of accuracy-based LCS; ZCS alternatively became synonymous with strength-based LCS. XCS is also important because it successfully bridged the gap between LCS and the field of reinforcement learning. Following the success of XCS, LCS were later described as reinforcement learning systems endowed with a generalization capability[32]. Reinforcement learning typically seeks to learn a value function that maps out a complete representation of the state/action space. Similarly, the design of XCS drives it to form an all-inclusive and accurate representation of the problem space (i.e. a complete map) rather than focusing on high-payoff niches in the environment (as was the case with strength-based LCS). Conceptually, a complete map captures not only what an agent should do (what is correct), but also what it should not do (what is incorrect). In contrast, most strength-based LCSs, and exclusively supervised-learning LCSs, seek a rule set of efficient generalizations in the form of a best action map (or a partial map). Comparisons between strength- vs. accuracy-based fitness and complete vs. best action maps have since been examined in greater detail[33][34].
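The core of accuracy-based fitness can be illustrated with the accuracy function commonly given in descriptions of XCS (the parameter values below are typical defaults, not canonical): a rule whose running prediction error stays below a tolerance counts as fully accurate, and accuracy falls off sharply beyond it.

```python
# Accuracy from prediction error, in the form commonly published for XCS.
# Rules that predict their payoff CONSISTENTLY (even a consistently low payoff)
# earn high accuracy, which is why XCS evolves a complete map.

ALPHA = 0.1      # accuracy fall-off coefficient (typical default)
EPSILON_0 = 10   # error tolerance below which a rule is fully accurate
NU = 5           # steepness of the accuracy drop beyond the tolerance

def accuracy(prediction_error: float) -> float:
    if prediction_error < EPSILON_0:
        return 1.0
    return ALPHA * (prediction_error / EPSILON_0) ** -NU

print(accuracy(5.0))   # 1.0  (within tolerance: fully accurate)
print(accuracy(20.0))  # 0.1 * 2**-5 = 0.003125
```

In full XCS, a rule's fitness is then updated toward its accuracy relative to the summed accuracy of its action set, so consistently predictive rules come to dominate reproduction.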

Techniques

Learning classifier systems can be split into two types depending upon where the genetic algorithm acts. A Pittsburgh-type LCS has a population of separate rule sets, where the genetic algorithm recombines and reproduces the best of these rule sets. In a Michigan-style LCS there is only a single set of rules in a population and the algorithm's action focuses on selecting the best classifiers within that set. Michigan-style LCSs have two main types of fitness definitions: strength-based (e.g. ZCS) and accuracy-based (e.g. XCS). The term "learning classifier system" most often refers to Michigan-style LCSs.
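The architectural split can be made concrete with a minimal, hypothetical data layout (illustrative only, not drawn from a specific system):

```python
# Pittsburgh-style: the GA population is a set of complete, competing RULE SETS;
# fitness is assigned to a whole rule set, and the GA recombines the best sets.
pittsburgh_population = [
    [("01#", 1), ("1##", 0)],               # individual 1: an entire rule set
    [("0##", 1), ("11#", 0), ("###", 1)],   # individual 2: another rule set
]

# Michigan-style: ONE population of individual rules that collectively form the
# solution; each rule carries its own fitness, and the GA acts on single rules.
michigan_population = [
    {"rule": ("01#", 1), "fitness": 0.8},
    {"rule": ("1##", 0), "fitness": 0.3},
]

# In Pittsburgh-style, "the solution" is the single best individual (rule set);
# in Michigan-style, "the solution" is the whole evolved population.
best_pittsburgh_solution = pittsburgh_population[0]
michigan_solution = [entry["rule"] for entry in michigan_population]
```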

Initially the classifiers or rules were binary, but recent research has expanded this representation to include real-valued, neural network, and functional (S-expression) conditions.[citation needed]

Learning classifier systems are not yet fully understood, and this remains an area of active research.[citation needed] Despite this, they have been successfully applied in many problem domains.

Terminology

The name 'learning classifier system (LCS)' is somewhat misleading, since many machine learning algorithms 'learn to classify' (e.g. decision trees, artificial neural networks) but are not LCSs. The term 'rule-based machine learning (RBML)' is useful, as it more clearly captures the essential 'rule-based' component of these systems, but it also generalizes to methods that are not considered to be LCSs (e.g. association rule learning, or artificial immune systems). More general terms such as 'genetics-based machine learning', and even 'genetic algorithm'[35], have also been applied to refer to what would be more characteristically defined as a learning classifier system. Due to their similarity to genetic algorithms, Pittsburgh-style learning classifier systems are sometimes generically referred to as 'genetic algorithms'. Beyond this, some LCS algorithms, or closely related methods, have been referred to as 'cognitive systems'[12], 'adaptive agents', 'production systems', or generically as a 'classifier system'[36][37]. This variation in terminology contributes to some confusion in the field.

Until the 2000s, nearly all learning classifier system methods were developed with reinforcement learning problems in mind. As a result, the term 'learning classifier system' was commonly defined as the combination of 'trial-and-error' reinforcement learning with the global search of a genetic algorithm. Interest in supervised learning applications, and even unsupervised learning, has since broadened the use and definition of this term.

References

  1. ^ Stalph, Patrick O.; Butz, Martin V. (2010-02-01). "JavaXCSF: The XCSF Learning Classifier System in Java". SIGEVOlution. 4 (3): 16–19. doi:10.1145/1731888.1731890. ISSN 1931-8499.
  2. ^ a b c Urbanowicz, Ryan J.; Moore, Jason H. (2009-09-22). "Learning Classifier Systems: A Complete Introduction, Review, and Roadmap". Journal of Artificial Evolution and Applications. 2009: 1–25. doi:10.1155/2009/736398. ISSN 1687-6229.
  3. ^ Dorigo, Marco. "Alecsys and the AutonoMouse: Learning to control a real robot by distributed classifier systems". Machine Learning. 19 (3): 209–240. doi:10.1007/BF00996270. ISSN 0885-6125.
  4. ^ Bernadó-Mansilla, Ester; Garrell-Guiu, Josep M. (2003-09-01). "Accuracy-Based Learning Classifier Systems: Models, Analysis and Applications to Classification Tasks". Evolutionary Computation. 11 (3): 209–238. doi:10.1162/106365603322365289. ISSN 1063-6560.
  5. ^ a b Urbanowicz, Ryan J.; Moore, Jason H. (2015-04-03). "ExSTraCS 2.0: description and evaluation of a scalable learning classifier system". Evolutionary Intelligence. 8 (2–3): 89–116. doi:10.1007/s12065-015-0128-8. ISSN 1864-5909. PMC 4583133. PMID 26417393.
  6. ^ Bernadó, Ester; Llorà, Xavier; Garrell, Josep M. (2001-07-07). Lanzi, Pier Luca; Stolzmann, Wolfgang; Wilson, Stewart W. (eds.). Advances in Learning Classifier Systems. Lecture Notes in Computer Science. Springer Berlin Heidelberg. pp. 115–132. doi:10.1007/3-540-48104-4_8. ISBN 9783540437932.
  7. ^ Bacardit, Jaume; Butz, Martin V. (2007-01-01). Kovacs, Tim; Llorà, Xavier; Takadama, Keiki; Lanzi, Pier Luca; Stolzmann, Wolfgang; Wilson, Stewart W. (eds.). Learning Classifier Systems. Lecture Notes in Computer Science. Springer Berlin Heidelberg. pp. 282–290. doi:10.1007/978-3-540-71231-2_19. ISBN 9783540712305.
  8. ^ Urbanowicz, Ryan; Ramanand, Niranjan; Moore, Jason (2015-01-01). "Continuous Endpoint Data Mining with ExSTraCS: A Supervised Learning Classifier System". Proceedings of the Companion Publication of the 2015 Annual Conference on Genetic and Evolutionary Computation. GECCO Companion '15. New York, NY, USA: ACM: 1029–1036. doi:10.1145/2739482.2768453. ISBN 9781450334884.
  9. ^ Butz, M. V.; Lanzi, P. L.; Wilson, S. W. (2008-06-01). "Function Approximation With XCS: Hyperellipsoidal Conditions, Recursive Least Squares, and Compaction". IEEE Transactions on Evolutionary Computation. 12 (3): 355–376. doi:10.1109/TEVC.2007.903551. ISSN 1089-778X.
  10. ^ Holland, John (1975). Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. Michigan Press.
  11. ^ Holland JH (1976) Adaptation. In: Rosen R, Snell F (eds) Progress in theoretical biology, vol 4. Academic Press, New York, pp 263–293
  12. ^ a b Holland JH, Reitman JS (1978) Cognitive systems based on adaptive algorithms Reprinted in: Evolutionary computation. The fossil record. In: David BF (ed) IEEE Press, New York 1998. ISBN:0-7803-3481-7
  13. ^ a b Lanzi, Pier Luca (2008-02-08). "Learning classifier systems: then and now". Evolutionary Intelligence. 1 (1): 63–82. doi:10.1007/s12065-007-0003-3. ISSN 1864-5909.
  14. ^ Smith S (1980) A learning system based on genetic adaptive algorithms. Ph.D. thesis, Department of Computer Science, University of Pittsburgh
  15. ^ Smith S (1983) Flexible learning of problem solving heuristics through adaptive search. In: Eighth international joint conference on articial intelligence. Morgan Kaufmann, Los Altos, pp 421–425
  16. ^ De Jong KA (1988) Learning with genetic algorithms: an overview. Mach Learn 3:121–138
  17. ^ a b c Holland, John H. "Escaping brittleness: the possibilities of general purpose learning algorithms applied to parallel rule-based systems." Machine Learning (1986): 593-623.
  18. ^ Holland, John H. (1985-01-01). "Properties of the Bucket Brigade". Proceedings of the 1st International Conference on Genetic Algorithms. Hillsdale, NJ, USA: L. Erlbaum Associates Inc.: 1–7. ISBN 0805804269.
  19. ^ Booker, L (1982-01-01). Intelligent Behavior as an Adaptation to the Task Environment (Thesis). University of Michigan.
  20. ^ a b c Wilson, S. W. "Knowledge growth in an artificial animal. Proceedings of the First International Conference on Genetic Algorithms and their Applications." (1985).
  21. ^ Wilson, Stewart W. "Classifier systems and the animat problem". Machine Learning. 2 (3): 199–228. doi:10.1007/BF00058679. ISSN 0885-6125.
  22. ^ Bonelli, Pierre; Parodi, Alexandre; Sen, Sandip; Wilson, Stewart (1990-01-01). "NEWBOOLE: A Fast GBML System". Proceedings of the Seventh International Conference (1990) on Machine Learning. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.: 153–159. ISBN 1558601414.
  23. ^ Frey, Peter W.; Slate, David J. "Letter recognition using Holland-style adaptive classifiers". Machine Learning. 6 (2): 161–182. doi:10.1007/BF00114162. ISSN 0885-6125.
  24. ^ Valenzuela-Rendón, Manuel. "The Fuzzy Classifier System: A Classifier System for Continuously Varying Variables." In ICGA, pp. 346-353. 1991.
  25. ^ Riolo, Rick L. (1988-01-01). Empirical Studies of Default Hierarchies and Sequences of Rules in Learning Classifier Systems (Thesis). Ann Arbor, MI, USA: University of Michigan.
  26. ^ Riolo, Rick L. (1987-01-01). "Bucket brigade performance. I. Long sequences of classifiers". Genetic algorithms and their applications: proceedings of the second International Conference on Genetic Algorithms: July 28-31, 1987 at the Massachusetts Institute of Technology, Cambridge, MA.
  27. ^ Riolo, Rick L. (1987-01-01). "Bucket brigade performance. II. Default hierarchies". Genetic algorithms and their applications: proceedings of the second International Conference on Genetic Algorithms: July 28-31, 1987 at the Massachusetts Institute of Technology, Cambridge, MA.
  28. ^ Riolo, Rick L. (1990-01-01). "Lookahead Planning and Latent Learning in a Classifier System". Proceedings of the First International Conference on Simulation of Adaptive Behavior on From Animals to Animats. Cambridge, MA, USA: MIT Press: 316–326. ISBN 0262631385.
  29. ^ Watkins, Christopher John Cornish Hellaby. "Learning from delayed rewards." PhD diss., University of Cambridge, 1989.
  30. ^ a b Wilson, Stewart W. (1994-03-01). "ZCS: A Zeroth Level Classifier System". Evolutionary Computation. 2 (1): 1–18. doi:10.1162/evco.1994.2.1.1. ISSN 1063-6560.
  31. ^ Wilson, Stewart W. (1995-06-01). "Classifier Fitness Based on Accuracy". Evol. Comput. 3 (2): 149–175. doi:10.1162/evco.1995.3.2.149. ISSN 1063-6560.
  32. ^ Lanzi, P. L. "Learning classifier systems from a reinforcement learning perspective". Soft Computing. 6 (3–4): 162–170. doi:10.1007/s005000100113. ISSN 1432-7643.
  33. ^ Kovacs, Timothy Michael Douglas. A Comparison of Strength and Accuracy-based Fitness in Learning and Classifier Systems. 2002.
  34. ^ Kovacs, Tim. "Two views of classifier systems." In International Workshop on Learning Classifier Systems, pp. 74-87. Springer Berlin Heidelberg, 2001
  35. ^ Congdon, Clare Bates. "A comparison of genetic algorithms and other machine learning systems on a complex classification task from common disease research." PhD diss., The University of Michigan, 1995.
  36. ^ Booker, L. B.; Goldberg, D. E.; Holland, J. H. (1989-09-01). "Classifier systems and genetic algorithms". Artificial Intelligence. 40 (1): 235–282. doi:10.1016/0004-3702(89)90050-7.
  37. ^ Wilson, Stewart W., and David E. Goldberg. "A critical review of classifier systems." In Proceedings of the third international conference on Genetic algorithms, pp. 244-255. Morgan Kaufmann Publishers Inc., 1989.

External links

Suggested Review Papers

  • Urbanowicz, Ryan J.; Moore, Jason H. (January 2009), "Learning Classifier Systems: A Complete Introduction, Review, and Roadmap", J. Artif. Evol. App., 2009, New York, NY, United States: Hindawi Publishing Corp.: 1:1–1:25, doi:10.1155/2009/736398.
