FORR (FOr the Right Reasons) is a cognitive architecture for learning and problem solving inspired by Herbert A. Simon's ideas of bounded rationality and satisficing. It was first developed in the early 1990s at the City University of New York. It has been used in game playing, robot pathfinding, recreational park design, spoken dialog systems, and solving NP-hard constraint satisfaction problems, and is general enough for many problem solving applications.
FORR does not have perfect knowledge of how to solve a problem, but instead learns from experience. Intelligent agents are not optimal, but make decisions based on only a subset of all possible good reasons and informative data. These agents can still be considered rational. This idea of bounded rationality was introduced by Herbert A. Simon, who along with Allen Newell developed the early foundations of the study of cognitive architectures and also inspired early architectures such as Soar and ACT-R.
Multiple good reasons
FORR depends upon the idea that there are multiple reasons or rationales for performing actions while solving a problem. These reasons can be always right (it's always right to make a move in chess that will put the opponent in checkmate) or just sometimes right. The always-right reasons are the minority. The sometimes-right reasons can compete with each other: for example, in game playing, one good reason might be to capture pieces, while another might be to control some area of the board. In FORR, these competing reasons are called Advisors.
The tiered Advisor system is general enough that any potential good reason, whether probabilistic, deductive, or perceptual, can be implemented, so long as it expresses a preference for one action over another.
Because of its reliance on a set of independent agents (the Advisors), FORR can be considered a connectionist architecture.
A FORR architecture has three components: a set of descriptives that describe the state of the problem, a tiered set of Advisors that are consulted in order to decide what action to perform, and a behavioral script that queries the Advisors and performs the action that they suggest.
The Advisors are the set of rationales or heuristics for making a decision. They can be considered the procedural memory component of the architecture. Upon each new decision, Advisors are queried in order to decide which action to perform. Advisors never communicate with each other or learn on their own: they simply ask for information about the state of the problem stored in the form of descriptives, and make a suggestion based on that information. The Advisors are divided into three tiers, which are queried in the following order:
- Tier 1: these Advisors are always right. If these suggest an action, that action is carried out immediately and the query ends. If they forbid an action, that action is removed from consideration. Otherwise, move to the next tier.
- Tier 2: if one of these Advisors is triggered, it proposes a sub-problem, or an ordered set of actions, achieving a sub-goal in solving the overall problem (such as moving around one obstacle in a maze). If no tier 2 Advisor is triggered, move to the last tier.
- Tier 3: these are all other rationales. They are not always right, but compete with each other. They vote on an action, and the highest-voted suggestion is performed. Different problem classes in the same domain will have different weights for the same Advisors, and the weights are developed from experience through learning algorithms.
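The tier 3 vote can be illustrated with a minimal Python sketch. It assumes, for illustration only, that each Advisor attaches a numeric strength to every action it comments on and that an action's score is the sum of weight times strength over the Advisors that commented on it. The function name `tier3_vote`, the Advisor names, and the numbers are hypothetical, and the tie-breaking rule is an arbitrary choice.

```python
from collections import defaultdict


def tier3_vote(comments, weights):
    """Return the action with the largest weighted vote total, or None.

    comments: iterable of (advisor_name, action, strength) tuples.
    weights:  dict mapping advisor_name to its learned weight.
    """
    totals = defaultdict(float)
    for advisor, action, strength in comments:
        # Assumed combination rule: Advisor weight times comment strength.
        totals[action] += weights[advisor] * strength
    if not totals:
        return None
    # Ties go to the action that was commented on first (an arbitrary choice).
    return max(totals, key=totals.get)


# Hypothetical comments from three game-playing Advisors.
comments = [
    ("capture", "take_pawn", 1.0),
    ("control_center", "move_knight", 1.0),
    ("develop", "move_knight", 0.5),
]
weights = {"capture": 0.6, "control_center": 0.3, "develop": 0.4}

best = tier3_vote(comments, weights)
print(best)  # take_pawn: 0.6 beats 0.3 + 0.2 for move_knight
```

With a different weight profile, for instance one learned on another problem class, the same comments could elect `move_knight` instead.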
The descriptives are the declarative memory component of the architecture: they represent the state of the problem and are available to any Advisor.
The behavioral script queries each tier of Advisors sequentially. If a tier 1 Advisor suggests an action, the script performs that action. Otherwise, if a tier 2 Advisor is triggered, a sub-problem has been encountered. A tier 1 Advisor guarantees that only one tier 2 Advisor is active at any time. If no tier 1 Advisor comments and no tier 2 Advisor is triggered, the behavioral script asks for suggestions or comments from all tier 3 Advisors and lets them vote. The script performs the action with the highest vote among all tier 3 Advisors.
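The control flow of the behavioral script can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a reference implementation: the `decide` function, the `("do" | "forbid" | None, action)` return convention for tier 1, the toy corridor problem, and every Advisor name below are hypothetical. In particular, the sketch simply takes the first triggered tier 2 Advisor rather than modelling how tier 1 keeps only one tier 2 Advisor active.

```python
def decide(state, legal_actions, tier1, tier2, tier3, weights):
    """Run one decision cycle and return the list of actions to perform."""
    allowed = list(legal_actions)

    # Tier 1: always-right Advisors may mandate or forbid an action.
    for advisor in tier1:
        verdict, action = advisor(state, allowed)
        if verdict == "do":
            return [action]
        if verdict == "forbid" and action in allowed:
            allowed.remove(action)

    # Tier 2: a triggered Advisor proposes an ordered plan for a sub-problem.
    for advisor in tier2:
        plan = advisor(state, allowed)
        if plan:
            return list(plan)

    # Tier 3: weighted vote among the remaining heuristics.
    totals = {action: 0.0 for action in allowed}
    for name, advisor in tier3.items():
        for action, strength in advisor(state, allowed):
            if action in totals:
                totals[action] += weights[name] * strength
    return [max(totals, key=totals.get)] if totals else []


# Toy corridor: the agent stands at integer `pos` and wants to reach `goal`.
def reach_goal(state, allowed):
    if state["pos"] + 1 == state["goal"] and "right" in allowed:
        return ("do", "right")
    return (None, None)


def avoid_edge(state, allowed):
    if state["pos"] == 0:
        return ("forbid", "left")
    return (None, None)


def go_around(state, allowed):
    return ["up", "right", "right", "down"] if state.get("blocked") else None


def head_for_goal(state, allowed):
    return [("right", 1.0)] if state["pos"] < state["goal"] else [("left", 1.0)]


def explore(state, allowed):
    return [(a, 0.5) for a in allowed]


tier1 = [reach_goal, avoid_edge]
tier2 = [go_around]
tier3 = {"head_for_goal": head_for_goal, "explore": explore}
weights = {"head_for_goal": 0.7, "explore": 0.2}
actions = ["left", "right"]

print(decide({"pos": 4, "goal": 5}, actions, tier1, tier2, tier3, weights))
# ['right']  (tier 1 mandates the winning step)
print(decide({"pos": 1, "goal": 5, "blocked": True}, actions, tier1, tier2, tier3, weights))
# ['up', 'right', 'right', 'down']  (tier 2 plan)
print(decide({"pos": 7, "goal": 5}, actions, tier1, tier2, tier3, weights))
# ['left']  (tier 3 vote)
```

Returning a list lets a tier 2 plan and a single tier 1 or tier 3 action share one return type; a real implementation might instead keep the plan active across several decision cycles.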
Implementing a FORR architecture
A problem domain is a set of similar problems, called the problem classes. If the problem domain is playing simple board games, then tic-tac-toe is a problem class, and one particular game of tic-tac-toe is a problem instance. If navigating a maze is the problem domain, then a particular maze is the class and one attempt at its navigation is an instance. Once the problem domain is identified, the implementation of a FORR architecture for that domain has two basic stages: finding possible right reasons (the Advisors) and learning their weights for a particular class.
How to build a FORR architecture
- Decide on a problem domain.
- Use domain knowledge, surveys of the literature, intuition and good sense to enumerate a list of possible rationales for making a decision, which can be good or bad for different classes within the domain. These rationales are the Advisors.
- Divide the Advisors into tiers:
  - The Advisors that are always right are in Tier 1. For example, it's always right to make a winning move in a board game.
  - The Advisors which identify a sub-problem go into Tier 2. For example, going around a wall in a maze.
  - Every other Advisor is Tier 3.
- Code the Advisors. Each Advisor returns a set of suggested actions along with weights for each suggested action. The weights are initially set to a uniform value, such as 0.05.
- Identify all information about the state of the problem needed by all Advisors. These are the descriptives. Code these.
- Code the behavioral script which queries the Advisors and performs the action they suggest.
- Learn the weights for the Advisors on a set of particular problem instances in the learning phase, using a reinforcement learning algorithm.
- Test the architecture on a set of previously unencountered problem instances.
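The coding steps in the list above (Advisors, descriptives, and uniform initial weights) might be organized as in the following Python sketch for tic-tac-toe. It is one possible layout, not the published design: the class names `Descriptives` and `Advisor`, the caching of descriptive values, and the two example heuristics are assumptions made for illustration. The sketch also interprets the 0.05 starting value as a per-Advisor weight and lets each Advisor return (action, strength) pairs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

INITIAL_WEIGHT = 0.05  # uniform starting weight, as suggested in the steps


@dataclass
class Descriptives:
    """Shared view of the problem state; values are computed once and cached."""
    board: List[str]  # 9 cells, each "X", "O", or " "
    _cache: Dict[str, object] = field(default_factory=dict)

    def empty_squares(self) -> List[int]:
        if "empty" not in self._cache:
            self._cache["empty"] = [
                i for i, cell in enumerate(self.board) if cell == " "
            ]
        return self._cache["empty"]


@dataclass
class Advisor:
    name: str
    tier: int
    # Returns (action, strength) pairs; here an action is a square index.
    comment: Callable[[Descriptives], List[Tuple[int, float]]]
    weight: float = INITIAL_WEIGHT


def prefer_center(d: Descriptives) -> List[Tuple[int, float]]:
    return [(4, 1.0)] if 4 in d.empty_squares() else []


def prefer_corners(d: Descriptives) -> List[Tuple[int, float]]:
    return [(i, 1.0) for i in (0, 2, 6, 8) if i in d.empty_squares()]


advisors = [
    Advisor("center", 3, prefer_center),
    Advisor("corners", 3, prefer_corners),
]

# X holds the top-left corner, O holds the center.
d = Descriptives(board=["X", " ", " ", " ", "O", " ", " ", " ", " "])
comments = {a.name: a.comment(d) for a in advisors}
print(comments)
# {'center': [], 'corners': [(2, 1.0), (6, 1.0), (8, 1.0)]}
```

Because Advisors read only from the descriptives and never from each other, each heuristic can be written and tested in isolation.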
Learning Advisor weights
The Advisors are the same for all problem classes in a domain, but the weights can be different for each class within the domain. Important heuristics for tic-tac-toe might not be important for a different board game. FORR learns the weights for its tier 3 Advisors by experience. Advisors that suggest an action resulting in failure have their weights penalized, and Advisors whose suggestions result in success have their weights increased. Learning algorithms vary between implementations.
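Since learning algorithms vary between implementations, the following Python sketch shows only the general shape of such an update: a fixed additive reward or penalty applied to the Advisors that supported the chosen action. The function name `update_weights`, the step size, and the floor at zero are assumptions chosen for illustration, not the rule used by any particular FORR system.

```python
def update_weights(weights, supporters, success, step=0.01, floor=0.0):
    """Return new weights after one outcome.

    weights:    dict mapping Advisor name to current weight.
    supporters: names of Advisors that suggested the action taken.
    success:    True if the action led to success, False if it led to failure.
    """
    updated = dict(weights)  # leave the caller's dict untouched
    for name in supporters:
        if success:
            updated[name] += step
        else:
            # Penalize, but never let a weight drop below the floor.
            updated[name] = max(floor, updated[name] - step)
    return updated


weights = {"capture": 0.05, "control_center": 0.05, "develop": 0.05}

# "capture" backed a move that worked; "develop" backed one that failed.
after_win = update_weights(weights, {"capture"}, success=True)
after_loss = update_weights(after_win, {"develop"}, success=False)
```

Running the same Advisors through this update on instances of two different problem classes would produce two different weight profiles, which is how one set of heuristics can serve a whole domain.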