Bayesian game

From Wikipedia, the free encyclopedia
  (Redirected from Bayesian Nash equilibrium)
Jump to navigation Jump to search

In game theory, a Bayesian game is a game in which the players have incomplete information about the other players. For example, a player may not know the exact payoff functions of the other players, but instead have beliefs about these payoff functions. These beliefs are represented by a probability distribution over the possible payoff functions.

John C. Harsanyi describes a Bayesian game in the following way.[1] Each player in the game is associated with a set of types, with each type in the set corresponding to a possible payoff function for that player. In addition to the actual players in the game, there is a special player called Nature. Nature randomly chooses a type for each player according to a probability distribution across the players' type spaces. This probability distribution is known by all players (the "common prior assumption"). This modeling approach transforms games of incomplete information into games of imperfect information (in which the history of play within the game is not known to all players).

Incompleteness of information means that at least one player is unsure of the type (and therefore the payoff function) of another player. Such games are called Bayesian because players can update their beliefs according to Bayes' rule as play takes place in the game. That is, the belief a player holds about another player's type might change as the game proceeds and actions are observed.

Specification of games[edit]

The normal form representation of a non-Bayesian game with perfect information is a specification of the strategy spaces and payoff functions of players. A strategy for a player is a complete plan of action that covers every contingency of the game, even if that contingency can never arise. The strategy space of a player is thus the set of all strategies available to a player. A payoff function is a function from the set of strategy profiles to the set of payoffs (normally the set of real numbers), where a strategy profile is a vector specifying a strategy for every player.

In a Bayesian game, one has to specify strategy spaces, type spaces, payoff functions and beliefs for every player. A strategy for a player is a complete plan of action that covers every contingency that might arise for every type that player might be. A strategy must not only specify the actions of the player given the type that he is but must specify the actions that he would take if he were of another type. Strategy spaces are defined as above. A type space for a player is just the set of all possible types of that player. The beliefs of a player describe the uncertainty of that player about the types of the other players. Each belief is the probability of the other players having particular types, given the type of the player with that belief (i.e. the belief is ). A payoff function is a 2-place function of strategy profiles and types. If a player has payoff function and he has type t, the payoff he receives is , where is the strategy profile played in the game (i.e. the vector of strategies played).

One of the formal definitions of such a game looks like the following:

The game is defined as: , where

  1. is the set of players.
  2. is the set of states of nature. For instance, in a card game, it can be any order of the cards.
  3. is the set of actions for player . Let .
  4. is the set of types for player . Given the state, the type of player is given by the function . So, for each state of nature, the game will have different types of players.
  5. defines the available actions for player of some type in .
  6. is the payoff function for player . More formally, let , and .
  7. is the probability distribution over for player . That is, each player has a different probability distribution over the states of nature.

The pure strategy should satisfy for all . So the strategy for each player only depends on his type, since he may not have any knowledge about other players' types. And the expected payoff to player for such a strategy profile is .

Let be the set of pure strategies,

A Bayesian Equilibrium of the game is defined to be a (possibly mixed strategy) Nash equilibrium of the game . So for any finite game , Bayesian Equilibria always exist.

The definition of Bayesian games has been combined with stochastic games to allow for environment states (e.g. physical world states) and stochastic transitions between states.[2] The resulting "stochastic Bayesian game" model is solved via a recursive combination of the Bayesian Nash equilibrium (see below) and the Bellman optimality equation.


A signaling game is a Bayesian game in which the informed party (the “agent”) knows their type, whereas the uninformed party (the “principal”) does not know the agent’s type. In some such games, it is possible for the principal to deduce the agent's type based on the actions the agent takes (in the form of a signal sent to the principal) in what is known as a “separating equilibrium”.

A specific example of a signaling game is a model of the job market. The players are the applicant (agent) and the employer (principal). There are two types of applicant, skilled and unskilled. The employer does not know which the applicant is, but he does know that 90% of applicants are unskilled and 10% are skilled (type 'skilled' has a probability of 0.1 and type 'unskilled' has a 0.9 probability).

The employer's action space is the set of natural numbers, representing wages—these are used to form a contract based on how productive the applicant is expected to be. Paying larger wages to skilled workers will generate larger payoffs for employers, while wages given to unskilled workers will have a less pronounced effect. The payoff of the employer is determined thus by the skill of the applicant (if the applicant accepts a contract) and the wage paid. Crucially, the employer chooses his or her action (the wage offered) according to his or her belief as to how skilled the applicant is and this belief is largely determined through signals sent by the applicant.

The applicant's action space consists of two actions: either obtain a university education or abstain from the university. Obtaining an education is less costly for a skilled worker than for an unskilled worker, as a skilled worker may receive scholarships, find classes less taxing, and so on. University education, therefore, serves as a signal, a means by which the applicant may communicate to the employer that he or she is, in fact, skilled. Thus, it may be rational for an employer to prefer to employ university graduates, even if their studies are not related at all to the work they will do at the company.

One strategy the employer may use is to give all applicants a wage such that skilled applicants may attend university (due to its lower cost) but which is insufficient to provide university education for unskilled applicants. This creates a separating equilibrium: skilled applicants can now signify their skill by going to university, and unskilled applicants cannot. The employer can observe which workers can go to university, and can then maximize his or her payoff by providing high wages to skilled workers and low wages to unskilled.

Bayesian Nash equilibrium[edit]

In a non-Bayesian game, a strategy profile is a Nash equilibrium if every strategy in that profile is a best response to every other strategy in the profile; i.e., there is no strategy that a player could play that would yield a higher payoff, given all the strategies played by the other players. In a Bayesian game (where players are modeled as risk-neutral), rational players are seeking to maximize their expected payoff, given their beliefs about the other players (in the general case, where players may be risk-averse or risk-loving, the assumption is that players are expected utility-maximizing).

A Bayesian Nash equilibrium is defined as a strategy profile and beliefs specified for each player about the types of the other players that maximizes the expected payoff for each player given their beliefs about the other players' types and given the strategies played by the other players.

This solution concept yields an abundance of equilibria in dynamic games when no further restrictions are placed on players' beliefs. This makes Bayesian Nash equilibrium an incomplete tool with which to analyze dynamic games of incomplete information.

Perfect Bayesian equilibrium[edit]

Bayesian Nash equilibrium results in some implausible equilibria in dynamic games, where players take turns sequentially rather than simultaneously. Similarly, implausible equilibria might arise in the same way that implausible Nash equilibria arise in games of perfect and complete information, such as incredible threats and promises. Such equilibria might be eliminated in perfect and complete information games by applying subgame perfect Nash equilibrium. However, it is not always possible to avail oneself of this solution concept in incomplete information games because such games contain non-singleton information sets and since subgames must contain complete information sets, sometimes there is only one subgame—the entire game—and so every Nash equilibrium is trivially subgame perfect. Even if a game does have more than one subgame, the inability of subgame perfection to cut through information sets can result in implausible equilibria not being eliminated.

To refine the equilibria generated by the Bayesian Nash solution concept or subgame perfection, one can apply the Perfect Bayesian equilibrium solution concept. PBE is in the spirit of subgame perfection in that it demands that subsequent play be optimal. However, it places player beliefs on decision nodes that enables moves in non-singleton information sets to be dealt more satisfactorily.

So far in discussing Bayesian games, it has been assumed that information is perfect (or if imperfect, play is simultaneous). In examining dynamic games, however, it might be necessary to have the means to model imperfect information. PBE affords this means: players place beliefs on nodes occurring in their information sets, which means that the information set can be generated by nature (in the case of incomplete information) or by other players (in the case of imperfect information).

Belief systems[edit]

The beliefs held by players in Bayesian games can be approached more rigorously in PBE. A belief system is an assignment of probabilities to every node in the game such that the sum of probabilities in any information set is 1. The beliefs of a player are exactly those probabilities of the nodes in all the information sets at which that player has the move (a player belief might be specified as a function from the union of his information sets to [0,1]). A belief system is consistent for a given strategy profile if and only if the probability assigned by the system to every node is computed as the probability of that node being reached given the strategy profile, i.e. by Bayes' rule.

Sequential rationality[edit]

The notion of sequential rationality is what determines the optimality of subsequent play in PBE. A strategy profile is sequentially rational at particular information set for a particular belief system if and only if the expected payoff of the player whose information set it is (i.e. who has the move at that information set) is maximal given the strategies played by all the other players. A strategy profile is sequentially rational for a particular belief system if it satisfies the above for every information set.


A perfect Bayesian equilibrium is a strategy profile, and a belief system such that the strategies are sequentially rational given the belief system and the belief system is consistent, wherever possible, given the strategy profile.

It is necessary to stipulate the 'wherever possible' clause because some information sets might not be reached with the given strategy profile and hence Bayes' rule cannot be employed to calculate the probability at the nodes in those sets. Such information sets are said to be off the equilibrium path, and any beliefs can be assigned to them. Stronger notions of consistency further restrict the beliefs that can be assigned to off-equilibrium information sets to "reasonable" ones.


Sheriff's Dilemma[edit]

A sheriff faces an armed suspect. Both must simultaneously decide whether to shoot the other or not.

The suspect can either be of type "criminal" or type "civilian". The sheriff has only one type. The suspect knows its type and the Sheriff's type, but the Sheriff does not know the suspect's type. Thus, there is incomplete information (because the suspect has private information), making it a Bayesian game. There is a probability p that the suspect is a criminal, and a probability 1-p that the suspect is a civilian; both players are aware of this probability (common prior assumption, which allows us to convert this into a complete-information game with imperfect information).

The sheriff would rather defend himself and shoot if the suspect shoots, or not shoot if the suspect does not (even if the suspect is a criminal). The suspect would rather shoot if he is a criminal, even if the sheriff does not shoot, but would rather not shoot if he is a civilian, even if the sheriff shoots. Thus, the payoff matrix of this Normal-form game for both players depends on the type of the suspect. We assume that payoffs are given as follows:

Type = "Civilian" Sheriff's action
Shoot Not
Suspect's action Shoot -3, -1 -1, -2
Not -2, -1 0, 0
Type = "Criminal" Sheriff's action
Shoot Not
Suspect's action Shoot 0, 0 2, -2
Not -2, -1 -1,1

If both players are rational and both know that both players are rational and everything that is known by any player is known to be known by every player (i.e. player 1 knows player 2 knows that player 1 is rational and player 2 knows this, etc. ad infinitumcommon knowledge), play in the game will be as follows according to perfect Bayesian equilibrium:[3][4]

When the type is "civilian", the dominant strategy for the suspect is not to shoot, and when the type is "criminal", the dominant strategy for the suspect is to shoot; we can thus remove the alternative strictly dominated strategy. Given this, if the sheriff shoots, he will have a payoff of 0 with probability p and a payoff of -1 with probability 1-p, i.e. an expected payoff of p-1; if the sheriff does not shoot, he will have a payoff of -2 with probability p and a payoff of 0 with probability 1-p, i.e. an expected payoff of -2p. Thus, the Sheriff will always shoot if p-1 > -2p, i.e. when p > 1/3.

See also[edit]


  1. ^ Harsanyi, John C., 1967/1968. "Games with Incomplete Information Played by Bayesian Players, I-III." Management Science 14 (3): 159-183 (Part I), 14 (5): 320-334 (Part II), 14 (7): 486-502 (Part III).
  2. ^ Albrecht, Stefano; Crandall, Jacob; Ramamoorthy, Subramanian (2016). "Belief and Truth in Hypothesised Behaviours". Artificial Intelligence. 235: 63–94. arXiv:1507.07688. doi:10.1016/j.artint.2016.02.004.
  3. ^ "Coursera". Coursera. Retrieved 2016-06-16.
  4. ^ Hu, Yuhuang; Loo, Chu Kiong (2014-03-17). "A Generalized Quantum-Inspired Decision Making Model for Intelligent Agent". The Scientific World Journal. 2014. doi:10.1155/2014/240983. ISSN 1537-744X. PMC 3977121. PMID 24778580.

Further reading[edit]