Ridere, ludere, hoc est vivere.

Saturday, April 1, 2017

Notes on Games with Sequential Moves

In this second post in a series exploring games of strategy (begun last month), designers Aaron Honsowetz, Austin Smokowicz, and I explore strategic games involving sequential moves, i.e. those in which each player's decision happens in the context of knowing opponents' previous decisions.  This exploration has its foundation in Chapter 3 of Dixit, Skeath, and Reiley's Games of Strategy.

We recorded our discussion at UnPub 7 in Baltimore, Maryland, on Friday 17 March.

Game tree for "Battle of the Sexes"
Source: "Managerial Economics
Online," kwanghui.com/mecon
A sequential game may be illustrated by a network of decision points, at each of which a choice is made.  A game tree illustrates decision points (nodes) for all players and directional branches from those points to successive nodes or to an end game state (a terminal node).  The game starts at an initial node (on the left in this illustration).  Besides players' decisions, a node may represent an event of external uncertainty, i.e. a point that may branch in several directions due to factors outside the players' control, such as a random event.  Games (e.g. role-playing games) don't necessarily have end states, but for those that do, the payoff for each player appears at each terminal node.

Our discussion opened with a hypothetical game in which first Aaron and then Austin decides whether to go out to get ice cream.  Aaron prefers ice cream but more important than that prefers not to spend time with Austin.  Austin prefers ice cream.  Knowing Austin's preferences (his "payoff"), Aaron "prunes" from the tree those decision branches that he knows Austin will not take, leaving Aaron with a strategy to stay home and avoid Austin's company.  Austin will then go out and get ice cream.

In the sample illustration for "Battle of the Sexes," a wife and a husband are choosing whether to go to the opera or to the bullfight.  The wife prefers the opera with the husband (payoff 5), but would rather go to the bullfight with her husband (4) than to the opera alone (3).  Going to a bullfight alone is as bad as staying home (0).  The husband's payoffs are the same except that he prefers the bullfight over the opera.  In this sequential game, the wife chooses first, and then knowing the wife's choice, the husband chooses second.  This is not a zero-sum game, and the wife's payoffs don't necessarily compare to the husband's.  Rather, each seeks to maximize her or his own payoff in isolation. So if the wife chooses "opera (O)," the husband would rather go to the opera with his wife than to the bullfight alone (even if they would be equally miserable alone), and so chooses to go to the opera (O) as well. (Later we'll see how the social concept of "fairness" can undermine game tree analysis.) 

Making a choice at a single node is called a move.  A plan of action that governs moves in the interest of maximizing payoff is called a strategy.  In the case of the Battle of the Sexes, since each player has only one move, the move and the strategy are essentially the same.  For tic-tac-toe, a player can envision an entire game tree and can devise a strategy that maximizes his or her payoff (which will generally be zero, since optimum play by both players results in a draw).  For chess, the game tree is too complex for the human brain to analyze fully, so strategic play generally involves analysis of a "foreseeable portion" of the game tree.

The authors insist that a player's strategy be complete, i.e. that it include contingencies for nodes that perfect analysis says might be unnecessary.  It's not clear in Chapter 3 why that's the case, but they refer to stability analysis in advanced game theory, and so we accept that a well-defined strategy for a player is a complete plan of action for all the player's decision nodes.

Interestingly, the authors stipulate that every node have only one branch leading to it.  I find that interesting, inasmuch as it is possible for different decision sequences to result in a game state that poses the same player with the same decision and the same consequences.  If they are treated as different nodes, then the downstream game tree branches will be identical for both.  That seems an unnecessary restriction, but I don't see it as problematic, and in fact it may simplify game tree analysis.

An approach to solving a game tree, i.e. establishing a strategy for each player and predicting likely courses of a game, is to sequentially prune branches that represent decisions that are categorically not in the interest of the deciding player.  So for example in the case of the Battle of the Sexes, at node b, the husband is faced with a payoff of 4 for the opera (O) or 3 for the bullfight (BF).  So the BF branch may be pruned because in that case the husband will not choose BF.  At node c, the husband is faced with payoffs of 0 (O) and 5 (BF), so that O branch may be pruned.  The result is that now at node a, the wife's choices have payoffs of 5 (O), because she knows that the husband will also choose O in that case, or 4 (BF), because the husband will choose BF.  So at node a, the lower payoff BF branch may be pruned, and the game tree has a single path in which both partners choose the opera. 

Instead of pruning, a different way to represent the same technique is to highlight clearly optimal choices, such as the O choice at node b and the BF choice at node c.  Then highlight the O choice at node a to complete a path from the initial node to the terminal node.  In either case, this method of working backward is called rollback, or backward induction.  When all players use rollback to arrive at a strategy, the resultant set of strategies is a rollback equilibrium, and the resultant terminal node is the rollback outcome. The authors stipulate that every sequential game has a rollback equilibrium - typically exactly one, except where players have equal payoffs between two choices and are indifferent to the result.

Depending on the configuration of the game tree (and the Battle of the Sexes is an example), the rollback equilibrium may result in the player at the initial node gaining the maximum possible payoff while other players gain less than their maximum.  A game tree with such a configuration has first-mover advantage.  In other cases, when reacting to the first decision yields the maximum payoff (as it would if rock-paper-scissors were a sequential game), the game is said to have second-mover advantage.

Although chess is a sequential perfect information game that can be theoretically represented and solved with a game tree, its scope exceeds both human and artificial computation.  An approach to such a game is to assign values to certain characteristics of game states as a guide to decision-making.  A rule that assigns such values is called an intermediate valuation function.  Such a function typically derives from past experience of previous games; for example, maintaining a queen advantage typically results in a win.  Chess analysis has led to a variety of known openings - early decision sequences - that have demonstrated intermediate value.  Once the endgame is reached with fewer pieces remaining, a rollback analysis may apply.  Checkers, by contrast, was solved in July 2007 and, perfectly played, will always result in a draw.

Rollback analysis presupposes a clear understanding of the value of the payoffs to rational players.  An "Ultimatum Game" gives 100 coins to Player A, who offers some of them to Player B.  If Player B accepts the offer, both keep their shares; if not, neither keeps anything.  Rollback analysis dictates that Player A offer one coin and B accept, but in social experiments, observation indicates Player A typically offers much closer to 50 and B accepts only if offered something that appears fair.  So empirically, players value fairness in a way that the originally constructed game tree does not reflect.  When the experiment is posed in a way that Player B does not know how the offer was derived (e.g. whether it was random, by computer, by Player A, or by some other method), Player B typically accepts much lower offers because the sense of fairness is less offended by an arbitrary opportunity.  Going back to the "Battle of the Sexes," if the same game is played every week and the couple goes to the opera every week, the husband might - out of a sense of "fairness" - deliberately go to the bullfight alone rather than continue to accede to his wife's preferred event.

It will be interesting to see how rollback analysis might apply to social games where signaling might induce opponents down one branch over another.  For example, if the husband signals an intent to go to the bullfight regardless of the wife's decision (and if the signal is credible), the wife may choose to go to the bullfight rather than risk going to the opera alone.

We also discussed the game theory analysis of the last immunity challenge in the first season of the TV show Survivor, in which one player (Richard) evaluated the game tree and decided to step out of the immunity challenge (to the utter surprise of everyone watching) because his analysis indicated that he was likely to advance to the final tribal council regardless of which of his opponents won the immunity challenge.

So we can approach sufficiently simple sequential games with game tree analysis as long as we correctly value payoffs.  Game tree analysis still provides insight (if not explicit solution) for more complex games.  Irrational play may undermine strict analysis, so strategies need to take into account unexpected paths that the game might follow.

Next time we will discuss Games of Strategy Chapter 4, "Simultaneous Move Games with Discrete Strategies."

Wednesday, February 22, 2017

Notes on Games of Strategy

Over three years ago, I wrote about my effort to approach a simple three-player race game using game theory.  Economist and game designer Dr. Aaron Honsowetz responded, which led to his recommendation that I look up the book Games of Strategy by Avinash Dixit, Susan Skeath, and David Reiley.  I finally obtained the third edition recently, and that has led Aaron, fellow designer Austin Smokowicz, and I to explore Dixit Skeath and Reiley's text in a kind of virtual book club.

Thinking about strategic games

We live-streamed last night's discussion.  We start with the first two chapters, which motivate the study of game theory and then define some terms and categories. 

Strategic games depend on player decision-making, as distinguished from games of chance, which depend on luck, and from games of skill, which depend on proficiency, dexterity, quickness of mind, or practice.  So, for example, chess is a game of strategy whose outcome is determined by the decisions of the players.  Bingo is a game of chance, whose outcome is determined by the order of randomly selected numbers.  Bowling is a game of skill, whose outcome is determined by the strength, accuracy, and proficiency of the bowlers.

The authors distinguish decisions that people make that are independent of the decisions of others from strategic games, in which people know that the results of their decisions depend on the decisions of others as well, i.e. that their decisions are interactive.  So, for example, blackjack plays out based on the decisions of each player in isolation, regardless of the other players at the table and of the dealer, who follows strict rules and makes no independent choices.  Poker, by contrast, plays out based on the interaction of the decisions among players.  So poker meets the definition of a strategic game, while blackjack does not.

The authors proceed to classify games in anticipation of the structure of the rest of the book:
  • In sequential games, players make decisions whose immediate outcome is unaffected by other players, as opposed to games with simultaneous moves, in which players must anticipate the unknown decisions of other players.  Chess is sequential, while rock-paper-scissors is simultaneous.
  • Some games pose players with strictly conflicting interests while others might involve common interests among the participants.  So most wargames are strictly conflicting, while Dead of Winter introduces both common and individual goals.
  • One-time games are distinguished from repeated games with the same opponents, which in turn are distinguished from games involving changing opponents.  So a one-time-only game might be the courtship, engagement, and marriage of a couple.  A bridge club plays the same game with the same opponents repeatedly.  And a single-elimination game tournament generally involves playing the same game with different opponents in each session.
  • The authors define games with full and equal information vs partial or unequal information:  External uncertainty applies to unknown information independent of others' decisions.  Strategic uncertainty applies to the unknown decisions by others.  A game with either or both has imperfect information; a game with no such uncertainty has perfect information.  A game in which one player has more information than another has asymmetric information.  So for example, a card game with a shuffled deck involves external uncertainty.  Rock-paper-scissors involves strategic uncertainty.  Chess is a perfect information game.  Scotland Yard has asymmetric information.  
Strategies that share or reveal information deliberately are called signaling.  Strategies that seek to motivate an opponent to reveal information are called screening. We spent quite a bit of time discussing examples of signaling, such as playing chicken and throwing the steering wheel out the window to demonstrate to an opponent your commitment not to swerve.  Aaron cited an example from Euchre in which card play signals to a partner information about your hand or about what the opponents do or don't have.  To Aaron's point, the important component of signaling is the degree of commitment to a decision.  Austin used an example from Hanabi as a method of communicating intent to fellow players.
 After the chat, Aaron further refined our discussion of signaling: 
Bluffing itself is an uncreditable signal. You claim you are committing to something (to alter player behavior).  To be a creditable signal, the price to make the signal must be sufficiently high that only a person committed to the action is willing to pay it. When I remove my steering wheel and toss it out the window I am saying I am willing to drive straight no matter the price. If the price of making the signal is too low, than people can make the signal without creditably committing to the action (bluffing) and destroy the ability for the signal to indicate you are going to take a particular action.
Screening involves eliciting information, and one way of doing that is to make an offer like a trade in Catan to see how players respond and thereby gain information about the contents of their hand and to an extent their future intentions.
During our discussion, I provided an example of sequential interactive decision-making that led to another follow-up by Aaron regarding signaling and screening.  In a game of Agricola, I built fences in anticipation of an opportunity to take sheep.  An opponent, who had no need of sheep and no means to store or cook them, took them anyway and let them run free to deprive me of that opportunity.  Said Aaron,
Your sheep example from Agricola was (if there was no other value for that many fences) a credible commitment that you would take sheep if they were available.  If that is the case it could also be part of a screening tactic.  If everyone is paying attention, it forces the last person before you who has a lower value play (assuming you are competitive in the game) to reveal that by responding to you.  And while you may not get the points from the sheep, it may still have been the best move because it blocked an opponent from taking an action to advance their score.
  • Games can have fixed vs manipulable rules.  Most tabletop games with which we are familiar have fixed rule-sets.  I imagine games with manipulable rules to include the interactions within a legislative body, whose rules of order may be modified by a party in power. 
  • The book defines cooperative games differently from the conventional use of the term that modern game-players might know.  The authors refer to games in which agreements are enforceable as Cooperative, while games with unenforceable agreements and that allow players to act in their own best interests are called non-cooperative games.  By these definitions, Catan is cooperative, inasmuch as a trade is an enforceable agreement; both parties are required by the game rules to hold up their end of the bargain.  Diplomacy is a non-cooperative game, since a player can commit to a future action and then renege on that commitment. 
Terminology and other concepts:
  • Strategies are available choices, or more generally a set of guidelines or algorithms by which individual decisions are made - a plan for a succession of actions in response to evolving circumstances, presumably due to the actions of other players.
  • Payoffs are the outcomes of interactive decisions, including expected payoff based on a probability distribution of random outcomes.
  • Rationality, or rational behavior, assumes perfect calculating players that consistently follow the best strategy pursuant to a completely known self interest.
  • Games involve a common knowledge of rules, specifically knowing who the players are, their available strategies or choices, the payoffs for each interaction of strategies, and an assumption of rational behavior.
  • When rational players interact, the game reaches an equilibrium where by each player is using the strategy that best responds to the other players' strategies. 
  • As opposed to assumed perfect rationality and calculated equilibrium, an evolutionary approach to games allows for a dynamic process in which poor calculators are motivated to choose strategies that proved more successful in previous plays of the game through observation, imitation, and learning.
  • Observations and experiments can help structure game theory and provide a check against its results.
The book stipulates that game theory can help to explain observed behavior of interacting decision makers, predict likely choices of rational actors, and prescribe strategic decisions.

I had one question that did not get addressed in our chat:  Can a game have more than one equilibrium, i.e. can there be local points of optimization that could emerge in an evolutionary approach different from what an analysis of perfect strategies would indicate?  The answer may come up in later discussions.

Next we will explore Chapter 3, "Games with Sequential Moves."