In an earlier post on the theory of simultaneous-move games, I concluded with an example of a game between two tennis players that has no Nash equilibrium in pure strategies. Sam Hillier: Consulting Philosopher more recently elaborated on the topic with an excellent post on mixed strategies. I had approached the question of an equilibrium for a single tennis shot and concluded that none existed. A tennis match of course includes many shots, so players have an opportunity to use a weighted mix of shots and defenses between the two options.
To recap, the two-player zero-sum simultaneous-move game represents a tennis volley in which one player, the defender, decides whether to defend a shot down the line (D) or cross court (C) at the same time the opponent, the shooter, decides which type of shot to attempt. The payoff is the probability of the shot scoring a point, laid out in a 2x2 matrix that cross-references each player's decision with the other's. The shooter seeks to maximize that result; the defender wants to minimize it.
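For readers who want to check the "no pure-strategy equilibrium" claim, here is a minimal Python sketch. The four scoring probabilities are read off the coefficients of the expected-value equations later in this post; the nested-dict layout and the names PAYOFF, maximin, and minimax are my own assumptions for illustration.

```python
# Probability that the shot scores, indexed as PAYOFF[shot][defense].
# Values come from the coefficients of EC and ED in this post.
PAYOFF = {
    "C": {"C": 0.2, "D": 0.9},
    "D": {"C": 0.8, "D": 0.5},
}

# The shooter's guarantee from a pure shot is the minimum of its row;
# the shooter picks the row with the best guarantee.
maximin = max(min(row.values()) for row in PAYOFF.values())

# The defender's guarantee from a pure defense is the maximum of its column;
# the defender picks the column with the lowest cap.
minimax = min(max(PAYOFF[shot][defense] for shot in PAYOFF) for defense in ("C", "D"))

# A pure-strategy equilibrium (saddle point) exists only if the two agree.
has_pure_equilibrium = maximin == minimax
print(maximin, minimax, has_pure_equilibrium)  # 0.5 0.8 False
```

The gap between 0.5 and 0.8 is the room that mixing strategies has to close.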
Sam stipulated that in a game of mixed strategies between C and D, “the server will shoot C 30% of the time … while the defender will defend against C 40% of the time,” but didn't explain where those figures came from. They follow from a little arithmetic once each player approaches the problem by trying to “minimize the maximum loss.”
Suppose p represents the frequency with which the shooter selects C. The shooter wants to maximize the worst-case score regardless of the defender’s choice. If the defender chooses C, then the shooter’s expected value of the result will be
EC = 0.2p + 0.8(1-p)
If the defender chooses D, the shooter’s expected result is
ED = 0.9p + 0.5(1-p)
To keep the defender from exploiting whichever defense yields the lower score, the shooter looks for the frequency p that gives the same result against either defense, setting the expected values equal to each other:
EC = ED
0.2p + 0.8(1-p) = 0.9p + 0.5(1-p)
p = (0.8 – 0.5) / (0.9 – 0.5 – 0.2 + 0.8)
= 0.3/1.0
= 30%
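The same algebra can be sketched in a few lines of Python. The function name shooter_indifference and its argument names are hypothetical; the numbers are the four scoring probabilities from the equations above.

```python
def shooter_indifference(cc, dc, cd, dd):
    """Frequency p of shot C that makes EC equal ED.

    Arguments are scoring probabilities named <shot><defense>:
    cc = shoot C vs defend C, dc = shoot D vs defend C,
    cd = shoot C vs defend D, dd = shoot D vs defend D.
    """
    # cc*p + dc*(1-p) = cd*p + dd*(1-p)  =>  p = (dc - dd) / (cd - dd - cc + dc)
    return (dc - dd) / (cd - dd - cc + dc)

p = shooter_indifference(cc=0.2, dc=0.8, cd=0.9, dd=0.5)
print(f"p = {p:.0%}")  # p = 30%
```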
This result means that if the shooter chooses C 30% of the
time, their expected score will be
EC = 0.2(0.3) + 0.8(1-0.3) = 0.62
or
ED = 0.9(0.3) + 0.5(1-0.3) = 0.62
regardless of the defender’s choice. If the shooter chooses C less frequently (i.e. p < 30%), ED falls below 0.62, so the defender can always choose D and hold the shooter to a lower score. Similarly, if the shooter chooses C more frequently, EC falls below 0.62, so the defender can always choose C and again hold the shooter to a lower score. So, the shooter’s best mixed strategy is to choose C 30% of the time.
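One way to see this numerically is to sweep p and record the shooter's worst-case score at each value. This is a rough sketch, assuming a 1% grid is fine enough; worst_case and best_p are names I made up for illustration.

```python
def worst_case(p):
    """Shooter's score when the defender picks the better defense against p."""
    ec = 0.2 * p + 0.8 * (1 - p)  # defender covers C
    ed = 0.9 * p + 0.5 * (1 - p)  # defender covers D
    return min(ec, ed)

# Try p = 0%, 1%, ..., 100% and keep the value with the best worst case.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=worst_case)

for p in (0.2, 0.3, 0.4):
    print(f"p = {p:.0%}: worst case = {worst_case(p):.2f}")
print(f"best p on the grid = {best_p:.0%}")
```

The worst case peaks at p = 30% and drops off on either side, which is the "maximize the minimum" behavior described above.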
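A similar analysis, in which the defender tries to minimize the maximum possible shooter score, gives the frequency q with which the defender should defend C. Setting the shooter’s expected score from shooting C equal to the expected score from shooting D,
0.2q + 0.9(1-q) = 0.8q + 0.5(1-q)
q = (0.9 – 0.5) / (0.8 – 0.5 – 0.2 + 0.9)
= 0.4/1.0
= 40%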
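As a quick check on the defender's side, the sketch below computes q and confirms that both of the shooter's pure shots then score the same 0.62 found earlier. The helper shooter_score is a hypothetical name, not something from Sam's post.

```python
def shooter_score(shot, q):
    """Expected score of a pure shot when the defender covers C with frequency q."""
    if shot == "C":
        return 0.2 * q + 0.9 * (1 - q)
    return 0.8 * q + 0.5 * (1 - q)

# Defender's indifference frequency, from the formula above.
q = (0.9 - 0.5) / (0.8 - 0.5 - 0.2 + 0.9)

score_c = shooter_score("C", q)
score_d = shooter_score("D", q)
print(f"q = {q:.0%}, shoot C scores {score_c:.2f}, shoot D scores {score_d:.2f}")
```

Because neither shot does better than 0.62 against this mix, the shooter has no way to exploit it.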
Sam also discusses the idea of Evolutionary Game Theory, which interests me because superior strategies can emerge in a population of game players without necessarily invoking explicit analysis. In fact, most games are sufficiently complex that exhaustive analysis is not possible, and the evolutionary strategies that emerge may demonstrate consistent success without a complete understanding of why they succeed. I expect Dr. Wictz and I will discuss that topic at some point in the future as well.