# Rock-Paper-Scissors and Evolutionary Games

### Background

There are two reasons for this post. Firstly, frequency-dependent phenomena are becoming a running theme of this blog. I believe they are responsible for some of the biggest blind spots in how we think about the world. Clearly, cultural norms are frequency-dependent, but even such things as beliefs about whether a certain proposition is true or false appear to have a large frequency-dependent component. Secondly, as I was learning about evolutionary game theory as a way to model this stuff, I thought that it would be good to write an introduction to it, if only to stake out the areas that I have acquired a decent grasp of so far.

On hearing the term "evolutionary game theory," many may assume that it is simply game theory applied to evolution (that was certainly my impression of it). But it turns out that evolutionary game theory is quite rich on its own terms, and that it has made its way back into economics (a bit unusual for a toolkit developed for biological problems). The theory got its start when John Maynard Smith read an unpublished manuscript by George R. Price that tried to explain the evolution of ritualized behavior in animals. Maynard Smith thought the question would be better addressed using game theory, and the two wrote a joint paper that introduced the concept of evolutionary stability. The most common explanation at the time for why many animals evolve ritualistic displays rather than deadly weapons for intraspecies conflict was that it would be "bad for the species" if too many animals died. This was a group-selectionist argument that made little sense. On the other hand, individual- or gene-level arguments did not seem to provide a satisfactory answer back then either: the first animal to evolve a particularly deadly set of claws would win all conflicts, so why doesn't this always happen?

The most important book on the topic is *Evolution and the Theory of Games* by Maynard Smith. Besides Maynard Smith's work, important early papers include Taylor and Jonker's, which introduced the replicator equation, and Bishop and Cannings' and Harley's, which proved important theorems related to evolutionary stability. Among more recent material, Schecter and Gintis have a good practical introduction that includes classical game theory, Hofbauer and Sigmund focus on the dynamics of evolutionary games, and MIT Press has published an entire series of books attempting to bridge evolutionary game theory and the study of human interaction.

As usual, the Mathematica notebooks for this post are available from this repository.

### The Rock-Paper-Scissors Game

#### Motivating Example

The side-blotched lizard is native to the western US and northern Mexico. The male lizards come in three color morphs: orange, yellow, and blue. The morphs are genetically determined with a Mendelian pattern of inheritance. The gene is pleiotropic and affects the color of the male's throat as well as his behavior. You can read a detailed account of interactions between the three types of males on the Sinervo Lab website, but here are the main highlights:

- The orange males are ultra-aggressive, defend large territories, and are polygynous. Their lifespan is significantly shortened either due to their extreme lifestyle or due to other effects of the gene.
- The yellow males are female mimics and their strategy is that of a "sneaky copulator." They invade the territories of orange males and deflect the aggression of the proprietors by performing the female rejection ritual.
- The blue males tend to be monogamous and engage in intensive mate guarding on a smaller territory compared to orange males. Due to their vigilance, the blues are effective at expelling yellow sneaker males from their territory, but are easily outcompeted by the much more aggressive orange males.

In short, there is a rock-paper-scissors situation: orange beats blue which beats yellow which beats orange. The frequency of each color morph oscillates in 4-6 year cycles.

Similar dynamics have also been found in bacteria and among coral reef fauna; they may provide a possible solution to the plankton paradox. The side-blotched lizard, however, is the most spectacular example: when Maynard Smith learned of the discovery of rock-paper-scissors dynamics in this species, he exclaimed, "They have read my book!"

#### The Replicator Equation

We start with the following payoff matrix:

| | O | Y | B |
|---|---|---|---|
| **O** | ε | -1 | 1 |
| **Y** | 1 | ε | -1 |
| **B** | -1 | 1 | ε |

This is a standard matrix for the Rock-Paper-Scissors game that one can read left-to-right: orange vs orange results in some outcome that is close to zero on average ($\varepsilon$), orange loses to yellow (-1) but wins against blue (+1), and so on.

We want to model the dynamics of player 1 playing a *pure* strategy against a *mixed* one labeled $\sigma$. Player 1 cannot modify his strategy (which is genetically determined) except as a part of a well-defined process that corresponds to biological reproduction. His opponent (player 2) represents the entire population whose composition changes with time. Assuming that the proportions of individuals in the population playing strategies O, Y, and B are $p_1$, $p_2$, and $p_3$ respectively, we get the following payoffs for player 1 against the population:

$$\begin{eqnarray} \pi_1(O, \sigma) &=& \varepsilon p_1 - p_2 + p_3 \\ \pi_1(Y, \sigma) &=& p_1 + \varepsilon p_2 - p_3 \\ \pi_1(B, \sigma) &=& -p_1 + p_2 + \varepsilon p_3 \end{eqnarray}$$

The payoff for the population strategy played against itself is then:

$$\begin{equation} \pi_1(\sigma, \sigma) = \pi_1(O, \sigma) p_1 + \pi_1(Y, \sigma) p_2 + \pi_1(B, \sigma) p_3 \end{equation}$$

When $\varepsilon=0$, the expression above will be equal to zero. However, playing the population strategy against itself need not always result in a zero payoff (for example, the entire population could be growing). So we are interested in the *difference* between the given strategy payoff (played against the population strategy) labeled $\pi_1(s_i,\sigma)$ and the population strategy payoff (played against itself) labeled $\pi_1(\sigma,\sigma)$. In an animal population, this difference should affect reproduction rate. When it is equal to zero, the individuals playing the given strategy $s_i$ should remain at a constant frequency in the population. Furthermore, the rate of increase in frequency of individuals playing $s_i$ should be proportional to the frequency $p_i$ itself:

$$\begin{equation} \dot{p}_i = [\pi_1(s_i, \sigma) - \pi_1(\sigma, \sigma)] p_i,\ i=1,\dotsc,n \tag{1} \end{equation}$$

This is known as the replicator equation, and it is one of the key formulas of evolutionary dynamics. If we substitute each of the three pure strategies (O, Y, B) for $s_i$, we get a system of differential equations to which we can apply standard calculus methods. When $\varepsilon \neq 0$, performing the appropriate substitution in the replicator equation results in fairly long expressions, which I will not show here. Taking $\varepsilon=0$, we get the following system:

$$\begin{eqnarray} \dot{p}_1 = (p_3 - p_2) p_1 \\ \dot{p}_2 = (p_1 - p_3) p_2 \\ \dot{p}_3 = (p_2 - p_1) p_3 \end{eqnarray}$$
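As a quick sanity check on the algebra, here is a minimal Python sketch of equation (1) for this payoff matrix. The function names (`payoff_matrix`, `replicator_rhs`) and the sample frequencies are illustrative choices of mine and are not taken from the notebooks.

```python
from fractions import Fraction as F

def payoff_matrix(eps):
    """Rock-Paper-Scissors payoffs, rows and columns ordered O, Y, B."""
    return [[eps, -1, 1],
            [1, eps, -1],
            [-1, 1, eps]]

def replicator_rhs(p, eps):
    """Right-hand side of equation (1): [pi(s_i, sigma) - pi(sigma, sigma)] * p_i."""
    A = payoff_matrix(eps)
    # Payoff of each pure strategy against the population mix p.
    pure = [sum(A[i][j] * p[j] for j in range(3)) for i in range(3)]
    # Payoff of the population strategy played against itself.
    mean = sum(pure[i] * p[i] for i in range(3))
    return [(pure[i] - mean) * p[i] for i in range(3)]

# Sample frequencies (hypothetical), kept exact with fractions.
p = [F(1, 2), F(3, 10), F(1, 5)]
p1, p2, p3 = p

# With eps = 0 the general formula should reduce to the three-equation system.
expected = [(p3 - p2) * p1, (p1 - p3) * p2, (p2 - p1) * p3]
print(replicator_rhs(p, F(0)) == expected)

# The three rates sum to zero for any eps, so p1 + p2 + p3 stays equal to 1.
print(sum(replicator_rhs(p, F(1, 5))) == 0)
```

Exact fractions are used instead of floats so that the comparison with the hand-derived system is an equality rather than an approximate match.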

Since we know that $p_1 + p_2 + p_3 = 1$, we can get rid of the third equation, after which we have:

$$\begin{eqnarray} \dot{p}_1 = (1 - p_1 - 2p_2) p_1 \label{eq-x}\tag{2a} \\ \dot{p}_2 = (2p_1 + p_2 - 1) p_2 \label{eq-y}\tag{2b} \end{eqnarray}$$

Setting the above expressions to zero and solving the system, we find four possible solutions for $(p_1, p_2)$: (0, 0), (0, 1), (1, 0) and (1/3, 1/3). These correspond to rest points (equilibria) where the speed of evolutionary change is exactly zero. However, knowing the location of rest points is not enough: we also want to know the properties of these equilibria.
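The four rest points can be checked by direct substitution, and a first look at their properties comes from the Jacobian of equations (2a) and (2b). The sketch below is my own illustration: the partial derivatives were worked out by hand, and the names `rhs`, `jacobian` and `summary` are hypothetical.

```python
from fractions import Fraction as F

def rhs(p1, p2):
    """Right-hand sides of equations (2a) and (2b)."""
    return ((1 - p1 - 2 * p2) * p1, (2 * p1 + p2 - 1) * p2)

def jacobian(p1, p2):
    """Partial derivatives of (2a) and (2b) with respect to p1 and p2."""
    return [[1 - 2 * p1 - 2 * p2, -2 * p1],
            [2 * p2, 2 * p1 + 2 * p2 - 1]]

rest_points = [(F(0), F(0)), (F(0), F(1)), (F(1), F(0)), (F(1, 3), F(1, 3))]

summary = {}
for point in rest_points:
    J = jacobian(*point)
    trace = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    summary[point] = (rhs(*point), trace, det)

for (p1, p2), (velocity, trace, det) in summary.items():
    print(f"({p1}, {p2}): velocity={tuple(map(str, velocity))}, "
          f"trace={trace}, det={det}")
```

A negative determinant indicates a saddle, which is what the three corner points give. At the interior point the trace is zero and the determinant is positive, so the linearization has purely imaginary eigenvalues and cannot settle the stability question by itself.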

#### Evolutionarily Stable Strategy

You can skip this section if you like as it will have some formal notation.

The evolutionarily stable strategy (ESS) is a refinement of Nash equilibrium based on the *asymptotic stability criterion*. As a consequence, all ESS are Nash equilibria, but not all Nash equilibria are ESS. Classical game theory is interested in situations when a player can rationally modify her strategy based on some guess about the opponent. For a two-player game, Nash equilibrium is defined as a pair of strategies, deviation from which by any single player can only result in the same or worse payoff.

By contrast, evolutionary game theory assumes the players are not rational, and it is interested in accounting for cases when a small random deviation from the equilibrium (due to genetic drift, for example) leads the population to adopt an alternative strategy (a mutant gene invasion).

Formally, for a strategy $s_i$ to be evolutionarily stable, there must be a neighborhood around it, however small, where deviations from $s_i$ always result in a worse payoff. The biological meaning of this is that when a population is following an ESS, there is a force of natural selection acting as a barrier against mutant strategies. We can visualize this force as a vector field created by the differential equations of the replicator system.
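The neighborhood condition just described can be written compactly. The following is one standard formulation, offered as a sketch in the notation of this post and not as a quotation from any of the cited sources; the symbols $\sigma^*$ (the candidate ESS) and $U$ (the neighborhood) are introduced here for illustration:

$$\begin{equation} \pi_1(\sigma^*, \sigma) > \pi_1(\sigma, \sigma) \quad \text{for all } \sigma \neq \sigma^* \text{ in some neighborhood } U \text{ of } \sigma^* \end{equation}$$

In words: against any nearby population mix $\sigma$, the candidate strategy earns strictly more than that mix earns against itself.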

There is a simple but powerful theorem that can help one find ESS:

In Appendix B of his book, Maynard Smith uses this result to prove another simple theorem which says that any game with two pure strategies always has an ESS, though the latter may not always be a pure strategy itself.

#### Phase Portraits of Replicator Dynamics

One useful fact about ESS is that in three-strategy games, it can usually be found simply by following the arrows on the vector field that plots the solutions to the replicator equations. Mathematica has a very good utility for making such plots called `StreamPlot`. I extended it slightly to make ternary diagrams for 3-strategy games (there is also a third-party tool for this but I was unhappy with its renderings so I created my own). Let's see what the corresponding diagrams for the Rock-Paper-Scissors game we analyzed above look like:

The left pane shows the case with $\varepsilon=0$ (the solution to equations $\ref{eq-x}$ and $\ref{eq-y}$), and the right one with $\varepsilon=0.2$. The colors show the speed of change in frequency of the orange morph $p_1$ (red means $p_1$ is increasing while blue means it is decreasing). It turns out there is no ESS in either diagram. If, however, $\varepsilon < 0$ (not shown), then there is an ESS at (1/3, 1/3), which is at the center of the plot. For a detailed analytical treatment of this problem, see Chapter 10 in Schecter and Gintis' book.
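The role of the sign of $\varepsilon$ can also be checked numerically without any plotting. The sketch below integrates equation (1) with a hand-written fourth-order Runge-Kutta step and tracks the product $p_1 p_2 p_3$, which is largest at the center (1/3, 1/3, 1/3) and zero on the boundary of the triangle. The step size, time horizon, starting point and function names are all arbitrary choices of mine.

```python
def replicator_rhs(p, eps):
    """Equation (1) for the Rock-Paper-Scissors matrix with eps on the diagonal."""
    A = [[eps, -1, 1], [1, eps, -1], [-1, 1, eps]]
    pure = [sum(A[i][j] * p[j] for j in range(3)) for i in range(3)]
    mean = sum(pure[i] * p[i] for i in range(3))
    return [(pure[i] - mean) * p[i] for i in range(3)]

def rk4_step(p, eps, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, scale):
        return [p[i] + scale * k[i] for i in range(3)]
    k1 = replicator_rhs(p, eps)
    k2 = replicator_rhs(shifted(k1, h / 2), eps)
    k3 = replicator_rhs(shifted(k2, h / 2), eps)
    k4 = replicator_rhs(shifted(k3, h), eps)
    return [p[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
            for i in range(3)]

def product_after(eps, p0=(0.5, 0.3, 0.2), h=0.01, steps=3000):
    """Integrate up to t = h * steps and return p1 * p2 * p3 at the end."""
    p = list(p0)
    for _ in range(steps):
        p = rk4_step(p, eps, h)
    return p[0] * p[1] * p[2]

start = 0.5 * 0.3 * 0.2
for eps in (-0.2, 0.0, 0.2):
    print(f"eps={eps:+.1f}: product goes from {start:.5f} to {product_after(eps):.5f}")
```

The reason this product is a convenient diagnostic is that $\frac{d}{dt}\ln(p_1 p_2 p_3) = \varepsilon\,(1 - 3\sum_i p_i^2)$, and the bracket is never positive. So the product should shrink for $\varepsilon > 0$ (orbits spiral outward), grow toward $1/27$ for $\varepsilon < 0$ (orbits spiral into the center), and stay constant for $\varepsilon = 0$ (closed orbits).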

Another common way to visualize population dynamics is to plot frequency $p_i(t)$ as a function of time $t$ (in the graph below, $\varepsilon = 0.1$):

If this seems vaguely similar to the Lotka-Volterra predator-prey model, it is not coincidental—the replicator equation has a connection to a generalized version of that model.

It is interesting to speculate about the biological meaning of $\varepsilon$. A positive value might mean that fights between members of the same color morph are less violent compared to those between members of different color morphs (or perhaps individuals within the same color morph derive some other benefit from being around similar individuals). This seems equivalent to a form of *selective altruism* (an idea first proposed by W.D. Hamilton, the father of gene-centrism, and later nicknamed the "green-beard effect" by Richard Dawkins). It turns out that there indeed appears to exist a form of selective altruism among blue-throated males, which recognize each other from a distance and generally avoid aggressive fighting with each other.

The ever-expanding cycles (that W.B. Yeats would probably call "the widening gyre") are likely to end in the extinction of one of the color morphs. Such extinctions have indeed happened in several places across this species' range, though it would not be accurate to blame the green-beard effect for them, since other factors, such as genetic drift or asymmetric payoffs, could also be responsible.

### Other Games

#### Evolution of Rituals and Ownership

A key game Maynard Smith analyzes in *Evolution and the Theory of Games* is the Hawk-Dove contest:

| | Hawk | Dove |
|---|---|---|
| **Hawk** | (V−C)/2 | V |
| **Dove** | 0 | V/2 |

This is probably the simplest game that can still lead to interesting results. Consider two animals competing for some resource whose value to the animal is $V$. There are two strategies the animal can adopt: Hawk or Dove (these refer to patterns of behavior, not species). Hawks always attack the competitor; Doves either do nothing or display some threatening posture (but do not attack). When two Hawks meet, they fight, and the loser suffers an injury represented by cost $C$. The expected payoff to each is $(V-C)/2$, because each Hawk wins the fight only 50% of the time. When Hawk and Dove meet, Hawk threatens to fight and gets the entire resource for himself, while Dove gets nothing (though does avoid a fight). When two Doves meet, they end up sharing the resource without incurring the cost of fighting, so the payoff to each is $V/2$.

When $V > C$, it is straightforward to find that Hawk is an ESS. But in the real world, fights are expensive. Suppose we set the cost of fighting to be twice the value of the resource ($C = 2V$). We can use Theorem 1 to show that, in this case, there is an ESS $(½H, ½D)$, i.e. the stable strategy is to play Hawk 50% of the time, and play Dove the rest of the time.
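The location of this mixed equilibrium can be worked out from an indifference condition: at an interior rest point, Hawk and Dove must earn the same payoff against the population. The derivation below is a sketch under that assumption. The symbol $p$ for the frequency of Hawk is introduced here, and the stability check itself is omitted.

$$\begin{eqnarray} \pi_1(H, \sigma) &=& p\,\frac{V-C}{2} + (1-p)\,V \\ \pi_1(D, \sigma) &=& (1-p)\,\frac{V}{2} \end{eqnarray}$$

Setting the two payoffs equal gives $p\,(V-C) + (1-p)\,V = 0$, and therefore:

$$\begin{equation} p = \frac{V}{C} \end{equation}$$

With $C = 2V$ this yields $p = 1/2$, which matches the $(½H, ½D)$ mix stated above. The formula only describes a valid frequency when $V < C$; otherwise $p \geq 1$ and pure Hawk takes over.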

Having explained the basic Hawk-Dove game, let's make things more interesting by adding a third strategy. Consider the following two payoff matrices:

| | H | D | R |
|---|---|---|---|
| **H** | -1 | 2 | -1 |
| **D** | 0 | 1 | 0.9 |
| **R** | -1 | 1.1 | 1 |

| | H | D | B |
|---|---|---|---|
| **H** | -1 | 2 | 0.5 |
| **D** | 0 | 1 | 0.5 |
| **B** | -0.5 | 1.5 | 1 |

They both start with the same 2x2 Hawk-Dove template ($V=2, C=4$) but each adds a third strategy: "Retaliator" in the first matrix and "Bourgeois" in the second. A Retaliator is someone who can fight but usually chooses not to. When Retaliator plays against Hawk, the game turns into Hawk vs Hawk: the opponent always initiates a fight, so the payoff is -1 to each. When Retaliator plays against Dove, the payoff is similar to Dove vs Dove except that sometimes Retaliator figures out that Dove is bluffing and hence has a small advantage. Finally, when Retaliator plays against another Retaliator, they perform some convincing display which causes them to act like Doves and avoid a fight. Maynard Smith introduced the Retaliator strategy to show that he could prove Price's thesis: that the threat of retaliation could be a driving force behind the evolution of many forms of ritualized behavior.

Bourgeois is similar to a mixed Hawk-Dove strategy (thus payoffs for all pairs except Bourgeois vs Bourgeois are averages of the corresponding rows or columns in the 2x2 matrix). This is because Bourgeois (who acts as an owner of the resource, hence the name) will, on average, "own" the resource only 50% of the time. When Bourgeois plays against another Bourgeois, one acts like a Dove while the other acts like a Hawk, so the payoff is 1 on average. The idea behind introducing the Bourgeois strategy is to ask whether something like "ownership" could evolve in animals. How do two Bourgeois decide who is the owner (i.e. who gets to become Dove and who gets to become Hawk)? They could do this based on some asymmetry: by evaluating each other's fighting ability, or simply based on who arrived at the resource location first.
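The averaging rule described above can be made explicit. The sketch below rebuilds the Hawk-Dove-Bourgeois matrix from $V$ and $C$; the helper names `hawk_dove` and `add_bourgeois` are hypothetical, and the 50/50 owner-intruder split is the assumption stated in the text.

```python
from fractions import Fraction as F

def hawk_dove(V, C):
    """2x2 Hawk-Dove payoffs to the row player, ordered H, D (integer V and C)."""
    return [[F(V - C, 2), F(V)],
            [F(0), F(V, 2)]]

def add_bourgeois(hd):
    """Extend a 2x2 Hawk-Dove matrix with a Bourgeois row and column.

    Bourgeois plays Hawk when owner and Dove when intruder,
    each role occurring with probability 1/2.
    """
    half = F(1, 2)
    rows = []
    for i in range(2):
        # Opponent is Bourgeois: average over the opponent playing H or D.
        rows.append(hd[i] + [half * (hd[i][0] + hd[i][1])])
    # Bourgeois as the row player: average over playing H or D ourselves.
    b_row = [half * (hd[0][j] + hd[1][j]) for j in range(2)]
    # Two Bourgeois: one plays Hawk, the other Dove, with roles equally likely.
    b_row.append(half * (hd[0][1] + hd[1][0]))
    return rows + [b_row]

for row in add_bourgeois(hawk_dove(2, 4)):
    print([float(x) for x in row])
```

With $V=2$ and $C=4$ this reproduces the second matrix above entry by entry, including the Bourgeois vs Bourgeois payoff of 1.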

Below I plot the two matrices above:

On the left, we have two ESS: Retaliator and $(½H, ½D)$. On the right, there is only one ESS: Bourgeois. Thus, a rather simple model can illustrate how ownership-like strategies can evolve given a very simple set of behavioral assumptions. The "breaking of symmetry," as Maynard Smith puts it, is what enables the Bourgeois strategy, and it is extremely common in the animal world. In his book, Maynard Smith cites several elegant experiments on butterflies which showed that even these simple insects act in a way that implies they assign ownership to sunlit patches in the forest on a first-come basis.
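These claims can be spot-checked against the two payoff matrices. The sketch below applies Maynard Smith's two ESS conditions, but only against pure-strategy invaders, so it is a necessary check and not a full proof (a complete test would also consider mixed invaders). The function names are hypothetical.

```python
from fractions import Fraction as F

# Payoff matrices from the text; rows and columns ordered H, D, then R or B.
HDR = [[-1, 2, -1],
       [0, 1, F("0.9")],
       [-1, F("1.1"), 1]]
HDB = [[-1, 2, F(1, 2)],
       [0, 1, F(1, 2)],
       [F(-1, 2), F(3, 2), 1]]

def payoff(A, x, y):
    """Expected payoff of mixed strategy x against mixed strategy y."""
    return sum(x[i] * A[i][j] * y[j] for i in range(3) for j in range(3))

def pure(i):
    """Pure strategy i written as a frequency vector."""
    return [F(int(i == j)) for j in range(3)]

def resists_pure_invaders(A, s):
    """Maynard Smith's two ESS conditions, tested against pure invaders only."""
    for j in range(3):
        m = pure(j)
        if m == s:
            continue
        incumbent, invader = payoff(A, s, s), payoff(A, m, s)
        if invader > incumbent:
            return False
        # On a tie, the incumbent must do strictly better against the invader.
        if invader == incumbent and payoff(A, s, m) <= payoff(A, m, m):
            return False
    return True

half_half = [F(1, 2), F(1, 2), F(0)]
for name, A in (("Retaliator game", HDR), ("Bourgeois game", HDB)):
    print(name,
          "| third strategy stable:", resists_pure_invaders(A, pure(2)),
          "| half Hawk, half Dove stable:", resists_pure_invaders(A, half_half))
```

In the Bourgeois game the half-and-half mix fails on the tie-break: Bourgeois earns the same 0.5 against the mix as the mix earns against itself, but then earns 1 against its own kind while the mix earns only 0.5 against Bourgeois.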

### Last Words

The lizard example is sometimes used in support of the existence of genders among animals. But it is quite obvious that these are not genders but different male strategies. The yellow males easily out-reproduce orange ones under the right conditions, which means their imitation of female rituals does not imply they want to copulate with orange males.

Though it may seem like an obvious point, it may be worth reiterating that most biological species are likely to have already converged on some ESS and hence their populations (evolutionarily speaking) are stable. In other words, the fact that other species do not exhibit the same cycles as the side-blotched lizard does not imply that evolutionary game theory does not apply to them.

It may be interesting to consider whether, when a species plays an ESS, it means that the ESS is genetically determined or developmentally acquired. There is a certain elegance to evolutionary game theory in that, from its point of view, it may not matter (at least in most cases). Harley proved a theorem (also discussed in Chapter 5 of Maynard Smith's book) that if an evolutionarily stable learning rule exists, it is necessarily a learning rule for ESS.

This concludes my first foray into evolutionary game theory.