Evolutionary Games On The Lattice: Best-Response Dynamics
1. Introduction
The framework of evolutionary game theory, which describes the dynamics of populations of individuals identified with players, was initiated by the theoretical biologist Maynard Smith and first appeared in his work with Price [7]. Each individual-player is characterized by one of a finite number $n$ of possible strategies and is attributed a payoff that is calculated based on the strategies of the surrounding players and an $n \times n$ payoff matrix. The most popular model of evolutionary games is probably the so-called replicator equation reviewed in [3], a system of deterministic differential equations for the frequencies of players holding a given strategy. This paper is a sequel to the second author's work [5], continuing the analytical study of evolutionary games based on the framework of interacting particle systems which, in contrast with the replicator equation, also includes stochasticity and space in the form of local interactions.
Both authors were partially supported by NSF Grant DMS-10-05282.
AMS 2000 subject classifications: Primary 60K35, 91A22.
Keywords and phrases: Interacting particle systems, bootstrap percolation, evolutionarily stable strategy.

Model description. The version of the best-response dynamics we consider in this paper is a continuous-time Markov chain whose state at time $t$ is a spatial configuration
$$\eta_t : \mathbb{Z}^d \to \{1, 2\} := \text{the set of strategies}.$$
In words, each point of the $d$-dimensional square lattice is occupied by exactly one player who is characterized by her strategy. The spatial structure is included in the form of local interactions by assuming that each player's payoff only depends on the strategies of her $2d$ nearest neighbors. More precisely, having a two-by-two payoff matrix $A = (a_{ij})$ where $a_{ij}$ is interpreted as the payoff of a player holding strategy $i$ interacting with a player holding strategy $j$, each configuration determines the payoff that each strategy would earn at each vertex. Writing $N_j(x, \eta_t)$ for the number of neighbors of vertex $x$ following strategy $j$ and letting
$$\begin{aligned}
\phi_1(x, \eta_t) &:= a_{11} N_1(x, \eta_t) + a_{12} N_2(x, \eta_t) \\
\phi_2(x, \eta_t) &:= a_{21} N_1(x, \eta_t) + a_{22} N_2(x, \eta_t)
\end{aligned} \quad \text{for all} \quad x \in \mathbb{Z}^d \tag{1}$$
be the payoff that the player at $x$ would receive if she followed strategy 1 and 2, respectively, the best-response dynamics is formally described by the Markov generator
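$$\begin{aligned}
L f(\eta_t) = {} & \sum_x \mathbf{1}\{\phi_1(x, \eta_t) > \phi_2(x, \eta_t)\} \, [f(\eta_t^{x,1}) - f(\eta_t)] \\
& + \sum_x \mathbf{1}\{\phi_1(x, \eta_t) < \phi_2(x, \eta_t)\} \, [f(\eta_t^{x,2}) - f(\eta_t)]
\end{aligned} \tag{2}$$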
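where the configuration $\eta_t^{x,i}$ is obtained from $\eta_t$ by setting to $i$ the strategy at $x$ and leaving the strategy at the other vertices unchanged. Note that, for any given vertex $x$, the difference between the two alternative payoffs in (1) can be written as
$$\begin{aligned}
\phi_1(x, \eta_t) - \phi_2(x, \eta_t) &= (a_{11} N_1(x, \eta_t) + a_{12} N_2(x, \eta_t)) - (a_{21} N_1(x, \eta_t) + a_{22} N_2(x, \eta_t)) \\
&= (a_{11} - a_{21}) \, N_1(x, \eta_t) - (a_{22} - a_{12}) \, N_2(x, \eta_t).
\end{aligned}$$
In particular, the dynamics only depends on $a_1 := a_{11} - a_{21}$ and $a_2 := a_{22} - a_{12}$ rather than on all four coefficients of the payoff matrix, so the Markov generator (2) can be written as
$$\begin{aligned}
L f(\eta_t) = {} & \sum_x \mathbf{1}\{a_1 N_1(x, \eta_t) > a_2 N_2(x, \eta_t)\} \, [f(\eta_t^{x,1}) - f(\eta_t)] \\
& + \sum_x \mathbf{1}\{a_1 N_1(x, \eta_t) < a_2 N_2(x, \eta_t)\} \, [f(\eta_t^{x,2}) - f(\eta_t)].
\end{aligned} \tag{3}$$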
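As an informal illustration of the update rule encoded in (3), the following Python sketch simulates the dynamics on a finite n × n torus in dimension d = 2. The finite torus, the uniformly random choice of the updated site (a discrete-time stand-in for the rate one clocks, with n² updates corresponding roughly to one unit of time), and all function names are assumptions made for this sketch rather than part of the model definition.

```python
import random


def neighbor_counts(config, x, y):
    """Return (N1, N2): how many of the 4 nearest neighbors of (x, y) play 1 and 2."""
    n = len(config)
    nbrs = [config[(x + 1) % n][y], config[(x - 1) % n][y],
            config[x][(y + 1) % n], config[x][(y - 1) % n]]
    n1 = sum(1 for s in nbrs if s == 1)
    return n1, len(nbrs) - n1


def best_response_step(config, a1, a2, rng):
    """Update one uniformly chosen site by comparing a1*N1 with a2*N2."""
    n = len(config)
    x, y = rng.randrange(n), rng.randrange(n)
    n1, n2 = neighbor_counts(config, x, y)
    if a1 * n1 > a2 * n2:
        config[x][y] = 1
    elif a1 * n1 < a2 * n2:
        config[x][y] = 2
    # in case of a tie the strategy at the site is left unchanged


def simulate(n, p, a1, a2, sweeps, seed=0):
    """Start from a product measure with density p of type 1 and run sweeps*n*n updates."""
    rng = random.Random(seed)
    config = [[1 if rng.random() < p else 2 for _ in range(n)] for _ in range(n)]
    for _ in range(sweeps * n * n):
        best_response_step(config, a1, a2, rng)
    return config


def density(config):
    """Fraction of sites holding strategy 1."""
    n = len(config)
    return sum(row.count(1) for row in config) / (n * n)
```

For instance, `density(simulate(100, 0.3, 2.0, 1.0, 50))` returns the fraction of type 1 players after about 50 units of time in a selfish-selfish regime; as with any finite-lattice experiment, the outcome should be read with caution.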
Since the behavior of the system strongly depends on the signs of $a_1$ and $a_2$, it is convenient to use the terminology introduced in [4, 5] by declaring strategy $i$ to be
• altruistic when $a_i < 0$, meaning that a player with strategy $i$ confers a lower payoff to a player following the same strategy than to a player following the other strategy,
• selfish when $a_i > 0$, meaning that a player with strategy $i$ confers a higher payoff to a player following the same strategy than to a player following the other strategy.
Mean-field approximation. To understand the role of space in the long-term behavior of the best-response dynamics, the first step is to look at the deterministic nonspatial version, or mean-field approximation, of the process (3). This mean-field model is obtained under the assumption that the population is well mixed, more precisely by looking at the process on the complete graph in which any two players are neighbors and then taking the limit as the number of vertices tends to infinity. This results in a system of differential equations for the frequency of players holding strategy $i$, which we denote by $u_i$. In the absence of a spatial structure, the payoff that a player would receive if she followed strategy 1 and 2, respectively, is
$$\phi_1(u_1, u_2) = a_{11} u_1 + a_{12} u_2 \quad \text{and} \quad \phi_2(u_1, u_2) = a_{21} u_1 + a_{22} u_2$$
which can be viewed as the nonspatial analog of (1). Also, under the evolution rules of the best-response dynamics, either each type 1 player or each type 2 player changes her strategy at an exponential rate one depending on whether $\phi_1 - \phi_2$ is negative or positive, respectively. Then, rescaling time by the number of vertices and taking the limit as the number of vertices tends to infinity gives the following differential equation for the frequency of type 1 players:
$$\begin{aligned}
u_1'(t) &= u_2 \, \mathbf{1}\{\phi_1(u_1, u_2) > \phi_2(u_1, u_2)\} - u_1 \, \mathbf{1}\{\phi_1(u_1, u_2) < \phi_2(u_1, u_2)\} \\
&= u_2 \, \mathbf{1}\{a_1 u_1 > a_2 u_2\} - u_1 \, \mathbf{1}\{a_1 u_1 < a_2 u_2\}.
\end{aligned} \tag{4}$$
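To get a feel for equation (4), one can integrate it numerically. The sketch below uses a plain forward Euler scheme with $u_2 = 1 - u_1$; the step size, the time horizon and the function names are arbitrary choices made for illustration. Since the right-hand side is discontinuous along $a_1 u_1 = a_2 u_2$, that is at $u_1 = a_2 / (a_1 + a_2)$, the scheme is only meant to show the qualitative behavior on each side of that threshold.

```python
def mean_field_rhs(u1, a1, a2):
    """Right-hand side of (4) with u2 = 1 - u1."""
    u2 = 1.0 - u1
    if a1 * u1 > a2 * u2:
        return u2       # type 2 players switch to strategy 1
    if a1 * u1 < a2 * u2:
        return -u1      # type 1 players switch to strategy 2
    return 0.0          # tie: nobody has an incentive to switch


def integrate(u1_0, a1, a2, t_max=20.0, dt=1e-3):
    """Forward Euler approximation of u1(t_max) starting from u1(0) = u1_0."""
    u1 = u1_0
    for _ in range(int(t_max / dt)):
        u1 += dt * mean_field_rhs(u1, a1, a2)
        u1 = min(1.0, max(0.0, u1))  # keep the frequency inside [0, 1]
    return u1


if __name__ == "__main__":
    # selfish-selfish: the outcome depends on which side of 1/3 we start from
    print(integrate(0.2, 2.0, 1.0), integrate(0.5, 2.0, 1.0))
    # altruistic-altruistic: the frequency is driven toward a2 / (a1 + a2) = 1/2
    print(integrate(0.9, -1.0, -1.0))
```

With $a_1 = 2$ and $a_2 = 1$ the threshold is $1/3$: trajectories started below it decay toward 0 and trajectories started above it grow toward 1, while two altruistic strategies are pushed back toward the threshold from both sides.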
for all $x \in \mathbb{Z}^d$.
By symmetry, strategy 2 wins whenever strategy 1 is altruistic and strategy 2 selfish. Note in particular that the "all 1" and "all 2" configurations are not necessarily absorbing states for the process. This is due to the fact that, though the new strategy is chosen based on the strategies of the neighbors, it is not chosen from the neighborhood. Looking now at altruistic-altruistic interactions, whenever the player at $x$ and all her neighbors follow the same strategy,
$$\begin{aligned}
a_1 N_1(x, \eta_t) - a_2 N_2(x, \eta_t) &= + 2d \, a_1 < 0 \quad \text{when} \quad \eta_t(x) = 1 \\
a_1 N_1(x, \eta_t) - a_2 N_2(x, \eta_t) &= - 2d \, a_2 > 0 \quad \text{when} \quad \eta_t(x) = 2.
\end{aligned}$$
In either case, the player at $x$ changes her strategy at an exponential rate one, indicating that, as in the mean-field model, two altruistic strategies coexist in the sense that
$$\lim_{t \to \infty} P(\eta_t(x) = \eta_t(y)) < 1 \quad \text{for all} \quad x, y \in \mathbb{Z}^d, \ x \neq y.$$
We now study the process when both strategies are selfish, a case that is more challenging mathematically and also more interesting as it shows some important disagreements between the spatial and nonspatial models. To compare our results for the spatial model with the bistability displayed by its nonspatial counterpart, we consider the process starting from the product measure with
$$P(\eta_0(x) = 1) =: p \quad \text{for all} \quad x \in \mathbb{Z}^d$$
and compare the models when $p = u_1(0)$. The fact that the inclusion of space in the form of local interactions strongly affects the long-term behavior of the system can be seen in a specific parameter region using a standard coupling with the Richardson model [8]. Indeed, let
$$c(x, \eta_t) := \lim_{h \to 0} h^{-1} \, P(\eta_{t+h}(x) \neq \eta_t(x) \mid \eta_t). \tag{5}$$
Figure 1. Best-response dynamics on a 300 × 300 lattice with periodic boundary conditions starting from a product measure with density $p$ of type 1 players in black. In the left picture, the process hits an absorbing state in which both types are present, whereas in the right picture, which shows a snapshot of the process at time 25, the system is converging to the "all black" configuration: strategy 1 wins.
is evolutionarily stable for the spatial model. Returning to general selfish-selfish interactions, the numerical simulations of the two-dimensional process displayed in Figure 1 suggest that, when $a_1$ is slightly larger than $a_2$ and the initial density $p > 0$ is small, the system fixates to a configuration in which the set of type 1 players consists of a union of disjoint rectangles, indicating that strategy 1 is unable to invade strategy 2. These simulations, however, are misleading due to the finiteness of the graph, and it can be proved that, in any dimension, the most selfish strategy always wins even when starting at a low density. More precisely, we have the following theorem.
Theorem 1 Assume that $a_1 > a_2 > 0$ and $p > 0$. Then,
$$\lim_{t \to \infty} P(\eta_t(x) = 1) = 1 \quad \text{for all} \quad x \in \mathbb{Z}^d.$$
In particular, while any selfish strategy is evolutionarily stable in the nonspatial model, only the most selfish strategy is evolutionarily stable in the spatial model. The result in one dimension directly follows from our coupling with the Richardson model since
$$(2d - 1) \, a_2 = a_2 \quad \text{when} \quad d = 1$$
while the general result relies on a combination of monotonicity results and coupling arguments to compare the best-response dynamics with bootstrap percolation. More precisely, we first prove that, in the presence of selfish-selfish interactions, the best-response dynamics is attractive, which allows us to focus on the process starting from a certain reduced configuration that consists of a union of hyperrectangles. The second ingredient is to show that, for the process starting from this reduced configuration, the set of type 1 players is a pure growth process, just like the Richardson model. This strong monotonicity result is then applied repeatedly to show that the best-response dynamics properly rescaled in space stochastically dominates bootstrap percolation with parameter $d$. From this domination and a result due to Schonmann [9, Theorem 3.1], we finally deduce that, unlike what Figure 1 suggests, the most selfish strategy indeed invades the entire lattice.
2. Some monotonicity results
To avoid cumbersome notation, it is convenient to sometimes think of the state of the process as a subset rather than a function by using the identification
$$\eta_t \equiv \{x \in \mathbb{Z}^d : \eta_t(x) = 1\} \subset \mathbb{Z}^d.$$
One key ingredient is to think of the process as being constructed from a so-called Harris graphical representation [2] which, in the case of the best-response dynamics, reduces to a collection of independent Poisson processes. More precisely,
• for each $x \in \mathbb{Z}^d$, we let $(N_t(x) : t \geq 0)$ be a rate one Poisson process and
• we denote by $T_n(x) := \inf \{t : N_t(x) = n\}$ its $n$th arrival time.
The configuration at time $t := T_n(x)$ is obtained from $\eta_{t-}$ by
• adding $x$ when $a_1 N_1(x, \eta_{t-}) > a_2 N_2(x, \eta_{t-})$,
• removing $x$ when $a_1 N_1(x, \eta_{t-}) < a_2 N_2(x, \eta_{t-})$.
An argument due to Harris [2] implies that the best-response dynamics starting from any initial
configuration can indeed be constructed using this rule. The next lemma shows that, in the presence
of selfish-selfish interactions, the best-response dynamics is attractive.
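Lemma 2 The process with $a_1 > 0$ and $a_2 > 0$ is attractive:
$$P(x \in \eta_t) \leq P(x \in \xi_t) \quad \text{whenever} \quad \eta_0 \subset \xi_0.$$
Proof. For configurations $\eta_t \subset \xi_t$, since $a_1 > 0$ and $a_2 > 0$,
$$a_1 N_1(x, \eta_t) \leq a_1 N_1(x, \xi_t) \quad \text{and} \quad a_2 N_2(x, \eta_t) \geq a_2 N_2(x, \xi_t). \tag{6}$$
Let $c(x, \eta_t)$ be defined as in (5). Using (6), we obtain that, for all $x \in \eta_t$,
$$c(x, \eta_t) = \mathbf{1}\{a_1 N_1(x, \eta_t) < a_2 N_2(x, \eta_t)\} \geq \mathbf{1}\{a_1 N_1(x, \xi_t) < a_2 N_2(x, \xi_t)\} = c(x, \xi_t) \tag{7}$$
and similarly that, for all $x \notin \xi_t$,
$$c(x, \eta_t) = \mathbf{1}\{a_1 N_1(x, \eta_t) > a_2 N_2(x, \eta_t)\} \leq \mathbf{1}\{a_1 N_1(x, \xi_t) > a_2 N_2(x, \xi_t)\} = c(x, \xi_t). \tag{8}$$
The inequalities (7)–(8) show that the conditions (B14) in Liggett [6] are satisfied, which proves that, in the presence of selfish-selfish interactions, the process is attractive.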
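The graphical representation also gives a simple way to see attractiveness at work numerically. The Python sketch below builds rate one Poisson clocks on a finite n × n torus (an assumption made only to keep the example finite), applies the add/remove rule at each arrival time, and runs two initial configurations against the same clocks. All names are illustrative, and configurations are identified with the set of sites in state 1 as in the text.

```python
import random


def poisson_arrivals(rng, t_max):
    """Arrival times in [0, t_max] of a rate one Poisson process."""
    times, t = [], rng.expovariate(1.0)
    while t <= t_max:
        times.append(t)
        t += rng.expovariate(1.0)
    return times


def run_graphical(eta0, n, a1, a2, t_max, seed=0):
    """Run the dynamics up to time t_max from the set eta0 of type 1 sites.

    The clocks depend only on the seed, so two runs with the same seed
    are coupled through the same graphical representation.
    """
    rng = random.Random(seed)
    events = sorted((t, (i, j))
                    for i in range(n) for j in range(n)
                    for t in poisson_arrivals(rng, t_max))
    eta = set(eta0)
    for _, (i, j) in events:
        nbrs = [((i + 1) % n, j), ((i - 1) % n, j),
                (i, (j + 1) % n), (i, (j - 1) % n)]
        n1 = sum(1 for y in nbrs if y in eta)
        n2 = len(nbrs) - n1
        if a1 * n1 > a2 * n2:
            eta.add((i, j))        # adding x
        elif a1 * n1 < a2 * n2:
            eta.discard((i, j))    # removing x
    return eta
```

Starting two runs with the same seed from nested initial sets and positive `a1`, `a2`, the inclusion between the two configurations is preserved at the final time, which is the pathwise form of the inequality in Lemma 2.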
In addition to attractiveness, a key ingredient to prove our theorem is to replace the initial configuration $\eta_0$ with a specific reduced initial configuration $\bar\eta_0$. To define this new initial configuration, we introduce the following collection of hypercubes:
$$H_z := 2z + \{0, 1\}^d \quad \text{for all} \quad z \in \mathbb{Z}^d. \tag{9}$$
(10)
In words, while $\bar\eta_t$ represents the set of vertices following strategy 1, configuration $\Phi(\bar\eta_t)$ can be seen as the set of vertices that will become or stay of type 1 at the next update provided the configuration in their neighborhood does not change by the time of the update. Note that, due to the presence of selfish-selfish interactions, $a_1 > 0$ and $a_2 > 0$, we have
$$\begin{aligned}
\eta_t \subset \xi_t \quad &\text{implies that} \quad N_1(x, \eta_t) \leq N_1(x, \xi_t) \ \text{and} \ N_2(x, \eta_t) \geq N_2(x, \xi_t) \\
&\text{implies that} \quad a_1 N_1(x, \eta_t) - a_2 N_2(x, \eta_t) \leq a_1 N_1(x, \xi_t) - a_2 N_2(x, \xi_t) \\
&\text{implies that} \quad \Phi(\eta_t) \subset \Phi(\xi_t)
\end{aligned} \tag{11}$$
indicating that the function $\Phi$ is nondecreasing. In addition, for any configuration $\bar\eta_0$ obtained by reduction of an arbitrary initial configuration using the partition into hypercubes, since each type 1 player has at least $d$ type 1 neighbors and $a_1 > a_2 > 0$, we also have
$$\begin{aligned}
x \in \bar\eta_0 \quad &\text{implies that} \quad N_1(x, \bar\eta_0) \geq d \ \text{and} \ N_2(x, \bar\eta_0) \leq d \\
&\text{implies that} \quad a_1 N_1(x, \bar\eta_0) > a_2 N_2(x, \bar\eta_0) \\
&\text{implies that} \quad x \in \Phi(\bar\eta_0)
\end{aligned} \tag{12}$$
indicating that $\bar\eta_0 \subset \Phi(\bar\eta_0)$. Monotonicity (11) and the generalization of (12) to all times are the two main ingredients to establish the lemma, which we prove by induction. Since the lattice is infinite, the time of the first update does not exist. Therefore, in order to prove the result inductively, the next step is to use an idea of Harris [2] to break down the lattice into finite islands that do not interact with each other for a short time. More precisely, we do the following construction:
• we let $\epsilon > 0$ be small and, for each vertex $x$ such that $T_1(x) < \epsilon$, draw a line segment between $x$ and each of its $2d$ nearest neighbors.
This construction naturally induces a partition of the lattice into clusters, where two vertices belong to the same cluster if there is a sequence of line segments connecting them. In addition, since the probability of two neighbors $x \sim y$ being connected by a line segment
$$\begin{aligned}
& P(\text{there is a line segment between } x \text{ and } y) \\
& \qquad = P(\min(T_1(x), T_1(y)) < \epsilon) = 1 - e^{-2\epsilon}
\end{aligned}$$
can be made arbitrarily small by choosing time $\epsilon > 0$ small, Theorem 1.33 in [1] implies that there exists $\epsilon > 0$ small, fixed from now on, such that each cluster is almost surely finite. Letting $A$ be an arbitrary, necessarily finite, cluster, we have the following two properties:
(a) the configuration in $A$ at time $\epsilon$ only depends on the initial configuration of the process and its graphical representation restricted to the cluster $A$.
(b) whenever ($x \in A$ and $N_x \not\subset A$) or ($x \in A^c$ and $N_x \not\subset A^c$), where $N_x$ refers to the interaction neighborhood of vertex $x$, the strategy at $x$ is not updated before time $\epsilon$.
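The cluster construction just described can be reproduced on a finite n × n torus to see how the cluster sizes shrink with $\epsilon$. The Python sketch below is only an illustration: the torus, the union-find bookkeeping and the function names are choices made here, and a finite experiment of course says nothing rigorous about almost sure finiteness on the infinite lattice.

```python
import math
import random


def edge_probability(eps):
    """P(min(T1(x), T1(y)) < eps) for two independent rate one clocks."""
    return 1.0 - math.exp(-2.0 * eps)


def harris_clusters(n, eps, seed=0):
    """Return the cluster sizes (largest first) induced by the line segments."""
    rng = random.Random(seed)
    parent = {(i, j): (i, j) for i in range(n) for j in range(n)}

    def find(v):
        # union-find root lookup with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i in range(n):
        for j in range(n):
            first_arrival = rng.expovariate(1.0)   # T1(x)
            if first_arrival < eps:
                # draw a segment between x and each of its 4 nearest neighbors
                for y in [((i + 1) % n, j), ((i - 1) % n, j),
                          (i, (j + 1) % n), (i, (j - 1) % n)]:
                    root_x, root_y = find((i, j)), find(y)
                    parent[root_x] = root_y

    sizes = {}
    for v in parent:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)
```

Calling `harris_clusters(200, 0.05)[0]` gives the size of the largest cluster for a small value of $\epsilon$, which can be compared with the same quantity for larger values.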
Now, since $A$ is finite, the number of updates in $A$ up to time $\epsilon$ is almost surely finite and therefore can be ordered. Let the times of these updates and their corresponding locations be
$$s_0 := 0 < s_1 < s_2 < \cdots < s_m < \epsilon \quad \text{and} \quad x_1, x_2, \ldots, x_m \in A.$$
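if and only if $x_1 \in \Phi(\bar\eta_0)$
so
$$(\bar\eta_{s_0} \cap A) \subset (\bar\eta_{s_1} \cap A) \subset (\Phi(\bar\eta_{s_0}) \cap A). \tag{13}$$
$$(\bar\eta_{s_1} \cap A) \subset (\Phi(\bar\eta_{s_1}) \cap A). \tag{14}$$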
The last inclusion in (14) allows us to repeat the same reasoning to get (13)–(14) at the next update time, and so on up to time $s_m$. Using in addition the obvious fact that the configuration in the cluster $A$ does not change between two consecutive updates implies that the property to be proved holds at all times up to $\epsilon$, so we have
$$(\bar\eta_s \cap A) \subset (\bar\eta_t \cap A) \quad \text{and} \quad (\bar\eta_t \cap A) \subset (\Phi(\bar\eta_t) \cap A) \quad \text{for all} \quad s < t \leq \epsilon. \tag{15}$$
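This only proves the result for the process restricted to $A$ and up to time $\epsilon$. To extend the result across the lattice and for all times, we first use that the set of all the clusters forms a partition of the lattice and take the union of (15) over all the possible clusters:
$$\begin{aligned}
\bar\eta_s &= \bigcup_A \, (\bar\eta_s \cap A) \subset \bigcup_A \, (\bar\eta_t \cap A) = \bar\eta_t \quad \text{for all} \quad s < t \leq \epsilon \\
\bar\eta_\epsilon &= \bigcup_A \, (\bar\eta_\epsilon \cap A) \subset \bigcup_A \, (\Phi(\bar\eta_\epsilon) \cap A) = \Phi(\bar\eta_\epsilon).
\end{aligned} \tag{16}$$
The first inclusion proves the lemma up to time $\epsilon$ while the second inclusion can be used, together with the fact that the process is Markov, to restart the argument and extend the result inductively up to time $2\epsilon$, then $3\epsilon$, and so on. This proves the result at all times.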
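The two set-theoretic facts behind this proof, the monotonicity of $\Phi$ and the inclusion of a reduced configuration in its image by $\Phi$, are easy to check on examples. The Python sketch below works on a finite n × n torus with n even and assumes that $\Phi(\eta)$ is the set of vertices $x$ with $a_1 N_1(x, \eta) > a_2 N_2(x, \eta)$, which is how the function is described in words above; the reduction keeps the hypercubes $H_z$ that are entirely of type 1. The function names and the torus are assumptions of this sketch.

```python
import random


def neighbors(site, n):
    """The 4 nearest neighbors of a site on the n x n torus."""
    i, j = site
    return [((i + 1) % n, j), ((i - 1) % n, j),
            (i, (j + 1) % n), (i, (j - 1) % n)]


def phi(eta, n, a1, a2):
    """Assumed form of Phi: sites where a1*N1 > a2*N2 in configuration eta."""
    result = set()
    for i in range(n):
        for j in range(n):
            n1 = sum(1 for y in neighbors((i, j), n) if y in eta)
            n2 = 4 - n1
            if a1 * n1 > a2 * n2:
                result.add((i, j))
    return result


def reduce_to_hypercubes(eta, n):
    """Union of the 2 x 2 blocks H_z = 2z + {0,1}^2 fully contained in eta (n even)."""
    reduced = set()
    for zi in range(n // 2):
        for zj in range(n // 2):
            block = {(2 * zi + a, 2 * zj + b) for a in (0, 1) for b in (0, 1)}
            if block <= eta:
                reduced |= block
    return reduced


if __name__ == "__main__":
    rng = random.Random(0)
    n = 20
    eta0 = {(i, j) for i in range(n) for j in range(n) if rng.random() < 0.8}
    bar0 = reduce_to_hypercubes(eta0, n)
    print(bar0 <= phi(bar0, n, 2.0, 1.0))                      # analog of (12)
    print(phi(bar0, n, 2.0, 1.0) <= phi(eta0, n, 2.0, 1.0))    # analog of (11)
```

Both checks rely on $a_1 > a_2 > 0$; with other signs of the parameters the inclusions need not hold.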
where 0 = empty and 1 = occupied and $\chi_\infty := \lim_{t \to \infty} \chi_t$ exist.
Here, we again identify configurations with the set of vertices in state 1. From now on, we call
the two limit sets above, the infinite time limits of the sparse best-response dynamics and
bootstrap percolation, respectively. To prove the theorem, we first rely on the monotonicity results
of the previous section to show that the infinite time limit of the sparse best-response dynamics
properly rescaled in space dominates its counterpart for bootstrap percolation. The main ingredient
is to couple both systems using the key function $\Phi$ introduced in (10). Based on this coupling, we
can directly deduce the theorem from its analog for bootstrap percolation on the infinite lattice
starting from a product measure, a result due to Schonmann [9, Theorem 3.1].
Lemma 4 Assume that $a_1 > a_2 > 0$. Then,
$$\Phi^n(\bar\eta_s) \subset \bar\eta_\infty \quad \text{for all} \quad s > 0 \ \text{and} \ n \geq 0. \tag{17}$$
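since $x \in \Phi(\Phi^n(\bar\eta_s)) \setminus \Phi^n(\bar\eta_s)$
(18)
(19)
for all $t > \tau_x$. Combining (18)–(19) and using that $a_1 > 0$ and $a_2 > 0$, we get
$$a_1 N_1(x, \bar\eta_t) \geq a_1 N_1(x, \Phi^n(\bar\eta_s)) > a_2 N_2(x, \Phi^n(\bar\eta_s)) \geq a_2 N_2(x, \bar\eta_t) \quad \text{for all} \quad t > \tau_x.$$
It follows that, given that the player at vertex $x$ follows strategy 2 after time $\tau_x$, she switches to strategy 1 at rate one. This together with (17) implies that
$$T_x = \inf \{t > 0 : x \in \bar\eta_t\} < \infty \ \text{a.s.} \quad \text{therefore} \quad x \in \bar\eta_\infty. \tag{20}$$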
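Finally, using consecutively (11) and (16) and then (20), we deduce that
$$\begin{aligned}
& \Phi^n(\bar\eta_s) \subset \Phi(\Phi^n(\bar\eta_s)) = \Phi^{n+1}(\bar\eta_s) \\
\text{and} \quad & \Phi^{n+1}(\bar\eta_s) = (\Phi^{n+1}(\bar\eta_s) \setminus \Phi^n(\bar\eta_s)) \cup \Phi^n(\bar\eta_s) \subset \bar\eta_\infty
\end{aligned}$$
which shows the result at step $n + 1$ and completes the proof.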
We are now ready to prove that the infinite time limit of the best-response dynamics properly rescaled in space dominates the infinite time limit of bootstrap percolation. More precisely, we look at the best-response dynamics viewed at the hypercube level by introducing
$$\zeta_t : \mathbb{Z}^d \to \{0, 1\} \quad \text{where} \quad \zeta_t(z) := \mathbf{1}\{H_z \subset \bar\eta_t\} \quad \text{for all} \quad z \in \mathbb{Z}^d. \tag{21}$$
From now on, we call this process the hypercubic best-response dynamics. Identifying once more configurations with the set of vertices in state 1 and using again the monotonicity of the sparse best-response dynamics given by Lemma 3, we note that
$$\begin{aligned}
\zeta_\infty := \lim_{t \to \infty} \zeta_t &= \lim_{t \to \infty} \{z : H_z \subset \bar\eta_t\} \\
&= \{z : H_z \subset \lim_{t \to \infty} \bar\eta_t\} = \{z : H_z \subset \bar\eta_\infty\}
\end{aligned}$$
therefore the infinite time limit is well-defined.
Lemma 5 Assume that $a_1 > a_2 > 0$ and $m = d$. Then,
$\zeta_0 = \chi_0$.
and
$$\operatorname{card} \{w \sim z : \zeta_s(w) = 1\} \geq m = d. \tag{22}$$
Recalling (21), this indicates that there are at least m = d hypercubes adjacent to Hz that are
completely occupied by players of type 1. Invoking the invariance by symmetry of the best-response
dynamics, we may assume without loss of generality that
$$H_{z - e_j} \subset \bar\eta_s \quad \text{for} \quad j = 1, 2, \ldots, d \tag{23}$$
(24)
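Combining (23)–(24) together with Lemma 3 and some basic geometry, we get
$$2z + \{x \in \{0, 1\}^d : \textstyle\sum_{j = 1, 2, \ldots, d} x_j < n\} \subset \Phi^n(\bar\eta_s) \quad \text{for} \quad n = 1, 2, \ldots, d + 1. \tag{25}$$
For an illustration in three dimensions, we refer to Figure 2 where configuration $\bar\eta_s$ consists of the union of three hypercubes. In particular, taking $n = d + 1$ gives
$$H_z = 2z + \{x \in \{0, 1\}^d : \textstyle\sum_{j = 1, 2, \ldots, d} x_j \leq d\} \subset \Phi^{d+1}(\bar\eta_s).$$
Applying Lemma 4, we then obtain
$$H_z \subset \Phi^{d+1}(\bar\eta_s) \subset \bar\eta_\infty. \tag{26}$$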
In addition, since the hypercubic process clearly inherits the monotonicity property of the sparse best-response dynamics given by Lemma 3,
$$\zeta_s(z) = 1 \quad \text{implies that} \quad \zeta_t(z) = 1 \quad \text{for all} \quad t > s. \tag{27}$$
In summary, (27) and the fact that (22) implies (26) indicate that, for the hypercubic process, once a vertex is occupied it remains occupied forever, and if an empty vertex has at least $d$ occupied neighbors then it becomes occupied after an almost surely finite time. Recalling the evolution rules of bootstrap percolation with parameter $m = d$, the result follows.
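The evolution rules recalled in this summary (occupied vertices stay occupied, and an empty vertex with at least $m$ occupied neighbors becomes occupied) can be iterated to a fixed point on a finite grid. The Python sketch below does so on an n × n torus; since the rule is monotone, the final set does not depend on the order in which vertices are examined, so a simple repeated scan is enough. The torus and the function name are assumptions made for illustration.

```python
def bootstrap_closure(occupied, n, m):
    """Final set of occupied sites of bootstrap percolation with parameter m
    on the n x n torus, starting from the set `occupied`."""
    occ = set(occupied)
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for j in range(n):
                if (i, j) in occ:
                    continue  # occupied sites remain occupied forever
                nbrs = [((i + 1) % n, j), ((i - 1) % n, j),
                        (i, (j + 1) % n), (i, (j - 1) % n)]
                if sum(1 for y in nbrs if y in occ) >= m:
                    occ.add((i, j))
                    changed = True
    return occ


if __name__ == "__main__":
    n = 6
    diagonal = {(i, i) for i in range(n)}
    # with m = d = 2, an occupied diagonal ends up filling the whole torus
    print(len(bootstrap_closure(diagonal, n, 2)) == n * n)
    # a single occupied site cannot grow when m = 2
    print(bootstrap_closure({(0, 0)}, n, 2) == {(0, 0)})
```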
Combining the previous lemma with a result of Schonmann [9, Theorem 3.1] on bootstrap percolation on the infinite lattice, we now deduce the theorem.
Lemma 6 Assume that $a_1 > a_2 > 0$ and $p > 0$. Then,
$$\lim_{t \to \infty} P(\eta_t(x) = 1) = 1 \quad \text{for all} \quad x \in \mathbb{Z}^d.$$
Proof. To begin with, we consider bootstrap percolation with parameter $m$ starting from the product measure with density $q$. That is, the initial configuration satisfies
$$P(\chi_0(z_1) = \chi_0(z_2) = \cdots = \chi_0(z_n) = 1) = q^n \quad \text{for} \quad z_1, z_2, \ldots, z_n \in \mathbb{Z}^d \ \text{all distinct}.$$
Whether the set of occupied vertices ultimately covers the entire lattice depends on the initial density, and the fact that bootstrap percolation is clearly attractive motivates the introduction of the following critical value for the initial density:
$$q_c := \inf \{q \in [0, 1] : P(\chi_\infty = \mathbb{Z}^d) = 1\}.$$
$m = d$ and $q > 0$.
$$P(\zeta_0(z) = 1) = P(\bar\eta_0(x) = 1 \ \text{for all} \ x \in H_z) = p^{2^d} = q = P(\chi_0(z) = 1),$$
Acknowledgment. The authors would like to thank Rick Durrett and an anonymous referee who
independently underlined the connection between our model and bootstrap percolation, which has
considerably strengthened our results, as well as another referee for useful comments.
References
[1] Grimmett, G. R. (1989). Percolation. Springer, Berlin.
[2] Harris, T. E. (1972). Nearest neighbor Markov interaction processes on multidimensional lattices. Adv. Math. 9 66–89.
[3] Hofbauer, J. and Sigmund, K. (1998). Evolutionary games and population dynamics. Cambridge University Press, Cambridge.
[4] Lanchier, N. (2013). Stochastic spatial models of producer-consumer systems on the lattice. Adv. Appl. Probab. 45 1157–1181.
[5] Lanchier, N. (2014). Evolutionary games on the lattice: payoffs affecting birth and death rates. To appear in Ann. Appl. Probab. Available as arXiv:1302.0069.
[6] Liggett, T. M. (1999). Stochastic interacting systems: contact, voter and exclusion processes, volume 324 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin.
[7] Maynard Smith, J. and Price, G. R. (1973). The logic of animal conflict. Nature 246 15–18.
[8] Richardson, D. (1973). Random growth in a tessellation. Proc. Cambridge Philos. Soc. 74 515–528.
[9] Schonmann, R. (1992). On the behavior of some cellular automata related to bootstrap percolation. Ann. Probab. 20 174–193.
School of Mathematical and Statistical Sciences
Arizona State University
Tempe, AZ 85287, USA.