Stability and robustness analysis of cooperation cycles driven by destructive agents in finite populations

The emergence and promotion of cooperation are two of the main issues in evolutionary game theory, as cooperation is amenable to exploitation by defectors, which take advantage of cooperative individuals at no cost, dooming them to extinction. It has been recently shown that the existence of purely destructive agents (termed jokers) acting on the common enterprises (public goods games) can induce stable limit cycles among cooperation, defection, and destruction when infinite populations are considered. These cycles allow for time lapses in which cooperators represent a relevant fraction of the population, providing a mechanism for the emergence of cooperative states in nature and human societies. Here we study analytically and through agent-based simulations the dynamics generated by jokers in finite populations for several selection rules. Cycles appear in all cases studied, thus showing that the joker dynamics generically yields a robust cyclic behavior not restricted to infinite populations. We also compute the average time in which the population consists mostly of just one strategy and compare the results with numerical simulations.


I. INTRODUCTION
Cooperation is necessary for the appearance of complex structures and higher order selection units from individual behaviors. In this way, cooperation between unicellular life forms gave rise to multicellular organisms, cooperative animals form communities and cooperation between humans gives rise to the complex societies we live in [1]. Thus, it is very important to understand the conditions that allow cooperation to thrive and evolve in nature and society. However, cooperative behaviors are not stable, as they are easily invaded by selfish individuals, who benefit from the interactions with cooperative ones but avoid paying the costs attached to cooperation. The selfish individuals, called defectors, have a higher fitness -a measure of reproductive success-than cooperators in any well mixed population, and therefore will spread under the action of natural selection, leading to the extinction of the cooperative behavior [2], and to populations where nobody benefits from altruistic acts (the "tragedy of the commons" [3]).
In the last decades the study of the Public Goods (PG) game, a mathematical metaphor of a common enterprise, in which cooperative individuals invest -pay a costand share the benefits with all the players involved in the game, has led to the discovery of some mechanisms that allow cooperation to thrive, as introducing reputation [4], diversity in number and size of groups [5], linking group size and payoffs [6], or the inclusion of spatial structure and conditional behaviors [7][8][9]. Furthermore, it has been proven that the introduction of some behavioral types in well-mixed population, as punishers [10,11] or individuals which do not participate in the PG and instead receive a fixed benefit (the so-called 'loners') [12], may promote cooperation. However, the only behavioral type found so far that allows for the emergence of stable cycles in the presence of mutations is the so called joker strategy [13]. Jokers do not take part of the benefits produced by the PG game and instead they provoke a damage to the common enterprise, thus affecting every individual involved in it. Surprisingly, the effect of these indiscriminate destructive agents on the dynamics is the induction of robust evolutionary stable limit cycles of cooperation, defection and destruction. A cyclic dynamics can also be found when loners are involved. However, jokers induce cycles that are dynamically different from those found with loners [12], because the latter are neutrally stable (i.e, have no fixed amplitudes) and disappear in the presence of mutations or structural noise.
The inclusion of destructive agents in the PG game is motivated by the observation that, in nature and society, the appearance of a risk, as those created by common enemies, predators or simply dangerous situations, may induce cooperation among the victims [14,15].
In a previous work [13], the stability of the evolutionary cycles induced by jokers was proven for infinite populations through the analysis of the replicator mutator equation. Here, we extend the study to finite populations and analyze the dynamics for different updating methods [16], i.e., different selection dynamics for the population, in order to analyze if robust cycles are also found. The conclusion we reach from this study is that the robust cycles obtained using the replicator-mutator equation are not just restricted to this particular dynamics but are a generic feature of the joker model. The paper is organized as follows. In section II we explain the PG game and review the dynamics in infinite populations. Section III provides the stochastic equations describing the evolutionary dynamics for finite populations. In section IV we analyze the joker dynamics in finite populations using different selection dynamics in order to check the existence of cycles. Section V is devoted to conclusions.

II. EVOLUTIONARY CYCLES INDUCED BY JOKERS IN PUBLIC GOOD GAMES
In each PG game a number n of individuals is randomly chosen from the entire population. Each cooperative individual contributes to the common enterprise at a cost c to itself, which yields a benefit b = rc (r > 1) equally distributed between all players; defectors free-ride the public good at no cost, thus obtaining a higher benefit than cooperators; jokers do not participate on the benefits, and each one provokes a damage −d < 0 to be shared by all individuals engaged in the PG game. Note that the cost paid by C players can be set to c = 1 without loss of generality: all other payoffs are given in units of c.
Let us call 0 ≤ m ≤ n the number of cooperative participants, 0 ≤ j ≤ n the number of jokers, n− m− j ≥ 0 the number of defectors and S = n − j the number of non-jokers, i.e., the number of individuals involved in the PG game, and that potentially benefit from the PG. Then the payoff of a defector will be Π D (m, j) = (rm − dj)/S, and that of a cooperator Π C = Π D − 1; as previously stated, in each interacting group defectors will always do better than cooperators. Jokers' payoff is always 0.
In summary, we have A simple invasion analysis [13] shows that the PG thus defined determines a tragedy of the commons [3] for r < r max = n(M − 1)/(M − n), where M is the population size. For infinite populations, the latter condition reduces to r < n, as usual in PG games. Therefore, a mixed population of cooperators and defectors with r < r max will end up composed of defectors only. The interesting question is then if jokers may prevent the extinction of cooperators and under which conditions. In Ref. [13], it was shown that, in the region of interest, namely 1 < r < r max , d > 0, the system exhibits: (a) joker-cooperator bistability for 1 + d/(M − 1) < r < 1 + (n − 1)d, (b) joker dominance for r < 1 + d/(M − 1) and, most importantly, (c) a rock-paper-scissor (RPS) cyclic dominance of the three strategies for This condition expresses the fact that a single cooperator gets a positive payoff in spite of the damage inflicted by n − 1 jokers, which allows cooperators to thrive in the damaging environment that represents a population of jokers and re-establish a cooperative environment. In Ref. [13], we analyzed just one dynamics, namely the replicator-mutation dynamics, and showed that it produces stable (robust) limit cycles C→D→J→C when mutations are rare, and stable coexistence for high mutation rates (Fig. 1). In the following we will analyze the dynamics of PG with jokers in finite populations under different update rules, check the appearance of cycles and calculate the average time spent in each homogeneous state, in order to decide which dynamics better promotes the survival of cooperation.

III. STOCHASTIC DYNAMICS IN FINITE POPULATIONS
The deterministic evolution represented by the replicator-mutator equation is an idealization of the system behavior in the limit of infinite populations. To get a deeper insight into the model we need to address the question what happens when populations have a finite size M . To begin with we need to describe the microscopic dynamics in more detail. Hauert et al. [11] have proposed a protocol in which random selections of n players are gathered together to play the game. After receiving their corresponding payoffs the group dissolves and a new one is sampled. This sampling is made a sufficient number of times so that on average each player receives a payoff proportional to the mean payoff she can obtain given the composition of the population.
Suppose there are m cooperators, j jokers, and M − m − j defectors in the population. The probability that the sampling of n individuals contains k cooperators, l jokers, and n − k − l defectors is given by the extended hypergeometric distribution ( The average payoff of strategy X within this population, P X (m, j), is obtained by averaging formulae (1) with this probability distribution. This is done in Appendix A, where explicit expressions for P C (m, j) and P D (m, j) are obtained -obviously P J (m, j) = 0 irrespective of the population composition. Once payoffs are obtained evolution proceeds by imitation. Different payoff-dependent updating rules have been proposed in the literature [16]. All of them describe a process of birth and death which is defined by the transition probability T (m ′ , j ′ |m, j) from a population with composition (m, j) to another one with compo- The simplexes describe the replicator-mutator dynamics for a population of cooperators, defectors and jokers with parameter values satisfying n > r > 1 + d(n − 1), for which a rock-paper-scissor dynamics is expected. For small mutation rates, the only equilibrium is a repeller (white dot in (a),(b)), and trajectories end up in a stable limit cycle of decreasing amplitude with increasing µ (black line); when mutations reach a critical value µc, the system undergoes a Hopf bifurcation and a stable mixed equilibrium appears (black dot in (c)). Thus the presence of jokers induces periodic bursts of cooperation for low mutation rates, and stable coexistence for high µ. Parameters: n = 5, r = 3, d = 0.4, µ is (a) 0.001, (b) 0.005, (c) 0.05. (Images generated using a modified version of the Dynamo Package [17]).
If now Π(m, j; t) denotes the probability that the population has a composition given by (m, j) at time t, then this probability evolves according to If we introduce matrix T, with elements T (m, j; m ′ , j ′ ) [(m, j) is the "row index" and (m ′ , j ′ ) the "column index"] defined as and vectors Π(t), with elements Π(m, j; t) [where (m, j), (m ′ , j ′ ) ∈ P], then Eq. (5) can be cast in matrix notation simply as A. Stationary state If the process undergoes mutations then matrix T is ergodic and Eq. (7) has got a unique stationary state, π, which is obtained by solving the linear system In the absence of mutations, though, there are three absorbing states corresponding to the three homogeneous populations. A homogeneous population remains invariant because the imitation process cannot change its composition. We will denote these vectors e C , e D , e J , the index denoting the strategy of the homogeneous population. Clearly e C (m, j) = δ m,M δ j,0 , e D (m, j) = δ m,0 δ j,0 , e J (m, j) = δ m,0 δ j,M .

B. Infinitely small mutation rate
After every imitation attempt (whether successful or not), individuals can randomly mutate their strategy. With probability 2µ the actor of the imitation event changes its current strategy into one of the other two equally likely. Parameter µ is referred to as the mutation ratio. In this section we will be concerned with mutation rates µ ≪ 1.
In the the limit µ → 0 + the stationary probability distribution must be a linear combination of the stationary vectors of the process without mutations, so in principle, taking the limit should provide the coefficients α X of this linear combination, but this limit cannot be obtained directly from Eq. (8). There is an alternative though. It has been proven [18] that the µ → 0 + limit of this process is equivalent to another process with three states, C, D, J, in which the transition probability between X and Y is equal to the probability that a single mutant of type Y invades an otherwise homogeneous population of X individuals, thus transforming it into a homogeneous population of Y individuals. Intuitively, this is tantamount to saying that mutations are so rare that the ultimate fate of a mutant is decided before the next mutation occurs. The stationary vector in this space, provides the values of the coefficients α X in (9). Following [11], let ρ YX denote the probability that a single Y mutant takes over the population made of the mutant and M − 1 individuals of type X. Then the transition probability of going from state X to a different state Y in the three-states Markov chain defined above will be r YX = ρ YX µ. Introducing R = (r YX ) so that the elements in each column add up to one (this fixes the diagonal of the matrix), we can rewrite this matrix as Vector α is then the solution of the linear system Qα = 0. A little bit of algebra leads to the result with A chosen so as to fulfill

C. Finite mutation rates
If the mutation rate is not zero the Markov chain is ergodic and the stationary state can be obtained by solving numerically Eq. (8). This is accomplished with better accuracy by splitting with T 0 the transition matrix in the absence of mutations-i.e., with transitions describing only the imitation process. Then π 1 is the solution of the linear system

D. Imitation rules
In order to specify the transition matrix T we need to describe the imitation process. Of the many different rules applied in the literature [16] we have chosen the three most commonly employed: unconditional imitation, proportional update, and a Moran process. In all cases the corresponding matrix T is obtained in Appendix B.
Under unconditional imitation two players are chosen at random among the population, one as the focal player and the other one as the model to imitate. The focal player compares both payoffs and changes her strategy to that of the model if the latter has a higher payoff. In this case, the strategy with the highest fitness never changes except by mutation, which is the only source of stochasticity in this rule.
Appendix C discusses the value α for this update rule. There are two possibilities: Proportional update is entirely similar to unconditional imitation with the exception that imitation occurs with probability proportional to the payoff difference between the model and the focal players. For this reason the values of α for this rule are the same as those for unconditional imitation.
In a Moran process a strategy is chosen to be imitated (or reproduced) with a probability proportional to its population-dependent fitness. The player who imitates (or is replaced by the offspring of) this selected player is randomly chosen from the rest of the population. The only drawback of this rule is that fitnesses must be positive for it to make sense, so they cannot be directly the payoffs of the game, because they can take negative values. A standard mapping between payoff and fitness is obtained by introducing the selection strength s [19]. This weights the contribution of the game to the total fitness of the strategy as F = 1 − s + sP , with P the average payoff. Bounding the value of s we can force F to be positive. The Moran process thus described defines a birthdeath process with two absorbing states, and the corresponding probabilities ρ XY are obtained via standard formulae (see Appendix C).

IV. RESULTS: ROBUSTNESS OF THE CYCLES USING DIFFERENT SELECTION DYNAMICS
In this section we compare the results of agent-based simulations with those obtained by solving the stationary equation (8). Simulations implement the following stochastic process. We start with a population of M individuals with equal amounts of C, D and J players. Then: 1. Assuming that every time step each individual plays many rounds of the game with different, randomly gathered groups of n players, the payoffs they obtain will be proportional to the average payoffs, as calculated in Appendix A. Thus we assume that these expressions provide the payoffs each individual gains every time step.
2. These payoffs are used to update the population according to the corresponding imitation rule. We implement the three rules described in Sec. III D.
3. With probability µ each newborn mutates to a different strategy (any of the other two with equal probability).
A quite general result is that, irrespective of the population size, at low mutation rates simulations show patterns of cyclic invasions C→D→J→C (see Fig. 2). These patterns resemble the limit cycles observed in the replicator dynamics, i.e., for infinite populations [13] (c.f. Fig. 1).
Roughly speaking we can distinguish three regimes of mutations. In the low mutation regime the system spends most of the time in homogeneous states, and the dynamics of the system is well described by the µ → 0 limit of the stationary probability distribution π. This can be clearly seen in Fig. 3. The dashed-dotted curves in Figs. 3(a), (c) and (e) represent the fraction of time spent in transients when a homogeneous population is replaced by another one arisen as the result of mutations. This fraction is very small for µ 10 −5 -10 −4 , depending on the imitation rule. For larger mutation rates (10 −5 -10 −4 µ 10 −3 -10 −2 ) the system spends as much time in homogeneous populations as in mixed transient states. This is the regime displayed in Fig. 2, where cycles are clearly defined even though for some imitation rules (particularly so for proportional update) certain homogeneous populations that are hardly ever reached [ Fig. 2(b) shows burst of cooperators which never reach a fraction higher than 80% of the population].  Figure 3(a) shows this probability as a function of the joker's inflicted damage d. As long as d > 0 and r > 1 + (n − 1)d we find each strategy equally likely. For r < 1 + (n − 1)d a homogeneous population of jokers cannot be invaded because this is the only absorbing state of the Markov chain for µ = 0. For d = 0 jokers do not inflict damage. Then the system spends most of the time in a homogeneous population of defectors. However, random drift allows for occasional invasions by jokers, who are subsequently wiped out by cooperators, who in its turn get replaced again by defectors. Figure 4(b) illustrates a typical realization exhibiting one of these turn-overs.
As of proportional update, its main difference with unconditional imitation is its being a truly probabilistic rule, in which individuals only imitate higher payoffs with a certain probability. Although in the small mutations regime this leads to the same probability of mutual invasion of strategies as for unconditional imitation, the stochastic nature of this rule renders much longer invasion times. This can be clearly appreciated in Fig. 2.
Another effect of stochasticity is that the time spent in transient states is also longer, thus shrinking the low mutations regime by more than one order of magnitude [compare Figs. 3(a) and (c)]. The effect is particularly notorious for jokers, who take a long time to invade defectors, thus extending the life time of defective populations. This effect is illustrated in Fig. 5, which represents a typical realization of an agent-based simulation.
The Moran process is the randomest of the three evolutionary dynamics because even strategies not performing very well have a chance to get imitated. The effect is more noticeable the smaller the population. This dynamics imposes an upper limit to the selection strength s (see Sec. III D) and the probabilities to find the population in each of the three homogeneous states depend on the parameters of the game and on s in a nontrivial way (see Sec. C 2). These probabilities are represented in Fig. 6(a) as a function of s. The theoretical predictions of Sec. C 2 agree with the simulations. This figure shows that cooperation is highly promoted for small s(0.005 < s < 0.05). In this limit cooperative populations are found with almost 50% probability. This probability decreases down to around 25% for larger s. Figure 6(b) shows a typical realization of this process, exhibiting a defining feature of this process, namely the frequent failures of attempted invasions.
Whichever the update rule, when mutation rates are not small the system is better characterized by providing the stationary probability distribution π, as obtained from Eq. (18)). The results are plotted in Fig. 7 for all three imitation rules and different mutation rates µ. with the cyclic behavior of the system. However, for high µ the probability peaks around a point. This point is interior for the most stochastic rules, but corresponds to a defective population for unconditional imitation. The simplexes are obtained for the same parameter values as used in Fig. 1, so a direct comparison with the results of the replicator dynamics can be made.

V. DISCUSSION AND CONCLUSIONS
In this paper we have proven that the existence of jokers, i.e., individuals whose purely destructive behavior is directed against the common enterprises represented by PG games, allows for the emergence of robust evolutionary cycles in finite populations regardless of the updating method chosen. Together with a previous report [13] on the existence of limit cycles for infinite populations evolving via a replicator-mutator dynamics, our present results show that limit cycles are a generic feature of the dynamics generated by destructive agents, not restricted to a particular selection dynamics. In fact, this is a dynamical feature that makes this model different from other three-player games like that of loners [12], for which cycles are structurally unstable and their existence strongly depends on the absence of mutations and other kinds of perturbations.
In a recent paper [20] Mobilia has shown that, of the three possible outcomes of the replicator equation for rock-paper-scissors games [2], namely (a) orbits are attracted towards an asymptotically stable mixed equilibrium, (b) orbits cycle around a neutrally stable mixed equilibrium, and (c) orbits go away from an unstable mixed equilibrium and approach the heteroclinic orbit defined by the border of the simplex (the case of the Joker game), adding mutations between the three strategies merges cases (a) and (b), both of which yielding a mixed equilibrium. Oscillations disappear in these two cases. The Loners game belongs to class (b). In contrast, the Joker case analysed in this paper belongs to class (c), with mutations generating an attractive, stable limit cycle. In this case the dynamics oscillates between the three strategies with well-defined and robust oscillations. We are not aware of any other game for which the inclusion of a simple behavioral type (jokers do not need memory, have no especial recognition abilities, and do not rely on any reputation generated along the game) leads to cycles which are robust to perturbations and have a well defined period and amplitude irrespective of the initial fractions of players.
We have expanded here these results and have proven that the oscillatory dynamics does not occur only for infinite (or very large) populations evolving under a replicator dynamics, but also in the case of finite populations and for different update rules. We have analyzed unconditional imitation, proportional update, and a Moran process. In all cases the system exhibits finite time lapses in which most of the population is composed by cooperative individuals, finding that the Moran process for low (but not extremely low) selection pressures is the most favorable to cooperation-as the system spends 50% of the time in cooperative states. Under unconditional imitation the system spends one third of the time in cooperative states, whereas the more stochastic nature of proportional update favors defection due to the slower invasion of jokers, and thus the system stays longer in defective states-especially so for high mutation rates.
Let us note that, if the damage d inflicted by jokers is zero, jokers are not able to overcome defectors and oscillations are supressed. The system ends up in a steady population where cooperation becomes extinguished, both with and without mutations [13]. Indeed, this case is identical to the loner model when the benefit obtained by loners is also zero, situation in which both jokers and loners become simply non-participants in the game with the only effect of reducing the effective number of players [12,13]. We have shown here that for finite populations and d = 0 random drift allows for bursts in which the system spends some time in fully cooperative states, but that the happening probability of such events is very low.
The existence of damaging agents, which are able to destroy the defective populations and lead to a state without cooperators and defectors, gives cooperators the chance to re-build cooperative enterprises, and thus promotes cooperation. This result, as well as modifications of the model presented here, might be interesting in the study of human evolution, where examples of destructive periods can be found along history as a result of revolutions or wars. It has been suggested that these destructive periods take place whenever a society reaches a point where the public goods fall below a certain threshold [21]. The Joker game shares this feature. Modifications of the model presented here may thus help explain not only how cooperation in animals arises whenever there is a risk or they face a predator, but also provide insights into the evolutionary cycles observed in human society.
Substituting factorization (A2) into (A1) and making use of this identity we readily obtain A new identity, namely allows us to do the sum It will prove convenient to introduce the function This allows us to write Finally, P J (n, j) = 0 because jokers get zero regardless of the composition of the population. n D 0,0 = M − j − m, etc.). On the other hand, in order to account for mutations we must define ω XX ǫ1,ǫ2 = n X ǫ1,ǫ2 (n X ǫ1,ǫ2 − 1) M (M − 1) . (B4)

Proportional update
Similarly to the previous rule, where Ψ(x) = x/Ω if x > 0 and 0 otherwise, Ω being a constant ensuring that Ψ P X ǫ1,ǫ2 − P Y ǫ1,ǫ2 ≤ 1 (typically Ω is chosen as the largest possible payoff difference). As in the previous rule ω XX ǫ1,ǫ2 is given by (B4).

Moran process
In this case payoffs are replaced by fitnesses F X ǫ1,ǫ2 = 1 − s + sP X ǫ1,ǫ2 (see Sec. III D). Let us introduce the total fitness of the population The Moran rule specifies that a player is chosen for reproduction proportional to its fitness and the offspring replaces another randomly chosen individual from the rest of the population. So if X = Y, and Appendix C: Stationary probabilities in the weak mutation limit

Unconditional imitation and proportional update
According to the payoffs obtained in Appendix A: (i) P D (m, 0) > P C (m, 0) for all 0 < m < M , so D always invades C, but C never invades D.
(ii) P C (m, M − m) > P J (m, M − m) for all 0 < m < M , provided r > 1 + (n − 1)d (the rock-paper-scissors condition), so under this assumption C always invades J, but J never invades C. (iii) P J (0, j) > P D (0, j) for all 0 < j < M , so J always invades D, but D never invades J.

Moran Process
The Moran process for a population with two strategies defines a birth-death process with two absorbing states. The details of the calculation of ρ YX can be found in [11] and follow standard formulae for this kind of processes [22]. Summarizing, if we denote P YX (m) the payoff received by a type Y individual when the population is made of m Y individuals and M − m X individuals, then where q 0 = 1 and (C6) Payoffs P XY (m) and P YX (m) are obtained from the formulae of Appendix A. The maximum value of the selection strength s is given by . (C7)