Journal of Theoretical Biology: or Givan, Nehemia Schwartz, Assaf Cygelberg, Lewi Stone


Journal of Theoretical Biology 288 (2011) 21–28


Predicting epidemic thresholds on complex networks: Limitations of mean-field approaches
Or Givan, Nehemia Schwartz, Assaf Cygelberg, Lewi Stone*
Biomathematics Unit, Faculty of Life Sciences, Tel Aviv University, Israel

Article info

Article history:
Received 1 September 2010
Received in revised form 30 June 2011
Accepted 22 July 2011
Available online 7 August 2011

Keywords:
SIS model
Markov chain
Poisson process
Eigenvalue

Abstract

Over the last decade considerable research effort has been invested in an attempt to understand the dynamics of viruses as they spread through complex networks, be they the networks in human population, computers or otherwise. The efforts have contributed to an understanding of epidemic behavior in random networks, but were generally unable to accommodate specific nonrandom features of the network's actual topology. Recently, though still in the context of the mean field theory, Chakrabarti et al. (2008) proposed a model that intended to take into account the graph's specific topology and solve a longstanding problem regarding epidemic thresholds in both random and nonrandom networks. Here we review previous theoretical work dealing with this problem (usually based on mean field approximations) and show with several relevant and concrete counter examples that results to date break down for nonrandom topologies.

© 2011 Elsevier Ltd. All rights reserved.

* Corresponding author. E-mail address: lewi@post.tau.ac.il (L. Stone).

0022-5193/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jtbi.2011.07.015

1. Introduction

Under what conditions will a virus or other infectious agent spread in a complex population network? This question has vexed epidemiologists, mathematicians and computer scientists alike for many decades (Anderson and May, 1991; May and Lloyd, 2001; Pastor-Satorras and Vespignani, 2001; Madar et al., 2004; Aparacio and Pascual, 2007; Berchenko et al., 2009). An early result arising from epidemic modeling is based on the so-called reproductive number $R_0$, the number of secondary infections a typical infected individual is able to generate (Anderson and May, 1991). If a typical infected individual is able to infect on average more than one other member of the population then $R_0 > 1$. In that case the virus is able to reproduce itself and trigger an epidemic in the population, allowing it to persist in time for an extensive period. In contrast, if the reproductive number is below unity then $R_0 < 1$, and the disease will rapidly die out in the population and an infection free equilibrium will be reached. This threshold result assumes that the population is homogeneous and randomly mixing, whereby an infected individual is equally likely to come into contact and infect any susceptible present, an assumption that has many limitations.

This result has been generalized to heterogeneous populations in which some individuals have more contacts than others. Historically, most notable are the studies of Dietz (1980) and May and Anderson (1988, 1984). For heterogeneous networks, the relative fraction of nodes having different degrees is referred to as the degree distribution. It is just the probability $P_k$ that a randomly chosen node in the network has degree $k$, i.e., $P_k = \mathrm{prob}(\deg(\text{node } i) = k)$. The mean degree of the network is $\langle k \rangle = \sum_k k P_k$ and the variance of the degree distribution is $\mathrm{var}(k) = \sigma^2 = \langle k^2 \rangle - \langle k \rangle^2$. Consider a large population and let $d_k$ be the number of individuals in the population that have $k$ contacts, with $\sum_k d_k = L$. Then $d_k / L$ estimates the degree distribution $P_k$.

The degree of heterogeneity in the population's contact structure may be gauged by the Coefficient of Variation $CV = \sigma / \langle k \rangle$. Equivalently one can write $CV^2 = \langle k^2 \rangle / \langle k \rangle^2 - 1$. The population is assumed to be randomly mixed, subject to the constraint that the degree distribution is always preserved. For such a heterogeneous population, it has been shown that (Anderson and May, 1991; Dietz, 1980; May and Anderson, 1988; May and Anderson, 1984):

$$R_0 = R\,(1 + CV^2) \tag{1}$$

where $R$ is the reproductive number for the equivalent homogeneous population where all individuals have $\langle k \rangle$ contacts and thus $CV = 0$.

Eq. (1) will be referred to as the Dietz–May formula since both authors (May and Anderson, 1988; May and Anderson, 1984) were responsible for developing the formulation and applying it in practice. As before $R_0 > 1$ implies that an epidemic will ensue, while $R_0 < 1$ implies that the virus rapidly dies out and an infection free equilibrium is attained. These concepts have proved useful for studying contact networks with power-law distributions that

might typify computer networks and in some cases be appropriate for studying the transmission of sexually transmitted diseases (Pastor-Satorras and Vespignani, 2001; Lloyd and May, 2001).

Measuring $R_0$ is often quite complicated, especially in complex networks, as Aparacio and Pascual (2007) have discussed. Newman (2002) and Cohen et al. (2002) have shown how percolation ideas and generating function methods can be used to provide exact solutions of epidemic models on simple networks and on bipartite graphs. Their key epidemic threshold result is nevertheless the same as Eq. (1) obtained by the different methods.

Considerable work has been invested in exploring these issues in the biology, physics and mathematics literature. Concepts taken from percolation theory continue to play a major role in current epidemic network research (Madar et al., 2004; Cohen et al., 2002; Kenah and Robins, 2007; Parshani et al., 2010). Moreover, there are many challenging open problems (see the interesting inaugural article of Durrett, 2010).

In recent years, there has been considerable interest in understanding the way in which the detailed network structure of the population, or its "topology," might affect the persistence threshold. That is, does the exact network structure, not just its degree distribution, give extra information from which it is possible to learn more about the spread of an epidemic? Since many real networks are non-random and sometimes highly clustered, the motivation to explore beyond random models is quite justified. Chakrabarti, Wang, Faloutsos (Chakrabarti et al., 2008) introduced a new model, referred to here as the CWF model, which intended to identify exactly how a population's network structure controls the epidemic threshold. A very general epidemic threshold condition for any arbitrary network was derived. This condition is based on the network's topology and on a mean field approximation, as will be elaborated shortly.

In this paper we show that many of the previous studies contribute to our understanding of epidemic thresholds for random networks; however, for nonrandom network topologies (even regular graphs) accurate predictions of the epidemic threshold are hard to come by. We explore the mean field approximation formulated by CWF and show that its predictions often break down for nonrandom networks. This is because mean-field approximations fail to take into account the correlations in the state of indirect neighbors. Moreover, by mapping one model to another, we are able to retrieve known theoretical literature results (based on percolation theory) that contradict the CWF general threshold condition.

2. The CWF model

CWF (Chakrabarti et al., 2008) assume that the population is divided into two classes: individuals that are Susceptible (S) and individuals that are Infected (I). The model has the classical SIS structure whereby susceptible individuals may become infected upon contact with an infected individual. After contracting the disease an individual recovers after some fixed time period and becomes susceptible once again, thereby closing the SIS loop.

As each individual can be in one of the two states, for a complex network of $N$ individuals, there are $2^N$ possible different states the population may be found in. It is appropriate to formulate the model in terms of a Markov chain, but this requires information specifying the transition probabilities between each of the possible states. In this formulation states correspond to particular configurations of the population network, with the configuration at each time step dependent on the former time step only. However, it is not a simple matter to determine the probabilities of the $2^N \times 2^N$ transition matrix, which in any case is impractical to work with even when $N$ is modestly large, let alone of the order of millions of individuals as is appropriate for large cities. As such, CWF developed a method to approximate the Markov chain model.

In more detail they consider a population network, and define an individual's neighbors as all members of the population he or she can directly contact and transmit the disease. Set $\beta$ as the probability that an infected individual/node will infect a susceptible neighboring node in the network, and let the probability that node-$i$ is infected at time $t$ be given by $p_{i,t}$. Over one time-step, the probability that node-$i$ will not receive any infections from its neighbors is, according to CWF, given by

$$z_{i,t} \equiv \prod_{j \in M_i} \left[ (1-\beta)\, p_{j,t-1} + (1 - p_{j,t-1}) \right] = \prod_{j \in M_i} \left[ 1 - \beta p_{j,t-1} \right] \tag{2}$$

where $M_i$ is the set of all neighbors of node-$i$. Note Eq. (2) is exact only when it is assumed that the nodes $p_{j,t-1}$ are independent of each other. This "independence assumption" is of great importance and will be dealt with in detail in what follows. Thus, according to Chakrabarti et al. (2008) the probability that node-$i$ is healthy at time $t$ is given by

$$1 - p_{i,t} = (1 - p_{i,t-1})\, z_{i,t} + \delta\, p_{i,t-1}\, z_{i,t} \tag{3}$$

where $\delta$ is the probability that an infected node will recover at time-step $t$. Note that since recovery is geometrically distributed, the mean infection time is $1/\delta$. This last equation states that node-$i$ is healthy at time $t$ if it did not receive infections from its neighbors at $t$ and either node-$i$ was uninfected at time step $t-1$, or was infected at $t-1$ but was cured at $t$ (Chakrabarti et al., 2008). (This last term in Eq. (3), which appears in the CWF model (Chakrabarti et al., 2008), is problematical as we explain in the discussion. It can however be dropped without affecting the results of the following stability analysis.) Combining Eqs. (2) and (3) yields the CWF model

$$p_{i,t} = 1 - \left[ 1 + (\delta - 1)\, p_{i,t-1} \right] \prod_{j \in M_i} \left( 1 - \beta p_{j,t-1} \right) \tag{4}$$

It is clear that the model has an infection free equilibrium in which $p_i^* = 0 \ \forall i$. (Here, the star notation indicates a state of equilibrium.) We now proceed to examine this equilibrium's local stability. Using vector notation, close to the equilibrium, Eq. (4) may be approximated as

$$\vec{p}_t = \left( (1-\delta) I + \beta A \right) \cdot \vec{p}_{t-1} \tag{5}$$

where $I$ is the identity matrix and $A$ is the adjacency matrix of binary entries 1, 0 representing the connectivity between the nodes. Thus, the infection free equilibrium ($p_i^* = 0 \ \forall i$) is locally stable only if

$$(1 - \delta) + \beta \rho < 1 \tag{6}$$

where $\rho$ is the spectral radius of the matrix $A$. This is because the Perron–Frobenius theorem ensures that if $A$ is a nonnegative, irreducible matrix then one of its eigenvalues is real, positive and greater than or equal to (in absolute value) all other eigenvalues (Horn and Johnson, 1985). This eigenvalue is the spectral radius $\rho$.

In terms of the reproductive number, the infection free equilibrium is locally stable if

$$R_0 = \frac{\beta}{\delta}\, \rho < 1 \tag{9}$$

The reproductive number $R_0$ has a simple interpretation. Returning to Eq. (6) we see that if $\vec{p}_0$ is an eigenvector corresponding to eigenvalue $\rho$ of $A$, the expected number of newly infected individuals in the next generation $\vec{p}_1$ is given by $\beta\rho$, while the expected number of recovered individuals is $\delta$. Since the mean infectivity time is $1/\delta$, then $(\beta/\delta)\rho$ should be interpreted as the total number of new infections generated in a single time step multiplied by the actual infectivity period of the disease.

Hence $R_0$ is simply the mean number of secondary infections over the infectious period of the disease.

This conforms closely with the conventional view of $R_0$ as the number of secondary cases that one infected case can produce when placed in a wholly susceptible population. If it can infect more than one individual on average ($R_0 > 1$) an epidemic will ensue; otherwise the infection will rapidly die out as the infection free equilibrium is reached. In what follows, Eq. (9) will be referred to as the CWF threshold criterion, since stability of the infection free equilibrium depends on whether $R_0$ is greater or less than unity. In this way $R_0$, as given by Eq. (9), may be used as a reference frame for testing the CWF threshold.

It should be pointed out that the above analysis concerns the underlying deterministic mean-field model presented by CWF, and this raises two issues. First, for the full stochastic model, which the mean-field is supposed to mimic, one has to take into account the stochastic effects. In particular, if $R_0 > 1$ then demographic stochasticity at the initiation of an epidemic, when infectives are in small numbers, can prevent the epidemic from triggering. This is the "stochastic epidemic theorem" (Renshaw, 1991): even though $R_0 > 1$ there is a finite probability that the epidemic will not trigger. However, if $R_0 < 1$ a major epidemic cannot occur.

3. Known epidemic thresholds for random networks

We first consider random networks making use of the results from Furedi and Komlos (1981). The latter authors studied random, symmetric, $N \times N$ matrices in which the elements $a_{ij}$ are identically distributed having the same mean $\mu$ and variance $\sigma^2$. For such matrices the largest eigenvalue may be approximated by

$$\rho = \frac{\sum_{i,j} a_{i,j}}{N} + \frac{\sigma^2}{\mu} + O\!\left( \frac{1}{\sqrt{N}} \right) \tag{10}$$

Consider then Erdős–Rényi networks, which comprise $N$ nodes with a probability $p$ of having an edge between any pair of nodes. Thus, $\langle a_{ij} \rangle = \mu = p$ and $\mathrm{var}(a_{ij}) = \sigma^2 = p(1-p)$. Therefore Eq. (10) may be rewritten as

$$\rho = \frac{\sum_{i,j} a_{i,j}}{N} + \frac{\sigma^2}{\mu} + O\!\left( \frac{1}{\sqrt{N}} \right) \approx Np + \frac{p(1-p)}{p} = (N-1)p + 1 = \langle k \rangle + 1 - p$$

for large $N$.

Hence, for an Erdős–Rényi network, the CWF threshold is based on $R_0 = (\beta/\delta)(\langle k \rangle + 1 - p)$. This coincides with the work of Dietz and May who, as we saw, argue that

$$R_0 = R\,(1 + CV^2) = \frac{\beta N p}{\delta} \left( 1 + \frac{Np(1-p)}{N^2 p^2} \right) = \frac{\beta}{\delta} \left( (N-1)p + 1 \right) = \frac{\beta}{\delta} \left( \langle k \rangle + 1 - p \right)$$

It is of interest to examine regular random graphs in which each node has the same fixed number of edges $k$, but the edges are connected randomly between nodes. A simple calculation shows that the spectral radius of the adjacency matrix associated with any regular graph, random or otherwise, must be $\rho = k$ (Restrepo et al., 2007). Thus the CWF threshold for a regular random network occurs at the point where $R_0 = (\beta/\delta)k$ is unity. This threshold is in agreement with the Dietz–May formula (taking $CV = 0$) and was also deduced by Kephart and White (1991).

Results are also available for the more general case of random networks having arbitrary degree distribution $d_k$. Chung et al. (2003) have shown that the spectral radius of the adjacency matrix associated with these networks is given by

$$\rho = \frac{\langle k^2 \rangle}{\langle k \rangle} = \langle k \rangle (1 + CV^2)$$

Thus the threshold condition for local stability as given by Eq. (9) becomes

$$R\,(1 + CV^2) < 1$$

which is the Dietz–May formula given in Eq. (1).

4. Simulations of random networks:

We tested the above theoretical results by numerically simulating the spread of epidemics on Erdős–Rényi networks ($N = 50{,}000$) and Regular Random graphs ($N = 100{,}000$). For each network studied, 1% of the nodes were randomly chosen and initially infected. Simulation then proceeded in steps of unit time increments. During each time step, an infected node was able to infect each of its neighbors with probability $\beta$. In addition, every infected node recovered with probability $\delta$. In the case of $\delta = 1$, infected nodes recovered in exactly one time-step. An infection attempt on an already infected node had no effect; however if a node recovers, it can be infected by its neighbors within the same time step (as simulated by CWF in Chakrabarti et al., 2008, and as will be further discussed in the discussion).

Simulations were run for 50,000 time steps and were repeated 100 times with different initial conditions, for different values of $R_0 = (\beta/\delta)\rho$.

Fig. 1 plots the proportion of infected nodes (i.e., the number of nodes infected divided by the total population $N$) at equilibrium as a function of $R_0$. One sees the presence of an epidemic threshold at $R_0 = 0.99$, while the CWF prediction is $R_0 = 1$ (see Eq. (9) above). The figure makes clear that the CWF threshold formula holds for both random networks and regular random networks, although the result has been known for decades in this context as given by the Dietz–May formula. We thus understand that the true importance of the CWF threshold formula concerns nonrandom graphs as treated in detail below.

Fig. 1. (a) Random ER graph with average connectivity degree $\langle d \rangle = 4$. (b) Regular random graph with fixed connectivity degree $\langle d \rangle = 6$. The proportion of infected nodes at equilibrium is plotted for various values of $R_0 = (\beta/\delta)\rho$ where $\delta = 1$ is fixed and $R_0$ is determined by $\beta$. The CWF threshold prediction for the topology of the graph is simply $R_0 = 1$ in each figure. The simulated thresholds are in good agreement with the predictions.

5. Nonrandom graphs

5.1. One-dimensional chain

Consider regular graphs in which each node has exactly two neighbors. This forms a topology often referred to as a "one-dimensional chain" whereby each node-$i$ is connected to node $i-1$ on the left and node $i+1$ on the right, for $i = 1 \ldots N$ (Fig. 2a).

For the particular case of a one-dimensional chain we show that it is possible to theoretically determine the threshold via percolation theory. To achieve this we first have to show that the propagation of a virus in an infinite one-dimensional chain, where

the probability to recover in a time step is $\delta = 1$, is analogous to directed bond percolation in an infinite 2D directed square lattice (Domany and Kinzel, 1981; Durrett, 1984). A directed square lattice is similar to a square lattice but differs in the fact that edges (bonds) always point in the positive direction of the axes as shown in Fig. 2b.

Fig. 2. (a) One-dimensional chain. (b) 2D directed lattice. (c) Percolation on a directed lattice. (d) The proportion of infected nodes after 50,000 time steps plotted versus $R_0 = (\beta/\delta)\rho$. Simulations were checked for both $\delta = 1$ and $\delta = 0.8$ and the value of $R_0$ is determined by $\beta$. Graph size is $N = 15{,}000$. The CWF threshold prediction for the topology of the graph is $R_0 = 1$. The simulated threshold is $R_0 \approx 1.29$ for $\delta = 1$, as predicted in text.

Our simulations for the 1D chain epidemic model assume $\delta = 1$, so any infected node will recover after one time step. If we begin simulation by infecting the node O at the origin, then in the next time step the virus can exist only at the origin's neighbors. There is no possibility for the virus to exist at node O in the next time step. Observing Fig. 2b and c, one can define a time axis by $t = (e_1 + e_2)/\sqrt{2}$ where $e_1$ and $e_2$ are unit vectors pointing in the positive directions on the axes of the lattice. Moreover, a horizontal line can be defined by the coordinates $m_1 e_1 + m_2 e_2$ where $m_1 + m_2 = M$, $M$ being the line index. Each horizontal line will differ from its neighbors by integer units of $t$, i.e., it is assigned to a new time step. Notice that at $t = 0$ only the origin exists while at $t = 1$ only its neighbors exist and the coordinate of the origin node is vacant. Thus, one can deduce that the structure of the directed 2D lattice is adequate to describe the virus migration in a 1D chain for $\delta = 1$. We now show that the behavior of the epidemic on a 1D chain with $\delta = 1$ and a given $\beta$ is analogous to a specific bond percolation on a 2D directed lattice.

In general, in bond percolation on a graph, all the bonds in the graph are processed in the following way. A bond will be kept with probability $p$ or will be removed with probability $1 - p$. Fig. 2c shows a directed square lattice after bond percolation. In the directed bond percolation model, a typical site O on a square lattice is chosen to be the origin of percolation, from which percolation can extend in a directed manner. If the base of the directed pyramid space can be reached from the origin following the bonds in a directed manner, percolation is said to occur.

In Fig. 2c the bond connecting origin O and node A was removed while the bond connecting the origin with node B was retained. This procedure is analogous to an infected origin that in the next time step will infect one neighbor and not the other. Indeed, when we set $p$ to have the value of $\beta$ the analogy is complete and the question whether a percolation exists is analogous to the question whether the epidemic is still alive. At a critical threshold $p_c$, long-range connectivity first appears; this is referred to as the percolation threshold, as demonstrated in Fig. 2c. Here the base of the directed pyramid space can be reached from the origin following directed bonds, indicating that a chain of infection has percolated through the whole network.

The known threshold for 2D directed bond percolation is $p_c = 0.6447$ (Essam et al., 1986; Jensen, 1996). Moreover, the spectral radius of the adjacency matrix of a one-dimensional chain is $\rho = 2$ (since it is a regular graph where every node has 2 neighbors). Therefore we conclude that the epidemic threshold is $\beta_c = 0.6447$ and $R_0 = 2(\beta/\delta) = 2 \times 0.6447 \approx 1.29$. This known result from the literature is confirmed by the simulation results in Fig. 2d.

Since the epidemic threshold for the CWF model is $R_0 = 2(\beta/\delta) = 1$, Fig. 2d shows that the simulation results are in contradiction with the CWF prediction with some 29–34% deviation. Hence we show both theoretically ($\delta = 1$) and via simulations ($\delta = 0.8$ and 1) that the mean field based CWF threshold is not appropriate for the 1-D chain. To show that the problem exists for other values of $\delta$, and not just for the special case of $\delta = 1$, Fig. 2d gives simulation results for $\delta = 0.8$ that also contradict the CWF prediction.

5.2. Regular graphs (d > 2):

Examine now regular graphs, where it is assumed that each node in the graph has the same degree $k$ ($k > 2$). Here we consider regular graphs where each node is connected to its nearest neighbors and should thus be considered highly clustered regular lattices rather than random networks. We simulated the spread of a disease for regular graphs having degree $k = 6$ and $N = 15{,}000$ nodes for periods of 50,000 time steps. As shown in Fig. 3b the infection free equilibrium threshold was found at $R_0 = 1.23$ and deviated from the CWF predicted threshold ($R_0 = 1$) by some 23%. Hence again, the CWF prediction does not match the results obtained for simple regular graphs.

Fig. 3. (a) Regular graph $N = 8$, $d = 4$. The nodes are marked by black circles. (b) The proportion of infected nodes is plotted for various values of $R_0 = (\beta/\delta)\rho$ where $\delta = 1$ is fixed and $R_0$ is determined by $\beta$. The CWF threshold prediction for the topology of the graph is $R_0 = 1$. The simulated threshold is $R_0 = 1.23$.

6. Star graphs

Another informative, yet simple example, is the star graph. It is defined as a central hub that connects to its $N$ neighbors, but the neighbors are not connected to one another (see Fig. 4a). The hub is infected. Neglecting fluctuations and the discreteness of the system, the probability that the hub infects a neighbor, which

subsequently reinfects the hub, is $\beta^2$. Thus the mean number of secondary infections for an infected hub and a fixed $\delta = 1$ is $R_0 = N\beta_c^2$. The threshold based on $R_0$ is therefore calculated as $R_0 = N\beta_c^2 = 1$, since the spectral radius $\rho$ of the star graph is $\sqrt{N}$. The CWF threshold is thus $R_0 = \sqrt{N}\beta = 1$, and is identical to the one calculated before. However, we will show that for fixed $\delta = 1$, not only is this threshold inaccurate, it does not exist.

We proceed by making an exact derivation of the disease average life time. Suppose the central hub is infected at time step $t = 0$. The probability that the center node will not become infected by its neighbor node-$i$ at time step $t = 2$ is $(1 - \beta^2)$. Thus the probability that the hub will not become infected from any of its neighbors at time step $t = 2$ is $(1 - \beta^2)^N$ and the probability that it will be infected at time step $t = 2$ is $(1 - (1 - \beta^2)^N)$. Therefore the probability $L_{2j}$ that the disease will survive to the $2j$th time step is

$$L_{2j} = \left( 1 - (1 - \beta^2)^N \right)^j \tag{11}$$

Appendix A shows that the probability $S_{2j+1}$ that the disease will survive to the $(2j+1)$th time step but will not survive to the $2(j+1)$th time step is

$$S_{2j+1} = L_{2j}\, (1 - \beta)^N \left[ (1 + \beta)^N - 1 \right]$$

and therefore the expected time steps a disease will survive in a star graph is (shown in Appendix A)

$$\langle j \rangle = \frac{2}{(1 - \beta^2)^N} - \frac{1}{(1 + \beta)^N} - 1 \tag{12}$$

Our simulations of virus spread in a star graph of $N = 1000$ nodes for different $\beta$ values (represented in Fig. 4b) agree with Eq. (12). Eq. (12) shows that the star graph lacks an epidemic threshold for the special case of $\delta = 1$ and $\beta < 1$, since there are no phase transitions in the equation. Thus, it appears that epidemic dynamics for the star graph lacks a threshold for $\beta < 1$ and $\delta = 1$.

In the absence of a more general analysis, it is difficult to conjecture about the star threshold, or its absence, for values of $\delta$ other than $\delta = 1$. Instead, Fig. 4b provides a simulation for the case $\delta = 0.8$ and it can be discerned that the scaling is indeed similar to the known threshold free case of $\delta = 1$, notably without signs of a phase transition.

Fig. 4. (a) Star graph, $N = 14$ neighbors and one hub. (b) The expected time steps a disease survives in a star graph according to Eq. (12). 100 simulations were executed for each $R_0 = (\beta/\delta)\rho$, with the value of $R_0$ determined by $\beta$. Simulations were checked for both $\delta = 1$ and $\delta = 0.8$.

6.1. The independence assumption

The so-called independence assumption is a critical assumption used in deriving the CWF model. It assumes that the probabilities $p_{j,t-1}$ of the neighbors of the $i$th node ($j = 1 \ldots N$ neighbors of node $i$) in Eq. (2) are independent of each other. The accurate way to formulate the disease dynamics is by examining all possible states. If the system has $N$ nodes, each with a possible value of 1 (for infected) or 0 (for healthy), then the system has $2^N$ possible states. The probability of the system to be in state-$k$ at time $t$ is given by $P_{k,t}$. In general, $P_{k,t}$ has its dynamics in time as a function of $\beta$ and $\delta$.

The exact derivation of the probability that a node-$i$ will not receive infections from its neighbors in the next time step should take into account the probability to be in that state and therefore Eq. (2) is more exactly written as

$$z_{i,t} = \sum_{k}^{2^N} \left( \prod_{j \in M_i} (1 - \beta n_{j,k}) \right) P_{k,t-1} \tag{13}$$

where $P_{k,t-1}$ is the probability of the system to be at state-$k$ at time step $t-1$, and $M_i$ is the set of all neighbors of node-$i$. $n_{j,k}$ is the value (one or zero) of node-$j$ at state-$k$.

As shown in Appendix B, CWF's model (Eq. (2)) is based on the mean field approximation

$$\left\langle \prod_{j \in M_i} n_j \right\rangle_t \approx \prod_{j \in M_i} \langle n_j \rangle_t \tag{14}$$

The brackets represent an averaging over all the $2^N$ possible states of the graph at time step $t$. In fact by application of this last approximation, Eq. (13) reduces directly to Eq. (2).

However, correlations between the neighbors may be significant for the propagation of infections through certain networks such as nonrandom regular graphs. In order to better understand the impact of correlations we examine two arbitrary nodes, say the first and second, and evaluate the term $\langle n_1 n_2 \rangle_t$ which is the product of the node values at the fixed time $t$ averaged over many runs.

If there are no correlations we would expect $\langle n_1 n_2 \rangle_t = \langle n_1 \rangle_t \langle n_2 \rangle_t$. As an example we show in Fig. 5a and b the difference between the products $\langle n_1 n_2 \rangle_t$ and $\langle n_1 \rangle_t \langle n_2 \rangle_t$ in a 5000-node regular nonrandom graph (as depicted in Fig. 3a) and a regular random graph with connectivity degree $k = 6$, $\beta = 0.21$ and $\delta = 1$. 5000 simulations were made, which represent a sample of the $2^N$ possible states.

Fig. 5. Joint infectivity products $\langle n_1 n_2 \rangle$ and $\langle n_1 \rangle \langle n_2 \rangle$ calculated for nodes 1 and 2 are plotted versus time for (a) regular random graph and (b) regular nonrandom graph. The differences were measured after the system equilibrates.

While for the regular random graph we find only 1% difference on average between the two sides of Eq. (14), the regular nonrandom graph yields a difference of 56%. Thus the independence assumption holds for random but not nonrandom regular graphs.
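
The gap between the two sides of Eq. (14) can also be seen exactly on a graph small enough to enumerate. The sketch below is an illustration added here and does not reproduce the simulations of Fig. 5: it propagates the full distribution over all $2^N$ states of a 5-node ring and compares $\langle n_a n_b \rangle_t$ with $\langle n_a \rangle_t \langle n_b \rangle_t$. It assumes that, given the full state at $t-1$, each node is infected at $t$ independently with probability $1 - z_i(1 - n_i + \delta n_i)$, which is our reading of the simulation rule described in Section 4; the function names, the ring size and $\beta = 0.5$, $\delta = 1$ are likewise assumptions.

```python
from itertools import product as all_states
from math import prod


def ring(n):
    """Adjacency matrix of a ring with n nodes (two neighbours per node)."""
    return [[1 if (i - j) % n in (1, n - 1) else 0 for j in range(n)] for i in range(n)]


def step_probabilities(adj, state, beta, delta):
    """P(node i infected at t | full state at t-1); state is a tuple of 0/1 values."""
    probs = []
    for i, row in enumerate(adj):
        # z: no infection arrives from any infected neighbour
        z = prod(1.0 - beta * state[j] for j, linked in enumerate(row) if linked)
        healthy = z * ((1 - state[i]) + delta * state[i])
        probs.append(1.0 - healthy)
    return probs


def exact_distribution(adj, beta, delta, start, steps):
    """Evolve the exact distribution over all 2^N states for a few time steps."""
    n = len(adj)
    dist = {tuple(start): 1.0}
    for _ in range(steps):
        new = {}
        for state, weight in dist.items():
            q = step_probabilities(adj, state, beta, delta)
            for nxt in all_states((0, 1), repeat=n):
                p = weight * prod(q[i] if nxt[i] else 1.0 - q[i] for i in range(n))
                if p > 0.0:
                    new[nxt] = new.get(nxt, 0.0) + p
        dist = new
    return dist


def joint_and_product(dist, a, b):
    """Return (<n_a n_b>, <n_a><n_b>), i.e. the two sides of Eq. (14) for two nodes."""
    mean_a = sum(w for s, w in dist.items() if s[a])
    mean_b = sum(w for s, w in dist.items() if s[b])
    joint = sum(w for s, w in dist.items() if s[a] and s[b])
    return joint, mean_a * mean_b


if __name__ == "__main__":
    dist = exact_distribution(ring(5), beta=0.5, delta=1.0, start=(1, 0, 0, 0, 0), steps=2)
    joint, independent = joint_and_product(dist, 0, 2)
    print(joint, independent)
```

With node 0 infected at $t = 0$, two steps later nodes 0 and 2 can both only have been infected through node 1 or node 4, so their states share a common cause: the exact joint value is $0.15625$ while the product of the marginals is $0.109375$. Exhaustive enumeration is only feasible for very small $N$; it is used here purely to make the correlation visible without sampling noise.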

7. Discussion

The dynamics of viruses as they spread through complex networks is a multidisciplinary research field that is currently receiving great attention (Pastor-Satorras and Vespignani, 2001; Madar et al., 2004; Berchenko et al., 2009; Lloyd and May, 2001). A general methodology that is often adopted to address the problem is to consider a simple parallel procedure that follows all newly infected nodes generated at each time step. This is the framework also adopted by Chakrabarti et al. (2008) when they proposed a general model and an epidemic threshold prediction applicable for complex network topologies. Unlike previous mean field models, CWF argue that their model gives correct threshold predictions for any arbitrary topology. In this article we have examined this claim in detail and presented various counter examples. Our examples range from modified random graphs to regular nonrandom graphs. In the case of the one-dimensional chain, it was possible to map the problem to an already existing and well-studied phenomenon: that of directed percolation in 2D. Known results from that field corroborate our simulations showing that the true threshold differs significantly from the CWF prediction. For the case of the star graph topology it was possible to show analytically that a threshold does not exist for fixed $\delta = 1$. After a close investigation of the CWF model, a comparison with the exact Markov chain model made it possible to pinpoint exactly the mean-field approximation used by Chakrabarti et al. (2008) and deduce when this approximation is valid and when it fails. This relates closely to our section discussing the "independence assumption".

The analysis we report here is specifically relevant for the discrete time approach used in many network epidemic models including the work of CWF, but it is nevertheless worthwhile clarifying the underlying differences with continuous time approaches, which have received interest in recent years (Ganesh et al., 2005; Van Mieghem et al., 2009; Peyrard et al., 2008; Peyrard and Franc, 2005). For the discrete case, when modeling the spread of an epidemic through a network we may begin by making hypotheses about certain attributes of the nodes dynamics within a finite time interval. One can then attempt to write down the difference equations that realistically represent the dynamics of the system using a mean field approximation and probabilities. To achieve this, each node requires its own specific difference equation with time-step $\Delta t = 1$ built in a manner that allows for the probability of multiple events to occur during that time interval. It may be easier

network is a Poisson process. Correct simulation of the system dynamics should make use of the Gillespie algorithm (Gillespie, 1977) that mimics the Poisson process. It is well known that the Gillespie algorithm may exhibit different results than an algorithm which computes the changes of all of the nodes in parallel at each time step.

With these points in mind, let us return to the CWF model. In Eq. (3), which represents the CWF model, multiple events happen in the same dt, i.e., a node can recover and get infected once again within the same dt (last term RHS). Of course this is plausible only for a finite $\Delta t$ but should not occur if $\Delta t \to 0$ since the underlying Poisson formulation only allows for one event per time step.

Similarly Eq. (2), which is also a key ingredient of the CWF model, explicitly defines $z_{i,t}$ as the probability that a node-$i$ will not receive any infection in the next time step. As can be seen in Eq. (2), the probability for this event is expressed as multiplication of probabilities of events, namely, that node-$i$ does not get infected from all of its $k$-neighbors. Thus the formulation implicitly takes into account the likely possibility that node-$i$ could be infected by more than one of its neighbors in the same time-unit $\Delta t$. This is precisely the reason one goes to the trouble of forming this product of probabilities of no event occurring. Hence Eq. (2) allows for the fact that multiple events may occur within the same time step, which can only have meaning when the time step is a finite $\Delta t$. Such a situation should not happen if $\Delta t \to 0$.

CWF claim that their analysis is valid when $\Delta t \to 0$ in Eq. (3), implying that it also holds for the continuous time case. Yet their difference equation takes into account the probability of multiple events within the same time interval. This prevents the development of a differential equation representing the system as $\Delta t \to 0$. Moreover, CWF simulate a parallel algorithm that follows the fate of all of the infected nodes at each iteration rather than an algorithm that mimics a Poisson process (e.g. the Gillespie algorithm, Gillespie, 1977) as should be applied when $\Delta t \to 0$.

Some work concerning the continuous time case has already appeared in the literature (Ganesh et al., 2005; Van Mieghem et al., 2009; Peyrard et al., 2008; Peyrard and Franc, 2005). However, the results of these modeling approaches are not directly equivalent to those found in the parallel discrete time procedure discussed here. For example, if $\beta$ and $\delta$ are not small then $\beta \Delta t, \delta \Delta t \ll 1$ does not hold for $\Delta t \approx 1$, as required in a
to obtain analytic results by transforming the difference equations Poisson formulation, and the analogy between the rate equations
into more approachable differential equations. However, this needs and probability equations breaks down.
careful treatment. Merely exchanging dt for Dt and rewriting the Finally we note that in Appendix C we explore the implications of
difference equation can be problematical. Transforming a differ- the CWF scheme when used to make predictions about appropriate
ence equation into a differential one is a nontrivial problem. The vaccinations and containment strategies in the event of an epidemic.
analytic solution of a differential equation is not always represen- In such cases, the CWF scheme yields misleading predictions.
tative of the ‘‘equivalent’’ difference equation.
Another possibility is to stay with the same features of the
system but to let Dt-0. In this case several considerations need Acknowledgements
to be taken into account.
First, we do not deal with probabilities any more but with We are grateful for support from the European Union FP7, the
rates, i.e., single events have probability to happen within dt Israel Science Foundation and the Israel Ministry of Health.
proportional to dt. One cannot just exchange the probabilities for
rates of the same values.
Second, one should assess the logic of the model and equations Appendix A: star network
as Dt-0. This requires making sure that multiple events cannot
occur within infinitesimal dt so the difference equation is essen- The probability the disease will survive to the 2jth time step is
tially linear in dt. In a correctly posed model, two events should 2
not occur within the same infinitesimal dt otherwise we cannot L2j ¼ Pðtime steps ¼ 2jÞ ¼ ð1ð1b ÞN Þj ðA:1Þ
appeal to the Poisson assumption implicit in the epidemic The probability the disease will survive to the 2jth time steps
modeling approach. but will not survive to the (2j þ1)th time step
Third, when simulating the epidemic dynamics on graph
realizations, since Dt-0, the dynamics of an epidemic on a S2j ¼ L2j ð1bÞN ðA:2Þ

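and the probability the disease will survive to the (2j+1)th time step but will not survive to the 2(j+1)th time step is

$$\begin{aligned}
S_{2j+1} &= L_{2j}\left(\binom{N}{N}\beta^{N}(1-\beta)^{N} + \binom{N}{N-1}\beta^{N-1}(1-\beta)(1-\beta)^{N-1} + \cdots + \binom{N}{1}\beta^{1}(1-\beta)^{N-1}(1-\beta)^{1}\right) \\
&= L_{2j}(1-\beta)^N \sum_{k=1}^{N}\binom{N}{k}\beta^{k}\,1^{N-k} = L_{2j}(1-\beta)^N\left[(1+\beta)^N - 1\right]
\end{aligned} \tag{A.3}$$

and hence the expected number of time steps the disease will survive is

$$\langle k \rangle = \sum_{k=0}^{\infty} 2k\,S_{2k} + \sum_{k=0}^{\infty} (2k+1)\,S_{2k+1} \tag{A.4}$$

By substituting the expressions for $S_{2k+1}$ and $S_{2k}$, (A.4) becomes

$$\begin{aligned}
\langle k \rangle &= (1-\beta)^N\left[\sum_{k=0}^{\infty}\left(2k\,L_{2k}\right) + \left[(1+\beta)^N - 1\right]\sum_{k=0}^{\infty}\left((2k+1)\,L_{2k}\right)\right] \\
&= (1-\beta)^N\left[2(1+\beta)^N\sum_{k=0}^{\infty} k\,L_{2k} + \left((1+\beta)^N - 1\right)\sum_{k=0}^{\infty} L_{2k}\right]
\end{aligned} \tag{A.5}$$

Since $\sum_{k=0}^{\infty} L_{2k}$ is a geometric series

$$\sum_{k=0}^{\infty} L_{2k} = \sum_{k=0}^{\infty}\left(1-(1-\beta^2)^N\right)^k = \frac{1}{1 - 1 + (1-\beta^2)^N} = \frac{1}{(1-\beta^2)^N}$$

and

$$\begin{aligned}
\sum_{k=0}^{\infty} k\,L_{2k} &= \sum_{k=0}^{\infty} k\left(1-(1-\beta^2)^N\right)^k = \frac{1}{\ln\left[1-(1-\beta^2)^N\right]}\left.\frac{\partial}{\partial a}\sum_{k=0}^{\infty}\left(1-(1-\beta^2)^N\right)^{ak}\right|_{a=1} \\
&= \frac{1}{\ln\left[1-(1-\beta^2)^N\right]}\left.\frac{\partial}{\partial a}\left[\frac{1}{1-\left[1-(1-\beta^2)^N\right]^{a}}\right]\right|_{a=1} \\
&= \left.\frac{\left(1-(1-\beta^2)^N\right)^{a}}{\left(1-\left(1-(1-\beta^2)^N\right)^{a}\right)^{2}}\right|_{a=1} = \frac{1-(1-\beta^2)^N}{(1-\beta^2)^{2N}}
\end{aligned}$$

Now, (A.5) can be written as

$$\langle k \rangle = (1-\beta)^N\left[2(1+\beta)^N\,\frac{1-(1-\beta^2)^N}{(1-\beta^2)^{2N}} + \left((1+\beta)^N - 1\right)\frac{1}{(1-\beta^2)^N}\right]$$

$$\langle k \rangle = \frac{2}{(1-\beta^2)^N} - \frac{1}{(1+\beta)^N} - 1 \tag{A.6}$$

Appendix B: the independence assumption

The exact derivation of the probability that a node-i will not receive infections from its neighbors in the next time step is

$$\zeta_{i,t} = \sum_{k=1}^{2^N}\left(\prod_{j \in M_i}\left(1-\beta n_{j,k}\right)\right)P_{k,t-1} \tag{B.1}$$

Eq. (B.1) can be revised into

$$\begin{aligned}
\zeta_{i,t} &= \sum_{k}\left(1 - \beta\sum_{j \in M_i} n_{j,k} + \beta^2\sum_{\substack{j,q \in M_i \\ j \neq q}} n_{j,k}\,n_{q,k} - \cdots + (-1)^M\beta^M\prod_{j \in M_i} n_{j,k}\right)P_{k,t-1} \\
&= 1 - \beta\sum_{j \in M_i}\langle n_j\rangle_{t-1} + \beta^2\sum_{\substack{j,q \in M_i \\ j \neq q}}\langle n_j n_q\rangle_{t-1} - \cdots + (-1)^M\beta^M\left\langle\prod_{j \in M_i} n_j\right\rangle_{t-1}
\end{aligned} \tag{B.2}$$

where $M = |M_i|$ is the size of the neighbors set.

Using the approximation

$$\left\langle\prod_{j \in M_i} n_j\right\rangle_t \approx \prod_{j \in M_i}\langle n_j\rangle_t \tag{B.3}$$

where the product can be over a subset of the neighbors or all of them, Eq. (B.2) turns into

$$\zeta_{i,t} \approx 1 - \beta\sum_{j \in M_i}\langle n_j\rangle_{t-1} + \beta^2\sum_{\substack{j,q \in M_i \\ j \neq q}}\langle n_j\rangle_{t-1}\langle n_q\rangle_{t-1} - \cdots + (-1)^M\beta^M\prod_{j \in M_i}\langle n_j\rangle_{t-1} \tag{B.4}$$

The average value $\langle n_j\rangle_{t-1}$ is the probability of node-j being infected at time $t-1$, which is equivalent to the CWF $p_{j,t-1}$, and therefore $\zeta_{i,t}$ is approximated by

$$\zeta_{i,t} \approx \prod_{j \in M_i}\left(1-\beta p_{j,t-1}\right) \tag{2}$$

Thus we understand that the mean field approximation suggested by Chakrabarti et al. was to approximate each averaged product by a product of averages (as seen in Eq. (B.3)).

Appendix C: vaccination strategy

One of the conclusions Chakrabarti et al. (2008) draw from the CWF model is that the most efficient way to immunize a network is to vaccinate the nodes (i.e. subtract the nodes and their links from the graph) that will cause the most significant decrease in the spectral radius $\rho$ of the adjacency matrix A. It is interesting to examine this proposition more closely using their example as shown in Fig. 6.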
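The selection rule just described (vaccinate the node whose removal lowers the spectral radius of A the most) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `spectral_radius`, `remove_node` and `best_node_to_vaccinate` are hypothetical, and the five-node star graph used as input is an assumption chosen for the example, not the bar-bell graph of Fig. 6 and not code from Chakrabarti et al. (2008). The spectral radius is estimated by power iteration on A + I, a choice made here so that the iteration also converges on bipartite graphs.

```python
import math


def spectral_radius(adj, iters=5000, tol=1e-13):
    """Estimate the largest eigenvalue of a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    if n == 0:
        return 0.0
    v = [1.0] * n
    rho = 0.0
    for _ in range(iters):
        # Power iteration on A + I (the shift avoids oscillation on bipartite graphs).
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        # Rayleigh quotient v^T A v for the unit vector v.
        new_rho = sum(v[i] * adj[i][j] * v[j] for i in range(n) for j in range(n))
        if abs(new_rho - rho) < tol:
            return new_rho
        rho = new_rho
    return rho


def remove_node(adj, k):
    """Return the adjacency matrix with node k and all of its links deleted."""
    keep = [i for i in range(len(adj)) if i != k]
    return [[adj[i][j] for j in keep] for i in keep]


def best_node_to_vaccinate(adj):
    """Return (node index with the most negative change in rho, all changes)."""
    base = spectral_radius(adj)
    deltas = [spectral_radius(remove_node(adj, k)) - base for k in range(len(adj))]
    best = min(range(len(adj)), key=lambda k: deltas[k])
    return best, deltas


# Hypothetical input: a star graph, node 0 is the hub and nodes 1-4 are leaves.
star = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
best, deltas = best_node_to_vaccinate(star)
print("node to vaccinate:", best)
print("changes in spectral radius:", [round(d, 4) for d in deltas])
```

On this toy graph the rule picks the hub, which is also the node of highest connectivity; the argument of this appendix concerns graphs where the two criteria disagree.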

Fig. 6. The "bar-bell" graph discussed by Chakrabarti et al. (2008). Vaccinating any one of the nodes A, A′, B and C results in the change of the spectral radius $\rho$ by the $\Delta\rho$ demarcated. Since vaccination of node C is associated with the largest $\Delta\rho = -0.0315$, the CWF method would suggest this to be the most effective strategy when only one node can be vaccinated.

Fig. 7. Modification of the "bar-bell" graph. Node D is added to the right cluster. The effect of vaccination on $\rho$ is noted next to nodes C and D.

In the case of a budget limited to the cost of the vaccination of only one node, arguments have been made in the past (Pastor-Satorras and Vespignani, 2002; Cohen et al., 2003) for vaccinating the node with the highest connectivity (several nodes are appropriate for this strategy in this example, among them node A′). However, according to the CWF model the most efficient strategy would be to vaccinate node C since it achieves the maximum decrease of $\rho$ ($\Delta\rho = 0.0315$).

Nevertheless this conclusion needs to be treated with caution, as the following example shows. Suppose a small change is made in the graph presented in Fig. 6, by adding a single node D to one of the clusters (Fig. 7).

This minor modification should, on the face of things, not change the fact that vaccinating node C is the most efficient strategy. But, at the same time, vaccinating the marginal node D results in a larger reduction of $\rho$ ($\Delta\rho = 0.0575$) than vaccinating node C ($\Delta\rho = 0.0183$). It is highly unlikely that this small network perturbation should change the vaccination policy to this degree. Thus, in our opinion, it is still an open question as to whether vaccinating the nodes that cause the maximum reduction of $\rho$ is in fact the most efficient strategy.

References

Albert, R., Barabási, A.L., 2002. Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97. doi:10.1103/RevModPhys.74.47.
Anderson, R.M., May, R.M., 1991. Infectious Diseases of Humans: Dynamics and Control. Oxford University Press, Oxford.
Aparacio, J.P., Pascual, M., 2007. Building epidemiological models from R0: an implicit treatment of transmission in networks. Proc. R. Soc. B 274, 505–512. doi:10.1098/rspb.2006.0057.
Berchenko, Y., Artzy-Randrup, Y., Teicher, M., Stone, L., 2009. Emergence and size of the giant component in clustered random graphs with a given degree distribution. Phys. Rev. Lett. 102, 138701.
Chakrabarti, D., Wang, Y., Wang, C., Leskovec, J., Faloutsos, C., 2008. Epidemic thresholds in real networks. ACM Trans. Inf. Syst. Secur. 10 (4), 1–26.
Chung, F., Lu, L., Vu, V., 2003. Spectra of random graphs with given expected degrees. Proc. Natl. Acad. Sci. 100 (11), 6313–6318.
Cohen, R., ben-Avraham, D., Havlin, S., 2002. Percolation critical exponents in scale-free networks. Phys. Rev. E 66, 036113.
Cohen, R., Havlin, S., ben-Avraham, D., 2003. Efficient immunization strategies for computer networks and populations. Phys. Rev. Lett. 91, 247901.
Dietz, K., 1980. Models for vector-borne parasitic diseases. In: Barigozzi, C. (Ed.), Lecture Notes in Biomathematics. Proceedings of the Vito Volterra Symposium on Mathematical Models in Biology, vol. 39. Springer, Heidelberg, New York, pp. 264–277.
Domany, E., Kinzel, W., 1981. Directed percolation in 2 dimensions: numerical analysis and exact solution. Phys. Rev. Lett. 47, 5–8.
Durrett, R., 1984. Oriented percolation in two dimensions. Ann. Probab. 12, 999–1040.
Durrett, R., 2010. Some features of the spread of epidemics and information on a random graph. Proc. Natl. Acad. Sci. 107 (10), 4491–4498.
Essam, J., De’Bell, K., Adler, J., Bhatti, F.M., 1986. Analysis of extended series for bond percolation on the directed square lattice. Phys. Rev. B 33, 1982–1986.
Furedi, Z., Komlos, J., 1981. The eigenvalues of random symmetric matrices. Combinatorica 1 (3), 233–241.
Ganesh, A., Massoulie, L., Towsley, D., 2005. The effect of network topology on the spread of epidemics. In: Proceedings of the IEEE INFOCOM.
Gillespie, D.T., 1977. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81 (25), 2340–2361.
Horn, R., Johnson, C., 1985. Matrix Analysis. Cambridge University Press, Cambridge.
Jensen, I., 1996. Temporally disordered bond percolation on the directed square lattice. Phys. Rev. Lett. 77, 4988.
Kenah, E., Robins, J.M., 2007. Second look at the spread of epidemics on networks. Phys. Rev. E 76, 036113.
Kephart, J.O., White, S.R., 1991. Directed-graph epidemiological models of computer viruses. In: Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy. pp. 343–359.
Lloyd, A.L., May, R.M., 2001. How viruses spread among computers and people. Science 292, 1316–1317.
Madar, N., Kalisky, T., Cohen, R., ben-Avraham, D., Havlin, S., 2004. Immunization and epidemic dynamics in complex networks. Eur. Phys. J. B 38, 269–276.
May, R.M., Anderson, R.M., 1984. Spatial heterogeneity and the design of immunization programs. Math. Biosci. 72, 83–111.
May, R.M., Anderson, R.M., 1988. The transmission dynamics of human immunodeficiency virus (HIV) [and Discussion]. Philos. Trans. R. Soc. London B 321, 565–607. doi:10.1098/rstb.1988.0108.
May, R.M., Lloyd, A.L., 2001. Infection dynamics on scale-free networks. Phys. Rev. E 64, 066112.
Newman, M.E.J., 2002. Spread of epidemic disease on networks. Phys. Rev. E 66, 016128.
Parshani, R., Carmi, S., Havlin, S., 2010. Epidemic threshold for the susceptible–infectious–susceptible model on random networks. Phys. Rev. Lett. 104, 258701.
Pastor-Satorras, R., Vespignani, A., 2001. Epidemic spreading in scale free networks. Phys. Rev. Lett. 86, 3200–3203.
Pastor-Satorras, R., Vespignani, A., 2002. Immunization of complex networks. Phys. Rev. E 65, 036104.
Peyrard, N., Franc, A., 2005. Cluster variation approximations for a contact process living on a graph. Physica A: Stat. Mech. Appl. 358, 575–592.
Peyrard, N., Dieckmann, U., Franc, A., 2008. Long-range correlations improve understanding of the influence of network structure on contact dynamics. Theor. Popul. Biol. 73, 383–394.
Renshaw, E., 1991. Modelling Biological Populations in Time and Space. Cambridge University Press, Cambridge.
Restrepo, J.G., Ott, E., Hunt, B.R., 2007. Approximating the largest eigenvalue of network adjacency matrices. Phys. Rev. E 76, 056119.
Van Mieghem, P., Omic, J.S., Kooij, R.E., 2009. Virus spread in networks. IEEE/ACM Trans. Networking 17 (1), 1–14.
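
The closed form (A.6) obtained in Appendix A for the star network can be cross-checked numerically by summing the series (A.4) directly. The Python sketch below does this under the assumptions 0 ≤ β < 1 and δ = 1 used in that appendix; the function names and the truncation tolerance are hypothetical choices made for this illustration and are not part of the original analysis.

```python
def survival_closed_form(beta, n):
    """Expected number of time steps the disease survives, Eq. (A.6)."""
    return 2.0 / (1.0 - beta**2) ** n - 1.0 / (1.0 + beta) ** n - 1.0


def survival_series(beta, n, tol=1e-16, max_terms=1_000_000):
    """Direct summation of Eq. (A.4), with S_2k and S_2k+1 from (A.1)-(A.3)."""
    q = 1.0 - (1.0 - beta**2) ** n               # L_2k = q**k, Eq. (A.1)
    s_even = (1.0 - beta) ** n                   # S_2k / L_2k, Eq. (A.2)
    s_odd = s_even * ((1.0 + beta) ** n - 1.0)   # S_2k+1 / L_2k, Eq. (A.3)
    total, weight, k = 0.0, 1.0, 0
    # Stop once the remaining terms are negligible (weight = q**k decays geometrically).
    while weight * (2 * k + 1) > tol and k < max_terms:
        total += weight * (2 * k * s_even + (2 * k + 1) * s_odd)
        weight *= q
        k += 1
    return total


# Compare the two expressions for a few arbitrary parameter choices.
for beta, n in [(0.5, 1), (0.3, 5), (0.2, 10)]:
    print(beta, n, survival_series(beta, n), survival_closed_form(beta, n))
```

The two columns printed for each parameter pair should agree to within the truncation error, which is consistent with the algebra leading from (A.4) to (A.6).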
