{"title": "Anytime Influence Bounds and the Explosive Behavior of Continuous-Time Diffusion Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 2026, "page_last": 2034, "abstract": "The paper studies transition phenomena in information cascades observed along a diffusion process over some graph. We introduce the Laplace Hazard matrix and show that its spectral radius fully characterizes the dynamics of the contagion both in terms of influence and of explosion time. Using this concept, we prove tight non-asymptotic bounds for the influence of a set of nodes, and we also provide an in-depth analysis of the critical time after which the contagion becomes super-critical. Our contributions include formal definitions and tight lower bounds of critical explosion time. We illustrate the relevance of our theoretical results through several examples of information cascades used in epidemiology and viral marketing models. Finally, we provide a series of numerical experiments for various types of networks which confirm the tightness of the theoretical bounds.", "full_text": "Anytime In\ufb02uence Bounds and the Explosive\n\nBehavior of Continuous-Time Diffusion Networks\n\n1CMLA, ENS Cachan, CNRS, Universit\u00b4e Paris- Saclay, France, 21000mercis, Paris, France\n\nKevin Scaman1\nNicolas Vayatis1\n{scaman, lemonnier, vayatis}@cmla.ens-cachan.fr\n\nR\u00b4emi Lemonnier1,2\n\nAbstract\n\nThe paper studies transition phenomena in information cascades observed along\na diffusion process over some graph. We introduce the Laplace Hazard matrix\nand show that its spectral radius fully characterizes the dynamics of the conta-\ngion both in terms of in\ufb02uence and of explosion time. Using this concept, we\nprove tight non-asymptotic bounds for the in\ufb02uence of a set of nodes, and we\nalso provide an in-depth analysis of the critical time after which the contagion be-\ncomes super-critical. 
Our contributions include formal definitions and tight lower bounds of critical explosion time. We illustrate the relevance of our theoretical results through several examples of information cascades used in epidemiology and viral marketing models. Finally, we provide a series of numerical experiments for various types of networks which confirm the tightness of the theoretical bounds.\n\n1 Introduction\n\nDiffusion networks capture the underlying mechanism of how events propagate throughout a complex network. In marketing, social graph dynamics have caused large transformations in business models, forcing companies to re-imagine their customers not as a mass of isolated economic agents, but as customer networks [1]. In epidemiology, a precise understanding of spreading phenomena is heavily needed when trying to break the chain of infection in populations during outbreaks of viral diseases. But whether the subject is a virus spreading across a computer network, an innovative product among early adopters, or a rumor propagating on a network of people, the questions of interest are the same: how many people will it infect? How fast will it spread? And, even more critically for decision makers: how can we modify its course in order to meet specific goals? Several papers tackled these issues by studying the influence maximization problem. Given a known diffusion process on a graph, it consists in finding the top-k subset of initial seeds with the highest expected number of infected nodes at a certain time distance T. 
This problem being NP-hard [2], various heuristics have been proposed in order to obtain scalable suboptimal approximations. While the first algorithms focused on discrete-time models and the special case T = +\u221e [3, 4], subsequent papers [5, 6] brought empirical evidence of the key role played by temporal behavior. Existing models of continuous-time stochastic processes include multivariate Hawkes processes [7], for which recent progress in inference methods [8, 9] made available the tools for the study of activity shaping [10], which is closely related to influence maximization. However, in the most studied case in which each node of the network can only be infected once, the most widely used model remains the Continuous-Time Information Cascade (CTIC) model [5]. Under this framework, successful inference [5] as well as influence maximization algorithms have been developed [11, 12].\nAlthough recent works [13, 14] provided theoretical foundations for the inference problem, assessing the quality of influence maximization remains a challenging task, as few theoretical results exist for general graphs. In the infinite-time setting, studies of the SIR diffusion process in epidemiology [15] or percolation for specific graphs [16] provided a more accurate understanding of these processes. More recently, it was shown in [17] that the spectral radius of a given Hazard matrix played a key role in the influence of information cascades. This allowed the authors to derive closed-form tight bounds for the influence in general graphs and characterize epidemic thresholds under which the influence of any set of nodes is at most O(\u221an).\nIn this paper, we extend their approach in order to deal with the problem of anytime influence bounds for continuous-time information cascades. 
More specifically, we define the Laplace Hazard matrices and show that the influence at time T of any set of nodes heavily depends on their spectral radii. Moreover, we reveal the existence and characterize the behavior of critical times at which super-critical processes explode. We show that before these times, super-critical processes will behave sub-critically and infect at most o(n) nodes. These results can be used in various ways. First, they provide a way to evaluate influence maximization algorithms without having to test all possible sets of influencers, which is intractable for large graphs. Secondly, critical times allow decision makers to know how long a contagion will remain in its early phase before becoming a large-scale event, in fields where knowing when to act is nearly as important as knowing where to act. Finally, they can be seen as the first closed-form formula for anytime influence estimation for continuous-time information cascades. Indeed, we provide empirical evidence that our bounds are tight for a large family of graphs at the beginning and the end of the infection process.\nThe rest of the paper is organized as follows. In Section 2, we recall the definition of the Information Cascades Model and introduce useful notations. In Section 3, we derive theoretical bounds for the influence. In Section 4, we illustrate our results by applying them to specific cascade models. In Section 5, we perform experiments in order to show that our bounds are sharp for a family of graphs and sets of initial nodes. All proof details are provided in the supplementary material.\n\n2 Continuous-Time Information Cascades\n\n2.1 Information propagation and influence in diffusion networks\n\nWe describe here the propagation dynamics introduced in [5]. Let G = (V,E) be a directed network of n nodes. 
We equip each directed edge (i, j) \u2208 E with a time-varying probability distribution pij(t) over R+ \u222a {+\u221e} (pij is thus a sub-probability measure on R+) and define the cascade behavior as follows. At time t = 0, only a subset A \u2282 V of influencers is infected. Each node i infected at time \u03c4i may transmit the infection at time \u03c4i + \u03c4ij along its outgoing edge (i, j) \u2208 E with probability density pij(\u03c4ij), and independently of other transmission events. The process ends at a given time T > 0.\nFor each node v \u2208 V, we will denote as \u03c4v the (possibly infinite) time at which it is reached by the infection. The influence of A at time T, denoted as \u03c3A(T), is defined as the expected number of nodes reached by the contagion at time T originating from A, i.e.\n\n\u03c3A(T) = E[ \u2211_{v\u2208V} 1{\u03c4v \u2264 T} ],   (1)\n\nwhere the expectation is taken over cascades originating from A (i.e. \u03c4v = 0 \u21d4 v \u2208 A).\nFollowing the percolation literature, we will differentiate between sub-critical cascades whose size is o(n) and super-critical cascades whose size is proportional to n, where n denotes the size of the network. This work focuses on upper bounding the influence \u03c3A(T) for any given time T and characterizing the critical times at which phase transitions occur between sub-critical and super-critical behaviors.\n\n2.2 The Laplace Hazard Matrix\n\nWe extend here the concept of hazard matrix first introduced in [17] (different from the homonymous notion of [13]), which plays a key role in the influence of the information cascade.\nDefinition 1. Let G = (V,E) be a directed graph, and pij be integrable edge transmission probabilities such that \u222b_0^{+\u221e} pij(t)dt < 1. For s \u2265 0, let LH(s) be the n \u00d7 n matrix, denoted as the Laplace hazard matrix, whose coefficients are\n\nLHij(s) = \u2212\u02c6pij(s) (\u222b_0^{+\u221e} pij(t)dt)^{\u22121} ln(1 \u2212 \u222b_0^{+\u221e} pij(t)dt) if (i, j) \u2208 E, and LHij(s) = 0 otherwise,   (2)\n\nwhere \u02c6pij(s) denotes the Laplace transform of pij defined for every s \u2265 0 by \u02c6pij(s) = \u222b_0^{+\u221e} pij(t)e^{\u2212st}dt. Note that the long term behavior of the cascade is retrieved when s = 0 and coincides with the concept of hazard matrix used in [17].\nWe recall that for any square matrix M of size n, its spectral radius \u03c1(M) is the maximum of the absolute values of its eigenvalues. If M is moreover real and positive, we also have \u03c1((M + M^\u22a4)/2) = sup_{x\u2208R^n} (x^\u22a4 M x)/(x^\u22a4 x).\n\n2.3 Existence of a critical time of a contagion\n\nIn the following, we will derive critical times before which the contagion is sub-critical, and above which the contagion is super-critical. We now formalize this notion of critical time via limits of contagions on networks.\nTheorem 1. Let (Gn)n\u2208N be a sequence of networks of size n, and (p^n_ij)n\u2208N be transmission probability functions along the edges of Gn. Let also \u03c3n(t) be the maximum influence in Gn at time t from a single influencer. Then there exists a critical time T^c \u2208 R+ \u222a {+\u221e} such that, for every sequence of times (Tn)n\u2208N:\n\n\u2022 If lim sup n\u2192+\u221e Tn < T^c, then \u03c3n(Tn) = o(n),\n\u2022 If \u03c3n(Tn) = o(n), then lim inf n\u2192+\u221e Tn \u2264 T^c.\n\nMoreover, such a critical time is unique.\n\nIn other words, the critical time is a time before which the regime is sub-critical and after which no contagion can be sub-critical. 
The next proposition shows that, after the critical time, the contagion is super-critical.\nProposition 1. If (Tn)n\u2208N is such that lim inf n\u2192+\u221e Tn > T^c, then lim inf n\u2192+\u221e \u03c3n(Tn)/n > 0 and the contagion is super-critical. Conversely, if (Tn)n\u2208N is such that lim inf n\u2192+\u221e \u03c3n(Tn)/n > 0, then lim sup n\u2192+\u221e Tn \u2265 T^c.\nIn order to simplify notations, we will omit in the following the dependence in n of all the variables whenever stating results holding in the limit n \u2192 +\u221e.\n\n3 Theoretical bounds for the influence of a set of nodes\n\nWe now present our upper bounds on the influence at time T and derive a lower bound on the critical time of a contagion.\n\n3.1 Upper bounds on the maximum influence at time T\n\nThe next proposition provides an upper bound on the influence at time T for any set of influencers A such that |A| = n0. This result may be valuable for assessing the quality of influence maximization algorithms in a given network.\nProposition 2. Define \u03c1(s) = \u03c1((LH(s) + LH(s)^\u22a4)/2). Then, for any A such that |A| = n0 < n, denoting by \u03c3A(T) the expected number of nodes reached by the cascade starting from A at time T:\n\n\u03c3A(T) \u2264 n0 + (n \u2212 n0) min_{s\u22650} \u03b3(s) e^{sT},   (3)\n\nwhere \u03b3(s) is the smallest solution in [0, 1] of the following equation:\n\n\u03b3(s) \u2212 1 + exp( \u2212\u03c1(s)\u03b3(s) \u2212 \u03c1(s) n0 / (\u03b3(s)(n \u2212 n0)) ) = 0.   (4)\n\nCorollary 1. Under the same assumptions:\n\n\u03c3A(T) \u2264 n0 + \u221a(n0(n \u2212 n0)) min_{s\u22650 : \u03c1(s)<1} ( \u221a(\u03c1(s)/(1 \u2212 \u03c1(s))) e^{sT} ).   (5)\n\nNote that the long-term upper bound in [17] is a corollary of Proposition 2 using s = 0. When \u03c1(0) < 1, Corollary 1 with s = 0 implies that the regime is sub-critical for all T \u2265 0. 
When \u03c1(0) \u2265 1, the long-term behavior may be super-critical and the influence may reach linear values in n. However, at a cost growing exponentially with T, it is always possible to choose an s such that \u03c1(s) < 1 and retrieve an O(\u221an) behavior. While the exact optimal parameter s is in general not explicit, two choices of s yield relevant results: either simplifying e^{sT} by choosing s = 1/T, or keeping \u03b3(s) sub-critical by choosing s s.t. \u03c1(s) < 1. In particular, the following corollary shows that the contagion explodes at most as e^{\u03c1^{\u22121}(1\u2212\u03b5)T} for any \u03b5 \u2208 [0, 1].\nCorollary 2. Let \u03b5 \u2208 [0, 1] and \u03c1(0) \u2265 1. Under the same assumptions:\n\n\u03c3A(T) \u2264 n0 + \u221a(n0(n \u2212 n0)/\u03b5) e^{\u03c1^{\u22121}(1\u2212\u03b5)T}.   (6)\n\nRemark. Since this section focuses on bounding \u03c3A(T) for a given T \u2265 0, all the aforementioned results also hold for p^T_ij(t) = pij(t)1{t\u2264T}. This is equivalent to integrating everything on [0, T] instead of R+, i.e. LHij(s) = \u2212ln(1 \u2212 \u222b_0^T pij(t)dt) (\u222b_0^T pij(t)dt)^{\u22121} \u222b_0^T pij(t)e^{\u2212st}dt. This choice of LH is particularly useful when some edges are transmitting the contagion with probability 1 (see for instance the SI epidemic model in Section 4.3).\n\n3.2 Lower bound on the critical time of a contagion\n\nThe previous section presents results about how explosive a contagion is. These findings suggest that the speed at which a contagion explodes is bounded by a certain quantity, and thus that the process needs a certain amount of time to become super-critical. This intuition is made formal in the following corollary:\nCorollary 3. Assume \u2200n \u2265 0, \u03c1n(0) \u2265 1 and lim_{n\u2192+\u221e} \u03c1n^{\u22121}(1 \u2212 1/ln n) / \u03c1n^{\u22121}(1) = 1. If the sequence (Tn)n\u2208N is such that\n\nlim sup_{n\u2192+\u221e} 2\u03c1n^{\u22121}(1)Tn / ln n < 1,   (7)\n\nthen\n\n\u03c3A(Tn) = o(n).   (8)\n\nIn other words, the regime of the contagion is sub-critical before ln n / (2\u03c1n^{\u22121}(1)) and\n\nT^c \u2265 lim inf_{n\u2192+\u221e} ln n / (2\u03c1n^{\u22121}(1)).   (9)\n\nThe technical condition lim_{n\u2192+\u221e} \u03c1n^{\u22121}(1 \u2212 1/ln n) / \u03c1n^{\u22121}(1) = 1 imposes that, for large n, \u03c1n^{\u22121}(1 \u2212 \u03b5) / \u03c1n^{\u22121}(1) converges sufficiently fast to 1 as \u03b5 \u2192 0, so that \u03c1n^{\u22121}(1 \u2212 1/ln n) has the same behavior as \u03c1n^{\u22121}(1). This condition is not very restrictive, and is met for the different case studies considered in Section 4.\nThis result may be valuable for decision makers since it provides a safe time region in which the contagion has not reached a macroscopic scale. It thus provides insights into how long decision makers have to prepare control measures. After T^c, the process can explode and immediate action is required.\n\n4 Application to particular contagion models\n\nIn this section, we provide several examples of cascade models that show that our theoretical bounds are applicable in a wide range of scenarios and provide the first results of this type in many areas, including two widely used epidemic models.\n\n4.1 Fixed transmission pattern\n\nWhen the transmission probabilities are of the form pij(t) = \u03b1ij p(t) s.t. \u222b_0^{+\u221e} p(t)dt = 1 and \u03b1ij < 1,\n\nLHij(s) = \u2212ln(1 \u2212 \u03b1ij) \u02c6p(s),   (10)\n\nand\n\n\u03c1(s) = \u03c1\u03b1 \u02c6p(s),   (11)\n\nwhere \u03c1\u03b1 = \u03c1(0) = \u03c1(\u2212(ln(1 \u2212 \u03b1ij) + ln(1 \u2212 \u03b1ji))/2) is the spectral radius of the long-term hazard matrix defined in [17]. In these networks, the temporal and structural behaviors are clearly separated. While \u03c1\u03b1 summarizes the structure of the network and how connected the nodes are to one another, \u02c6p(s) captures how fast the transmission probabilities are fading through time.\nWhen \u03c1\u03b1 \u2265 1, the long-term behavior is super-critical and the bound on the critical times is given by inverting \u02c6p(s):\n\nT^c \u2265 lim inf_{n\u2192+\u221e} ln n / (2 \u02c6p^{\u22121}(1/\u03c1\u03b1)),   (12)\n\nwhere \u02c6p^{\u22121}(1/\u03c1\u03b1) exists and is unique since \u02c6p(s) is decreasing from 1 to 0. In general, it is not possible to give a more explicit version of the critical time of Corollary 3, or of the anytime influence bound of Proposition 2. However, we investigate in the rest of this section specific p(t) which lead to explicit results.\n\n4.2 Exponential transmission probabilities\n\nA notable example of fixed transmission pattern is the case of exponential probabilities pij(t) = \u03b1ij \u03bb e^{\u2212\u03bbt} for \u03bb > 0 and \u03b1ij \u2208 [0, 1[. Influence maximization algorithms under this specific choice of transmission functions have been for instance developed in [11]. In such a case, we can calculate the spectral radii explicitly:\n\n\u03c1(s) = (\u03bb / (s + \u03bb)) \u03c1\u03b1,   (13)\n\nwhere \u03c1\u03b1 = \u03c1(\u2212(ln(1 \u2212 \u03b1ij) + ln(1 \u2212 \u03b1ji))/2) is again the spectral radius of the long-term hazard matrix. When \u03c1\u03b1 > 1, this leads to a critical time lower bounded by\n\nT^c \u2265 lim inf_{n\u2192+\u221e} ln n / (2\u03bb(\u03c1\u03b1 \u2212 1)).   (14)\n\nThe influence bound of Corollary 1 can also be reformulated in the following way:\nCorollary 4. Assume \u03c1\u03b1 \u2265 1, or else \u03bbT(1 \u2212 \u03c1\u03b1) < 1/2. Then the minimum in Eq. 5 is met for s = 1/(2T) + \u03bb(\u03c1\u03b1 \u2212 1) and Corollary 1 rewrites:\n\n\u03c3A(T) \u2264 n0 + \u221a(n0(n \u2212 n0)) \u221a(2eT\u03bb\u03c1\u03b1) e^{\u03bbT(\u03c1\u03b1 \u2212 1)}.   (15)\n\nIf \u03c1\u03b1 < 1 and \u03bbT(1 \u2212 \u03c1\u03b1) \u2265 1/2, the minimum in Eq. 5 is met for s = 0 and Corollary 1 rewrites:\n\n\u03c3A(T) \u2264 n0 + \u221a(n0(n \u2212 n0)) \u221a(\u03c1\u03b1 / (1 \u2212 \u03c1\u03b1)).   (16)\n\nNote that, in particular, the condition of Corollary 4 is always met in the super-critical case where \u03c1\u03b1 > 1. Moreover, we retrieve the O(\u221an) behavior when T < 1/(\u03bb(\u03c1\u03b1 \u2212 1)). Concerning the behavior in T, the bound matches exactly the infinite-time bound when T is very large in the sub-critical case. However, for sufficiently small T, we obtain a greatly improved result with a very instructive growth in O(\u221aT).\n\n4.3 SI and SIR epidemic models\n\nBoth epidemic models SI and SIR are particular cases of exponential transmission probabilities. The SIR model [18] is a widely used epidemic model that uses three states to describe the spread of an infection. Each node of the network can be either susceptible (S), infected (I), or removed (R). At t = 0, a subset A of n0 nodes is infected. Then, each node i infected at time \u03c4i is removed at an exponentially-distributed time \u03b8i of parameter \u03b4. Transmission along its outgoing edge (i, j) \u2208 E occurs at time \u03c4i + \u03c4ij with conditional probability density \u03b2 exp(\u2212\u03b2\u03c4ij), given that node i has not been removed at that time. When the removing events are not observed, SIR is equivalent to CTIC, except that transmissions along the outgoing edges of one node are positively correlated. However, our results still hold in case of such a correlation, as shown in the following result.\nProposition 3. Assume the propagation follows a SIR model of transmission parameter \u03b2 and removal parameter \u03b4. Define pij(t) = \u03b2 exp(\u2212(\u03b4 + \u03b2)t) for (i, j) \u2208 E. 
Let A = (1{(i,j)\u2208E})ij be the adjacency matrix of the underlying undirected network. Then, results of Proposition 2 and subsequent corollaries still hold with \u03c1(s) given by:\n\n\u03c1(s) = \u03c1((LH(s) + LH(s)^\u22a4)/2) = ln(1 + \u03b2/\u03b4) ((\u03b4 + \u03b2) / (s + \u03b4 + \u03b2)) \u03c1(A).   (17)\n\nFrom this proposition, the same analysis as in the independent transmission events case can be derived, and the critical time for the SIR model is\n\nT^c \u2265 lim inf_{n\u2192+\u221e} ln n / (2(\u03b4 + \u03b2)(ln(1 + \u03b2/\u03b4)\u03c1(A) \u2212 1)).   (18)\n\nProposition 4. Consider the SIR model with transmission rate \u03b2, recovery rate \u03b4 and adjacency matrix An. Assume lim inf_{n\u2192+\u221e} ln(1 + \u03b2/\u03b4)\u03c1(An) > 1, and the sequence (Tn)n\u2208N is such that\n\nlim sup_{n\u2192+\u221e} 2(\u03b4 + \u03b2)(ln(1 + \u03b2/\u03b4)\u03c1(An) \u2212 1)Tn / ln n < 1.   (19)\n\nThen,\n\n\u03c3A(Tn) = o(n).   (20)\n\nThis is a direct corollary of Corollary 3 with \u03c1^{\u22121}(1) = (\u03b4 + \u03b2)(ln(1 + \u03b2/\u03b4)\u03c1(An) \u2212 1).\nThe SI model is a simpler model in which individuals of the network remain infected and contagious through time (i.e. \u03b4 = 0). Thus, the network is totally infected at the end of the contagion and lim_{T\u2192+\u221e} \u03c3A(T) = n. For this reason, the previous critical time for the more general SIR model is of no use here, and a more precise analysis is required. Following the remark of Section 3.1, we can integrate pij on [0, T] instead of R+, which leads to the following result:\nProposition 5. Consider the SI model with transmission rate \u03b2 and adjacency matrix An. 
Assume lim inf_{n\u2192+\u221e} \u03c1(An) > 0 and the sequence (Tn)n\u2208N is such that\n\nlim sup_{n\u2192+\u221e} \u03b2Tn / ( \u221a(ln n / (2\u03c1(An))) (1 \u2212 e^{\u2212\u221a(ln n / (2\u03c1(An)))}) ) < 1.   (21)\n\nThen,\n\n\u03c3A(Tn) = o(n).   (22)\n\nIn other words, the critical time for the SI model is lower bounded by\n\nT^c \u2265 lim inf_{n\u2192+\u221e} (1/\u03b2) \u221a(ln n / (2\u03c1(An))) (1 \u2212 e^{\u2212\u221a(ln n / (2\u03c1(An)))}).   (23)\n\nIf \u03c1(An) = o(ln n) (e.g. for sparse networks with a maximum degree in O(1)), the critical time reduces to T^c \u2265 lim inf_{n\u2192+\u221e} (1/\u03b2) \u221a(ln n / (2\u03c1(An))). However, when the graph is denser and \u03c1(An)/ln n \u2192 +\u221e, then T^c \u2265 lim inf_{n\u2192+\u221e} ln n / (2\u03b2\u03c1(An)).\n\nFigure 1: Empirical maximum influence w.r.t. the spectral radius \u03c1\u03b1 defined in Section 4.2 for various network types. Panels: (a) T = 0.1, (b) T = 1, (c) T = 5, (d) T = 100. Simulation parameters: n = 1000, n0 = 1 and \u03bb = 1.\n\n4.4 Discrete-time Information Cascade\n\nA final example is the discrete-time contagion in which a node infected at time t makes a unique attempt to infect its neighbors at a time t + T0. This defines the Information Cascade model, the discrete-time diffusion model studied by the first works on influence maximization [2, 19, 3, 4]. In this setting, pij(t) = \u03b1ij \u03b4T0(t) where \u03b4T0 is the Dirac distribution centered at T0. The spectral radii are given by\n\n\u03c1(s) = \u03c1\u03b1 e^{\u2212sT0},   (24)\n\nand the influence bound of Corollary 1 simplifies to:\nCorollary 5. Let \u03c1\u03b1 \u2265 1, or else T \u2264 T0 / (2(1 \u2212 \u03c1\u03b1)). If T < T0, then \u03c3A(T) = n0. 
Otherwise,\n\n\u03c3A(T) \u2264 n0 + \u221a(n0(n \u2212 n0)) \u221a(2eT/T0) \u03c1\u03b1^{T/T0}.   (25)\n\nMoreover, the critical time is lower bounded by\n\nT^c \u2265 lim inf_{n\u2192+\u221e} (ln n / (2 ln \u03c1\u03b1)) T0.   (26)\n\nA notable difference from the exponential transmission probabilities is that T^c is here inversely proportional to ln \u03c1\u03b1, instead of \u03c1\u03b1 as in Section 4.2 (Eq. 14), which implies that, for the same long-term influence, a discrete-time contagion will explode much more slowly than one with a constant infection rate. This is probably due to the existence of very small infection times for contagions with exponential transmission probabilities.\n\n5 Experimental results\n\nThis section provides an experimental validation of our bounds, by comparing them to the empirical influence simulated on several network types. In all our experiments, we simulate a contagion with exponential transmission probabilities (see Section 4.2) on networks of size n = 1000 and generated random networks of 5 different types (for more information on the respective random generators, see e.g. [20]): Erd\u0151s-R\u00e9nyi networks, preferential attachment networks, small-world networks, geometric random networks [21] and totally connected networks with fixed weight b \u2208 [0, 1] except for the ingoing and outgoing edges of a single node having, respectively, weight 0 and a > b. 
The reason for simulating on such totally connected networks is that the influence over these networks tends to match our upper bounds more closely, and plays the role of a best case scenario. More precisely, the transmission probabilities are of the form pij(t) = \u03b1e^{\u2212t} for each edge (i, j) \u2208 E, where \u03b1 \u2208 [0, 1[ (and \u03bb = 1 in the formulas of Section 4.2).\n\n[Figure 1, panels (a)\u2013(d): influence (\u03c3A(T)) plotted against spectral radius (\u03c1\u03b1); legend: totally connected, Erd\u0151s-R\u00e9nyi, preferential attachment, small world, contact network, upper bound.]\n\nFigure 2: Empirical maximum influence w.r.t. the network size for various network types. Panels: (a) T = 0.2T^c\u2217, (b) T = 2T^c\u2217, (c) T = 5T^c\u2217. Simulation parameters: n0 = 1, \u03bb = 1 and \u03c1\u03b1 = 4. In such a setting, T^c\u2217 = ln n / (2(\u03c1\u03b1 \u2212 1)\u03bb) = 1.15. Note the sub-linear (a) versus linear behavior (b and c).\n\nWe first investigate the tightness of the upper bound on the maximum influence given in Proposition 2. Figure 1 presents the empirical influence w.r.t. \u03c1\u03b1 = \u2212ln(1 \u2212 \u03b1)\u03c1(A) (where A is the adjacency matrix of the network) for a large set of network types, as well as the upper bound in Proposition 2. Each point in the figure corresponds to the maximum influence on one network. The influence was averaged over 100 cascade simulations, and the best influencer (i.e. whose influence was maximal) was found by performing an exhaustive search. Our bounds are tight for all values of T \u2208 {0.1, 1, 5, 100} for totally connected networks in the sub-critical regime (\u03c1\u03b1 < 1). For the super-critical regime (\u03c1\u03b1 > 1), the behavior in T is very instructive. 
For T \u2208 {0.1, 5, 100}, our bounds are tight for most network types when \u03c1\u03b1 is high. For T = 1 (the average transmission time for the (\u03c4ij)(i,j)\u2208E), the maximum influence varies a lot across different graphs. This follows the intuition that this is one of the times where, for a given final number of infected nodes, the local structure of the networks will play the largest role through precise temporal evolution of the infection. Because \u03c1\u03b1 explains quite well the final size of the infection, this discrepancy appears on our graphs at fixed \u03c1\u03b1. While our bound does not seem tight for this particular time, the order of magnitude of the explosion time is retrieved and our bounds are close to optimal values as soon as T = 5.\nIn order to further validate that our bounds give meaningful insights on the critical time of explosion for super-critical graphs, Figure 2 presents the empirical influence with respect to the size of the network n for different network types and values of T, with \u03c1\u03b1 fixed to \u03c1\u03b1 = 4. In this setting, the critical time of Corollary 3 is given by T^c\u2217 = ln n / (2(\u03c1\u03b1 \u2212 1)\u03bb) = 1.15. We see that our bounds are tight for totally connected networks for all values of T \u2208 {0.2, 2, 5}. Moreover, the accuracy of critical time estimation is supported by the drastic change of behavior around T = T^c\u2217, with phase transitions having occurred for most network types as soon as T = 5T^c\u2217.\n\n6 Conclusion\n\nIn this paper, we characterize the phase transition in continuous-time information cascades between their sub-critical and super-critical behavior. We provide for the first time general influence bounds that apply for any time horizon, graph and set of influencers. We show that the key quantities governing this phenomenon are the spectral radii of given Laplace Hazard matrices. 
We prove the pertinence of our bounds by deriving the first results of this type in several application fields. Finally, we provide experimental evidence that our bounds are tight for a large family of networks.\n\nAcknowledgments\n\nThis research is part of the SODATECH project funded by the French Government within the program of \u201cInvestments for the Future \u2013 Big Data\u201d.\n\nReferences\n[1] Michael Trusov, Randolph E Bucklin, and Koen Pauwels. Effects of word-of-mouth versus traditional marketing: Findings from an internet social networking site. Journal of Marketing, 73(5):90\u2013102, 2009.\n[2] David Kempe, Jon Kleinberg, and \u00c9va Tardos. Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 137\u2013146. ACM, 2003.\n[3] Wei Chen, Yajun Wang, and Siyu Yang. Efficient influence maximization in social networks. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 199\u2013208. ACM, 2009.\n[4] Wei Chen, Chi Wang, and Yajun Wang. Scalable influence maximization for prevalent viral marketing in large-scale social networks. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1029\u20131038. ACM, 2010.\n[5] Manuel Gomez-Rodriguez, David Balduzzi, and Bernhard Sch\u00f6lkopf. Uncovering the temporal dynamics of diffusion networks. 
In Proceedings of the 28th International Conference on Machine Learning, pages 561\u2013568, 2011.\n[6] Nan Du, Le Song, Hyenkyun Woo, and Hongyuan Zha. Uncover topic-sensitive information diffusion networks. In Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, pages 229\u2013237, 2013.\n[7] Alan G Hawkes and David Oakes. A cluster process representation of a self-exciting process. Journal of Applied Probability, pages 493\u2013503, 1974.\n[8] Ke Zhou, Hongyuan Zha, and Le Song. Learning triggering kernels for multi-dimensional hawkes processes. In Proceedings of the 30th International Conference on Machine Learning, pages 1301\u20131309, 2013.\n[9] Remi Lemonnier and Nicolas Vayatis. Nonparametric markovian learning of triggering kernels for mutually exciting and mutually inhibiting multivariate hawkes processes. In Machine Learning and Knowledge Discovery in Databases, pages 161\u2013176. Springer, 2014.\n[10] Mehrdad Farajtabar, Nan Du, Manuel Gomez-Rodriguez, Isabel Valera, Hongyuan Zha, and Le Song. Shaping social activity by incentivizing users. In Advances in Neural Information Processing Systems, pages 2474\u20132482, 2014.\n[11] Manuel Gomez-Rodriguez and Bernhard Sch\u00f6lkopf. Influence maximization in continuous time diffusion networks. In Proceedings of the 29th International Conference on Machine Learning, pages 313\u2013320, 2012.\n[12] Nan Du, Le Song, Manuel Gomez-Rodriguez, and Hongyuan Zha. Scalable influence estimation in continuous-time diffusion networks. In Advances in Neural Information Processing Systems, pages 3147\u20133155, 2013.\n[13] Manuel Gomez-Rodriguez, Le Song, Hadi Daneshmand, and B. Schoelkopf. Estimating diffusion networks: Recovery conditions, sample complexity & soft-thresholding algorithm. Journal of Machine Learning Research, 2015.\n[14] Jean Pouget-Abadie and Thibaut Horel. 
Inferring graphs from cascades: A sparse recovery framework. In Proceedings of the 32nd International Conference on Machine Learning, pages 977\u2013986, 2015.\n[15] Moez Draief, Ayalvadi Ganesh, and Laurent Massouli\u00e9. Thresholds for virus spread on networks. Annals of Applied Probability, 18(2):359\u2013378, 2008.\n[16] B\u00e9la Bollob\u00e1s, Svante Janson, and Oliver Riordan. The phase transition in inhomogeneous random graphs. Random Structures & Algorithms, 31(1):3\u2013122, 2007.\n[17] Remi Lemonnier, Kevin Scaman, and Nicolas Vayatis. Tight bounds for influence in diffusion networks and application to bond percolation and epidemiology. In Advances in Neural Information Processing Systems, pages 846\u2013854, 2014.\n[18] William O Kermack and Anderson G McKendrick. Contributions to the mathematical theory of epidemics. ii. the problem of endemicity. Proceedings of the Royal society of London. Series A, 138(834):55\u201383, 1932.\n[19] Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, and Natalie Glance. Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 420\u2013429. ACM, 2007.\n[20] Mark Newman. Networks: An Introduction. Oxford University Press, Inc., New York, NY, USA, 2010.\n[21] Mathew Penrose. Random geometric graphs, volume 5. Oxford University Press Oxford, 2003.\n", "award": [], "sourceid": 1220, "authors": [{"given_name": "Kevin", "family_name": "Scaman", "institution": "ENS Cachan - CMLA"}, {"given_name": "R\u00e9mi", "family_name": "Lemonnier", "institution": "ENS Cachan - CMLA"}, {"given_name": "Nicolas", "family_name": "Vayatis", "institution": "ENS Cachan - CMLA"}]}