{"title": "Shaping Social Activity by Incentivizing Users", "book": "Advances in Neural Information Processing Systems", "page_first": 2474, "page_last": 2482, "abstract": "Events in an online social network can be categorized roughly into endogenous events, where users just respond to the actions of their neighbors within the network, or exogenous events, where users take actions due to drives external to the network. How much external drive should be provided to each user, such that the network activity can be steered towards a target state? In this paper, we model social events using multivariate Hawkes processes, which can capture both endogenous and exogenous event intensities, and derive a time dependent linear relation between the intensity of exogenous events and the overall network activity. Exploiting this connection, we develop a convex optimization framework for determining the required level of external drive in order for the network to reach a desired activity level. We experimented with event data gathered from Twitter, and show that our method can steer the activity of the network more accurately than alternatives.", "full_text": "Shaping Social Activity by Incentivizing Users\n\nMehrdad Farajtabar\u2217\nIsabel Valera\u2021\n\nNan Du\u2217\nHongyuan Zha\u2217\n\nManuel Gomez-Rodriguez\u2020\n\nLe Song\u2217\n\nGeorgia Institute of Technology\u2217\n\nMPI for Software Systems\u2020\n\nUniv. Carlos III in Madrid\u2021\n\n{mehrdad,dunan}@gatech.edu\n{zha,lsong}@cc.gatech.edu\n\nmanuelgr@mpi-sws.org\nivalera@tsc.uc3m.es\n\nAbstract\n\nEvents in an online social network can be categorized roughly into endogenous\nevents, where users just respond to the actions of their neighbors within the net-\nwork, or exogenous events, where users take actions due to drives external to the\nnetwork. How much external drive should be provided to each user, such that the\nnetwork activity can be steered towards a target state? 
In this paper, we model\nsocial events using multivariate Hawkes processes, which can capture both en-\ndogenous and exogenous event intensities, and derive a time dependent linear re-\nlation between the intensity of exogenous events and the overall network activity.\nExploiting this connection, we develop a convex optimization framework for de-\ntermining the required level of external drive in order for the network to reach a\ndesired activity level. We experimented with event data gathered from Twitter,\nand show that our method can steer the activity of the network more accurately\nthan alternatives.\n\n1 Introduction\nOnline social platforms routinely track and record a large volume of event data, which may corre-\nspond to the usage of a service (e.g., url shortening service, bit.ly). These events can be categorized\nroughly into endogenous events, where users just respond to the actions of their neighbors within\nthe network, or exogenous events, where users take actions due to drives external to the network.\nFor instance, a user\u2019s tweets may contain links provided by bit.ly, either due to his forwarding of a\nlink from his friends, or due to his own initiative to use the service to create a new link.\nCan we model and exploit these data to steer the online community to a desired activity level?\nSpeci\ufb01cally, can we drive the overall usage of a service to a certain level (e.g., at least twice per\nday per user) by incentivizing a small number of users to take more initiatives? What if the goal is\nto make the usage level of a service more homogeneous across users? What about maximizing the\noverall service usage for a target group of users? 
Furthermore, these activity shaping problems need to be addressed by taking into account budget constraints, since incentives are usually provided in the form of monetary or credit rewards.
Activity shaping problems are significantly more challenging than traditional influence maximization problems, which aim to identify a set of users who, when convinced to adopt a product, shall influence others in the network and trigger a large cascade of adoptions [1, 2]. First, in influence maximization, the state of each user is often assumed to be binary, either adopting a product or not [1, 3, 4, 5]. However, such an assumption does not capture the recurrent nature of product usage, where the frequency of the usage matters. Second, while influence maximization methods identify a set of users to provide incentives to, they do not typically provide a quantitative prescription of how much incentive should be provided to each user. Third, activity shaping concerns a larger variety of target states, such as minimum activity and homogeneity of activity, not just activity maximization.
In this paper, we will address the activity shaping problems using multivariate Hawkes processes [6], which can model both endogenous and exogenous recurrent social events, and were shown to be a good fit for such data in a number of recent works (e.g., [7, 8, 9, 10, 11, 12]). More importantly, we will go beyond model fitting, and derive a novel predictive formula for the overall network activity given the intensity of exogenous events in individual users, using a connection between Hawkes processes and branching processes [13, 14, 15, 16]. Based on this relation, we propose a convex optimization framework to address a diverse range of activity shaping problems given budget constraints. 
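For concreteness, the kind of budget-constrained program we have in mind can be sketched as a toy linear program, using the time-dependent linear relation mu(t) = Psi(t) lambda(0) between exogenous intensities and expected activity that the paper derives later. Everything below (the matrix Psi, the unit costs, the budget) is a synthetic stand-in, not the paper's estimates:

```python
import numpy as np
from scipy.optimize import linprog

# Toy sketch: maximize total expected activity sum(Psi @ lam0) over the
# exogenous intensities lam0 >= 0, subject to a budget c^T lam0 <= C.
rng = np.random.default_rng(0)
m = 5                                 # number of users (toy size)
Psi = rng.uniform(0.0, 1.0, (m, m))   # stand-in for Psi(t), not an estimate
c = np.ones(m)                        # unit incentive cost per user
C = 0.5                               # total incentive budget

# linprog minimizes, so negate: sum(Psi @ lam0) = (1^T Psi) @ lam0.
obj = -Psi.sum(axis=0)
res = linprog(obj, A_ub=c[None, :], b_ub=[C], bounds=[(0, None)] * m)
lam0 = res.x                          # optimal exogenous intensities
# Since every objective coefficient is positive, the budget binds at optimum.
assert res.success and abs(c @ lam0 - C) < 1e-6
```

Because the objective rewards every unit of exogenous intensity, the solver spends the whole budget, typically concentrating it on the user with the largest column sum of Psi.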
Compared to previous methods for influence maximization, our framework can provide more fine-grained control of network activity, not only steering the network to a desired steady-state activity level but also doing so in a time-sensitive fashion. For example, our framework allows us to answer complex time-sensitive queries, such as: which users should be incentivized, and by how much, to steer a set of users to use a product twice per week after one month?
In addition to the novel framework, we also develop an efficient gradient-based optimization algorithm, where the matrix exponential needed for gradient computation is approximated using a truncated Taylor series expansion [17]. This algorithm allows us to validate our framework in a variety of activity shaping tasks and scale up to networks with tens of thousands of nodes. We also conducted experiments on a network of 60,000 Twitter users and more than 7,500,000 uses of a popular url shortening service. Using held-out data, we show that our algorithm can shape the network behavior much more accurately than alternatives.
2 Modeling Endogenous-Exogenous Recurrent Social Events
We model the events generated by m users in a social network as an m-dimensional counting process N(t) = (N1(t), N2(t), . . . , Nm(t))⊤, where Ni(t) records the total number of events generated by user i up to time t. Furthermore, we represent each event as a tuple (ui, ti), where ui is the user identity and ti is the event timing. Let the history of the process up to time t be Ht := {(ui, ti) | ti ≤ t}, and Ht− be the history until just before time t. Then the increment of the process, dN(t), in an infinitesimal window [t, t + dt] is parametrized by the intensity λ(t) = (λ1(t), . . . , λm(t))⊤ ≥ 0, i.e.,

E[dN(t) | Ht−] = λ(t) dt.    (1)

Intuitively, the larger the intensity λ(t), the greater the likelihood of observing an event in the time window [t, t + dt]. For instance, a Poisson process in [0, ∞) can be viewed as a special counting process with a constant intensity function λ, independent of time and history. To model the presence of both endogenous and exogenous events, we will decompose the intensity into two terms,

λ(t) = λ(0)(t) + λ∗(t),    (2)

where λ(t) is the overall event intensity, λ(0)(t) is the exogenous event intensity, which models drives from outside the network, and λ∗(t) is the endogenous event intensity, which models interactions within the network. We assume that hosts of social platforms can potentially drive the exogenous event intensity up or down by providing incentives to users, while endogenous events are generated due to users' own interests or under the influence of network peers, and the hosts do not interfere with them directly. The key questions in the activity shaping context are how to model the endogenous event intensity in a way that is realistic for recurrent social interactions, and how to link the exogenous event intensity to the endogenous event intensity. We assume that the exogenous event intensity is independent of the history and time, i.e., λ(0)(t) = λ(0).
2.1 Multivariate Hawkes Process
Recurrent endogenous events often exhibit the characteristics of self-excitation, where a user tends to repeat what he has been doing recently, and mutual excitation, where a user simply follows what his neighbors are doing due to peer pressure. These social phenomena have been compared to the occurrence of earthquakes [18] and the spread of epidemics [19], and can be well captured by multivariate Hawkes processes [6], as shown in a number of recent works (e.g., [7, 8, 9, 10, 11, 12]). More specifically, a multivariate Hawkes process is a counting process with a particular form of intensity. 
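The Hawkes form of intensity described above can be sketched in a few lines. This is a toy illustration, not the paper's code: the influence matrix, base rates, and event history below are made up, and an exponential triggering kernel exp(-omega * t) is assumed:

```python
import numpy as np

# Toy sketch of a multivariate Hawkes intensity: each past event by user u'
# raises user u's current rate through the influence entry A[u, u'], with an
# assumed exponential decay exp(-omega * (t - t_i)).
def hawkes_intensity(t, history, lam0, A, omega):
    # history: list of (user, time) events; lam0: exogenous base intensities.
    lam = lam0.copy()                        # exogenous part lambda^(0)
    for u_i, t_i in history:
        if t_i < t:
            lam += A[:, u_i] * np.exp(-omega * (t - t_i))
    return lam

lam0 = np.array([0.1, 0.2])
A = np.array([[0.0, 0.5],                    # user 1 excites user 0
              [0.3, 0.2]])                   # user 1 also self-excites
events = [(1, 0.0), (0, 1.0)]
lam = hawkes_intensity(2.0, events, lam0, A, omega=1.0)
assert np.all(lam > lam0)                    # past events only add intensity
```

Because A is nonnegative, every past event can only raise the current intensity above the exogenous base rate, which is exactly the self- and mutual-excitation behavior described above.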
We assume that the strength of influence between users is parameterized by a sparse nonnegative influence matrix A = (a_{uu′})_{u,u′∈[m]}, where a_{uu′} > 0 means user u′ directly excites user u. We also allow A to have nonnegative diagonals to model self-excitation of a user. Then, the intensity of the u-th dimension is

λ∗u(t) = λ(0)_u + Σ_{i: t_i < t} a_{u u_i} g(t − t_i),

where g(·) is a nonnegative triggering kernel (in our experiments, an exponential kernel g(t) = e^{−ωt} with bandwidth ω).

[...]

(i) Capped activity maximization: g(λ(0)) = Ψ(t)⊤v, where v is defined such that v_j = 1 if μ_j < α_j, and v_j = 0, otherwise. (ii) Minimax activity shaping: g(λ(0)) = Ψ(t)⊤e, where e is defined such that e_j = 1 if μ_j = μ_min, and e_j = 0, otherwise. (iii) Least-squares activity shaping: g(λ(0)) = 2Ψ(t)⊤B⊤(BΨ(t)λ(0) − v). (iv) Activity homogenization: g(λ(0)) = Ψ(t)⊤ ln(Ψ(t)λ(0)) + Ψ(t)⊤1, where ln(·) on a vector is the element-wise natural logarithm. Since the activity maximization and the minimax activity shaping tasks require only one evaluation of Ψ(t) times a vector, Algorithm 1 can be used directly. However, computing the gradient for least-squares activity shaping and activity homogenization is slightly more involved, and it requires care with the order in which we perform the operations (refer to Appendix B for details). Equipped with an efficient way to compute gradients, we solve the corresponding convex optimization problem for each activity shaping problem by applying projected gradient descent (PGD) [26] with the appropriate gradient.¹ Algorithm 2 summarizes the key steps.
6 Experimental Evaluation
We evaluate our framework using both simulated and real-world held-out data, and show that our approach significantly outperforms several baselines. The appendix contains additional experiments.
Dataset description and network inference. We use data gathered from Twitter as reported in [27], which comprises all public tweets posted by 60,000 users during an 8-month period, from January 2009 to September 2009. For every user, we record the times she uses any of six popular url shortening services (refer to Appendix C for details). We evaluate the performance of our framework on a subset of 2,241 active users, linked by 4,901 edges, which we call the 2K dataset, and we evaluate its scalability on the overall 60,000 users, linked by ∼200,000 edges, which we call the 60K dataset. The 2K dataset accounts for 691,020 url shortening service uses, while the 60K dataset accounts for ∼7.5 million uses. Finally, we treat each service as an independent cascade of events.
In the experiments, we estimated the nonnegative influence matrix A and the exogenous intensity λ(0) using maximum log-likelihood, as in previous work [8, 9, 12]. We used a temporal resolution of one minute and selected the bandwidth ω = 0.1 by cross validation. Loosely speaking, ω = 0.1 corresponds to losing 70% of the initial influence after 10 minutes, which may be explained by the rapid rate at which each user's news feed gets updated.
Evaluation schemes. We focus on three tasks: capped activity maximization, minimax activity shaping, and least-squares activity shaping. We set the total budget to C = 0.5, which corresponds to supporting a total extra activity equal to 0.5 actions per unit time, and assume all users entail the same cost. In the capped activity maximization, we set the upper limit of each user's intensity, α, by adding a nonnegative random vector to their inferred initial intensity. In the least-squares activity shaping, we set B = I and aim to create three user groups: less-active, moderate, and super-active. We use three different evaluation schemes, with an increasing resemblance to a real-world scenario:
Theoretical objective: We compute the expected overall (theoretical) intensity by applying Theorem 3 to the optimal exogenous event intensities for each of the three activity shaping tasks, using the learned A and ω. 
We then compute and report the value of the objective functions.

¹For nondifferentiable objectives, subgradient algorithms can be used instead.

[Figure 2 here: three rows of plots showing (a) the theoretical objective, (b) the simulated objective, and (c) rank correlation on held-out data as a function of the logarithm of time, comparing each proposed method against its baselines.]
Figure 2: Row 1: Capped activity maximization. Row 2: Minimax activity shaping. Row 3: Least-squares activity shaping. * means statistically significant at level 0.01 with a paired t-test between our method and the second best.
Simulated objective: We simulate 50 cascades with Ogata's thinning algorithm [28], using the optimal exogenous event intensities for each of the three activity shaping tasks, and the learned A and ω. We then empirically estimate the overall event intensity based on the simulated cascades, by computing a running average over non-overlapping time windows, and report the value of the objective functions based on this estimated overall intensity. Appendix D provides a comparison between the simulated and the theoretical objective.
Held-out data: The most interesting evaluation scheme would entail carrying out real interventions in a social platform. However, since this is very challenging to do, in this evaluation scheme we instead use held-out data to simulate such a process, proceeding as follows. We first partition the 8-month data into 50 five-day-long contiguous intervals. Then, we use one interval for training and the remaining 49 intervals for testing. Suppose interval 1 is used for training; the procedure is as follows:

1. We estimate A1, ω1 and λ(0)_1 using the events from interval 1. Then, we fix A1 and ω1, and estimate λ(0)_i for all other intervals, i = 2, . . . , 49.
2. Given A1 and ω1, we find the optimal exogenous event intensities, λ(0)_opt, for each of the three activity shaping tasks, by solving the associated convex program. We then sort the estimated λ(0)_i (i = 2, . . . , 49) according to their similarity to λ(0)_opt, using the Euclidean distance ‖λ(0)_opt − λ(0)_i‖2.
3. We estimate the overall event intensity for each of the 49 intervals (i = 2, . . . , 49), as in the “simulated objective” evaluation scheme, and sort these intervals according to the value of their corresponding objective function.
4. Last, we compute and report the rank correlation score between the two orderings obtained in steps 2 and 3.² The larger the rank correlation, the better the method.

We repeat this procedure 50 times, choosing each different interval for training once, and compute and report the average rank correlations. More details can be found in the appendix.

²rank correlation = number of pairs with consistent ordering / total number of pairs.

Capped activity maximization (CAM). We compare to a number of alternatives. XMU: heuristic based on μ(t) without optimization; DEG and WEI: heuristics based on the degree of the user; PRANK: heuristic based on PageRank (refer to Appendix C for further details). The first row of Figure 2 summarizes the results for the three different evaluation schemes. We find that our method (CAM) consistently outperforms the alternatives. For the theoretical objective, CAM is 11% better than the second best, DEG. The difference in overall users' intensity from DEG is about 0.8, which, roughly speaking, leads to an increase of at least about 0.8 × 60 × 24 × 30 = 34,560 in the overall number of events in a month. In terms of the simulated objective and held-out data, the results are similar and provide empirical evidence that, compared to other heuristics, degree is an appropriate surrogate for influence, while, based on the poor performance of XMU, it seems that high activity does not necessarily entail being influential. To elaborate on the interpretability of the real-world experiment on held-out data, consider for example the difference in rank correlation between CAM and DEG, which is almost 0.1. 
Then, roughly speaking, this means that incentivizing users based on our approach agrees with the ordering of real activity patterns in 0.1 × (50 × 49)/2 = 122.5 more pairs of realizations.
Minimax activity shaping (MMASH). We compare to a number of alternatives. UNI: heuristic based on equal allocation; MINMU: heuristic based on μ(t) without optimization; LP: linear-programming-based heuristic; GRD: a greedy approach to leveraging the activity (see Appendix C for more details). The second row of Figure 2 summarizes the results for the three different evaluation schemes. We find that our method (MMASH) consistently outperforms the alternatives. For the theoretical objective, it is about 2× better than the second best, LP. Importantly, the difference between MMASH and LP is not trifling: the least active user carries out 2 × 10−4 × 60 × 24 × 30 ≈ 8.6 more actions on average over a month. As one may have expected, GRD and LP are the best among the heuristics. The poor performance of MINMU, which is directly related to the objective of MMASH, may be because it assigns the budget to low-activity users regardless of their influence. In contrast, our method, by cleverly distributing the budget to the users whose actions trigger many other users' actions (including some with low activity), benefits from the budget the most. In terms of the simulated objective and held-out data, the algorithms' performances become more similar.
Least-squares activity shaping (LSASH). We compare to two alternatives. PROP: assigning the budget proportionally to the desired activity; LSGRD: greedily allocating the budget according to the difference between current and desired activity (refer to Appendix C for more details). The third row of Figure 2 summarizes the results for the three different evaluation schemes. We find that our method (LSASH) consistently outperforms the alternatives. 
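The held-out comparisons in all three tasks rely on the pair-counting rank correlation of footnote 2. A minimal sketch, with toy scores rather than the actual interval orderings:

```python
from itertools import combinations

# Sketch of footnote 2's score: the fraction of item pairs whose ordering
# under one scoring (e.g., distance to lam0_opt, step 2) agrees with their
# ordering under another (e.g., achieved objective value, step 3).
def rank_correlation(score_a, score_b):
    # score_a, score_b: parallel lists scoring the same items.
    pairs = list(combinations(range(len(score_a)), 2))
    consistent = sum(
        (score_a[i] < score_a[j]) == (score_b[i] < score_b[j])
        for i, j in pairs
    )
    return consistent / len(pairs)

dist = [0.1, 0.4, 0.2, 0.9]   # toy similarity scores (step 2)
obj = [1.0, 3.0, 2.0, 4.0]    # toy objective values (step 3)
print(rank_correlation(dist, obj))  # 1.0: the two orderings fully agree
```

A score of 1.0 means the two orderings agree on every pair, 0.0 that they disagree on every pair; the reported differences of about 0.1 thus translate directly into counts of consistently ordered interval pairs.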
Perhaps surprisingly, PROP, despite its simplicity, seems to perform slightly better than LSGRD. This may be due to the way it allocates the budget to users; e.g., it does not aim to strictly fulfill users' target activity but benefits more users by assigning the budget proportionally. Refer to Appendix E for additional experiments.

Sparsity and Activity Shaping. In some applications there is a limitation on the number of users we can incentivize. In our proposed framework, we can handle this requirement by including a sparsity constraint in the optimization problem. In order to maintain the convexity of the optimization problem, we consider an ℓ1 regularization term, where a regularization parameter γ provides the trade-off between sparsity and the activity shaping goal. Refer to Appendix F for more details and experimental results for different values of γ.

Scalability. The most computationally demanding part of the proposed algorithm is the evaluation of matrix exponentials, which we scale up by utilizing techniques from matrix algebra, such as the GMRES and Al-Mohy methods. As a result, we are able to run our methods in a reasonable amount of time on the 60K dataset, specifically, in comparison with a naive implementation of matrix exponential evaluations. Refer to Appendix G for detailed experimental results on scalability.
Appendix H discusses the limitations of our framework and future work.

Acknowledgement. This project was supported in part by NSF IIS1116886, NSF/NIH BIGDATA 1R01GM108341, NSF CAREER IIS1350983 and a Raytheon Faculty Fellowship to Le Song. Isabel Valera acknowledges the support of Plan Regional-Programas I+D of Comunidad de Madrid (AGES-CM S2010/BMD-2422), Ministerio de Ciencia e Innovación of Spain (project DEIPRO TEC2009-14504-C02-00 and program Consolider-Ingenio 2010 CSD2008-00010 COMONSENS).

References
[1] David Kempe, Jon Kleinberg, and Éva Tardos. 
Maximizing the spread of influence through a social network. In KDD, pages 137–146. ACM, 2003.
[2] Matthew Richardson and Pedro Domingos. Mining knowledge-sharing sites for viral marketing. In KDD, pages 61–70. ACM, 2002.
[3] Wei Chen, Yajun Wang, and Siyu Yang. Efficient influence maximization in social networks. In KDD, pages 199–208. ACM, 2009.
[4] Manuel G. Rodriguez and Bernhard Schölkopf. Influence maximization in continuous time diffusion networks. In ICML, 2012.
[5] Nan Du, Le Song, Manuel Gomez Rodriguez, and Hongyuan Zha. Scalable influence estimation in continuous-time diffusion networks. In NIPS 26, 2013.
[6] Thomas J. Liniger. Multivariate Hawkes Processes. PhD thesis, Swiss Federal Institute of Technology Zurich, 2009.
[7] Charles Blundell, Jeff Beck, and Katherine A. Heller. Modelling reciprocating relationships with Hawkes processes. In NIPS, 2012.
[8] Ke Zhou, Hongyuan Zha, and Le Song. Learning social infectivity in sparse low-rank networks using multi-dimensional Hawkes processes. In AISTATS, 2013.
[9] Ke Zhou, Hongyuan Zha, and Le Song. Learning triggering kernels for multi-dimensional Hawkes processes. In ICML, 2013.
[10] Tomoharu Iwata, Amar Shah, and Zoubin Ghahramani. Discovering latent influence in online social activities via shared cascade Poisson processes. In KDD, pages 266–274. ACM, 2013.
[11] Scott W. Linderman and Ryan P. Adams. Discovering latent network structure in point process data. arXiv preprint arXiv:1402.0914, 2014.
[12] Isabel Valera, Manuel Gomez-Rodriguez, and Krishna Gummadi. Modeling adoption of competing products and conventions in social media. arXiv preprint arXiv:1406.0516, 2014.
[13] Ian Dobson, Benjamin A. Carreras, and David E. Newman. A branching process approximation to cascading load-dependent system failure. In System Sciences, 2004. 
Proceedings of the 37th Annual Hawaii International Conference on, pages 10 pp. IEEE, 2004.
[14] Jakob Gulddahl Rasmussen. Bayesian inference for Hawkes processes. Methodology and Computing in Applied Probability, 15(3):623–642, 2013.
[15] Alejandro Veen and Frederic P. Schoenberg. Estimation of space–time branching process models in seismology using an EM-type algorithm. JASA, 103(482):614–624, 2008.
[16] Jiancang Zhuang, Yosihiko Ogata, and David Vere-Jones. Stochastic declustering of space-time earthquake occurrences. JASA, 97(458):369–380, 2002.
[17] Awad H. Al-Mohy and Nicholas J. Higham. Computing the action of the matrix exponential, with an application to exponential integrators. SIAM Journal on Scientific Computing, 33(2):488–511, 2011.
[18] David Marsan and Olivier Lengline. Extending earthquakes' reach through cascading. Science, 319(5866):1076–1079, 2008.
[19] Shuang-Hong Yang and Hongyuan Zha. Mixture of mutually exciting processes for viral diffusion. In ICML, pages 1–9, 2013.
[20] Theodore E. Harris. The Theory of Branching Processes. Courier Dover Publications, 2002.
[21] Alan G. Hawkes. Spectra of some self-exciting and mutually exciting point processes. Biometrika, 58(1):83–90, 1971.
[22] John Frank Charles Kingman. Poisson Processes, volume 3. Oxford University Press, 1992.
[23] Manuel Gomez-Rodriguez, Krishna Gummadi, and Bernhard Schölkopf. Quantifying information overload in social media and its impact on social contagions. In ICWSM, 2014.
[24] Gene H. Golub and Charles F. Van Loan. Matrix Computations, volume 3. JHU Press, 2012.
[25] Youcef Saad and Martin H. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM Journal on Scientific and Statistical Computing, 7(3):856–869, 1986.
[26] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. 
Cambridge University Press, Cambridge, England, 2004.
[27] Meeyoung Cha, Hamed Haddadi, Fabricio Benevenuto, and P. Krishna Gummadi. Measuring user influence in Twitter: The million follower fallacy. In ICWSM, 2010.
[28] Yosihiko Ogata. On Lewis' simulation method for point processes. IEEE Transactions on Information Theory, 27(1):23–31, 1981.
", "award": [], "sourceid": 1302, "authors": [{"given_name": "Mehrdad", "family_name": "Farajtabar", "institution": "Georgia Institute of Technology"}, {"given_name": "Nan", "family_name": "Du", "institution": "Georgia Tech"}, {"given_name": "Manuel", "family_name": "Gomez Rodriguez", "institution": "MPI for Software Systems"}, {"given_name": "Isabel", "family_name": "Valera", "institution": "UC3M"}, {"given_name": "Hongyuan", "family_name": "Zha", "institution": "Georgia Tech"}, {"given_name": "Le", "family_name": "Song", "institution": "Georgia Tech"}]}