Part of Advances in Neural Information Processing Systems 26 (NIPS 2013)
Nan Du, Le Song, Manuel Gomez Rodriguez, Hongyuan Zha
If a piece of information is released from a media site, can it spread, in 1 month, to a million web pages? This influence estimation problem is very challenging since both the time-sensitive nature of the problem and the issue of scalability need to be addressed simultaneously. In this paper, we propose a randomized algorithm for influence estimation in continuous-time diffusion networks. Our algorithm can estimate the influence of every node in a network with $|\Vcal|$ nodes and $|\Ecal|$ edges to an accuracy of $\epsilon$ using $n=O(1/\epsilon^2)$ randomizations and up to logarithmic factors $O(n|\Ecal|+n|\Vcal|)$ computations. When used as a subroutine in a greedy influence maximization algorithm, our proposed method is guaranteed to find a set of nodes with an influence of at least $(1 - 1/e)\operatorname{OPT} - 2\epsilon$, where $\operatorname{OPT}$ is the optimal value. Experiments on both synthetic and real-world data show that the proposed method can easily scale up to networks of millions of nodes while significantly improves over previous state-of-the-arts in terms of the accuracy of the estimated influence and the quality of the selected nodes in maximizing the influence.