On Triangular versus Edge Representations --- Towards Scalable Modeling of Networks

Part of Advances in Neural Information Processing Systems 25 (NIPS 2012)

Bibtex Metadata Paper Supplemental

Authors

Qirong Ho, Junming Yin, Eric Xing

Abstract

In this paper, we argue for representing networks as a bag of {\it triangular motifs}, particularly for important network problems that current model-based approaches handle poorly due to computational bottlenecks incurred by using edge representations. Such approaches require both 1-edges and 0-edges (missing edges) to be provided as input, and as a consequence, approximate inference algorithms for these models usually require $\Omega(N^2)$ time per iteration, precluding their application to larger real-world networks. In contrast, triangular modeling requires less computation, while providing equivalent or better inference quality. A triangular motif is a vertex triple containing 2 or 3 edges, and the number of such motifs is $\Theta(\sum_{i}D_{i}^{2})$ (where $D_i$ is the degree of vertex $i$), which is much smaller than $N^2$ for low-maximum-degree networks. Using this representation, we develop a novel mixed-membership network model and approximate inference algorithm suitable for large networks with low max-degree. For networks with high maximum degree, the triangular motifs can be naturally subsampled in a {\it node-centric} fashion, allowing for much faster inference at a small cost in accuracy. Empirically, we demonstrate that our approach, when compared to that of an edge-based model, has faster runtime and improved accuracy for mixed-membership community detection. We conclude with a large-scale demonstration on an $N\approx 280,000$-node network, which is infeasible for network models with $\Omega(N^2)$ inference cost.