{"title": "Variational Graph Recurrent Neural Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 10701, "page_last": 10711, "abstract": "Representation learning over graph structured data has been mostly studied in static graph settings while efforts for modeling dynamic graphs are still scant. In this paper, we develop a novel hierarchical variational model that introduces additional latent random variables to jointly model the hidden states of a graph recurrent neural network (GRNN) to capture both topology and node attribute changes in dynamic graphs. We argue that the use of high-level latent random variables in this variational GRNN (VGRNN) can better capture potential variability observed in dynamic graphs as well as the uncertainty of node latent representation. With semi-implicit variational inference developed for this new VGRNN architecture (SI-VGRNN), we show that flexible non-Gaussian latent representations can further help dynamic graph analytic tasks. Our experiments with multiple real-world dynamic graph datasets demonstrate that SI-VGRNN and VGRNN consistently outperform the existing baseline and state-of-the-art methods by a significant margin in dynamic link prediction.", "full_text": "Variational Graph Recurrent Neural Networks\n\nEhsan Hajiramezanali\u2020\u2217, Arman Hasanzadeh\u2020\u2217, Nick Duf\ufb01eld\u2020, Krishna Narayanan\u2020,\n\nMingyuan Zhou\u2021, Xiaoning Qian\u2020\n\n\u2020 Department of Electrical and Computer Engineering, Texas A&M University\n\n{ehsanr, armanihm, duffieldng, krn, xqian}@tamu.edu\n\u2021 McCombs School of Business, The University of Texas at Austin\n\nmingyuan.zhou@mccombs.utexas.edu\n\nAbstract\n\nRepresentation learning over graph structured data has been mostly studied in static\ngraph settings while efforts for modeling dynamic graphs are still scant. 
In this paper, we develop a novel hierarchical variational model that introduces additional latent random variables to jointly model the hidden states of a graph recurrent neural network (GRNN) to capture both topology and node attribute changes in dynamic graphs. We argue that the use of high-level latent random variables in this variational GRNN (VGRNN) can better capture potential variability observed in dynamic graphs as well as the uncertainty of node latent representation. With semi-implicit variational inference developed for this new VGRNN architecture (SI-VGRNN), we show that flexible non-Gaussian latent representations can further help dynamic graph analytic tasks. Our experiments with multiple real-world dynamic graph datasets demonstrate that SI-VGRNN and VGRNN consistently outperform the existing baseline and state-of-the-art methods by a significant margin in dynamic link prediction.\n\n1 Introduction\n\nNode embedding maps each node in a graph to a vector in a low-dimensional latent space, in which classical feature vector-based machine learning formulations can be adopted [5]. Most of the existing node embedding techniques assume that the graph is static and that learning tasks are performed on fixed sets of nodes and edges [19, 23, 12, 20, 14, 1]. However, many real-world problems are modeled by dynamic graphs, where graphs are constantly evolving over time. Such graphs are typically observed in social networks, citation networks, and financial transaction networks. A naive solution to node embedding for dynamic graphs is simply applying static methods to each snapshot of the dynamic graph; among its many potential shortcomings, such a solution clearly ignores the temporal dependencies between snapshots.\n\nSeveral node embedding methods have been proposed to capture the temporal graph evolution for both networks without attributes [10, 26] and attributed networks [24, 16].
However, all of the existing dynamic graph embedding approaches represent each node by a deterministic vector in a low-dimensional space [2]. Such deterministic representations lack the capability of modeling the uncertainty of node embeddings, which is a natural consideration when there are multiple information sources, i.e., node attributes and graph structure.\n\nIn this paper, we propose a novel node embedding method for dynamic graphs that maps each node to a random vector in the latent space. More specifically, we first introduce a dynamic graph autoencoder model, namely the graph recurrent neural network (GRNN), by extending the use of graph convolutional recurrent networks (GCRN) [21] to dynamic graphs. Then, we argue that GRNN lacks the expressive power to fully capture the complex dependencies between topological evolution and time-varying node attributes, because the output probability in standard RNNs is limited to either a simple unimodal distribution or a mixture of unimodal distributions [3, 22, 6, 8]. Next, to increase the expressive power of GRNN in addition to modeling the uncertainty of node latent representations, we propose the variational graph recurrent neural network (VGRNN) by adopting high-level latent random variables in GRNN. Our proposed VGRNN is capable of learning interpretable latent representations as well as better modeling of very sparse dynamic graphs.\n\nTo further boost the expressive power and interpretability of our new VGRNN method, we integrate semi-implicit variational inference [25] with VGRNN.\n\n∗Both authors contributed equally.\n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.
We show that the semi-implicit variational graph recurrent neural network (SI-VGRNN) is capable of inferring more flexible and complex posteriors. Our experiments demonstrate the superior performance of VGRNN and SI-VGRNN in dynamic link prediction tasks on several real-world dynamic graph datasets compared to baseline and state-of-the-art methods.\n\n2 Background\n\nGraph convolutional recurrent networks (GCRN). GCRN was introduced by Seo et al. [21] to model time series data defined over the nodes of a static graph. Series of frames in videos and spatio-temporal measurements on a network of sensors are two examples of such datasets. GCRN combines graph convolutional networks (GCN) [4] with recurrent neural networks (RNN) to capture spatial and temporal patterns in data. More precisely, given a graph G with N nodes, whose topology is determined by the adjacency matrix A ∈ R^{N×N}, and a sequence of node attributes X = {X^{(1)}, X^{(2)}, . . . , X^{(T)}}, GCRN reads the M-dimensional node attributes X^{(t)} ∈ R^{N×M} and updates its hidden state h_t ∈ R^p at each time step t:\n\nh_t = f(A, X^{(t)}, h_{t−1}).    (1)\n\nHere f is a non-probabilistic deep neural network. It can be any recurrent network with gated activation functions, such as long short-term memory (LSTM) or gated recurrent units (GRU), where the deep layers inside them are replaced by graph convolutional layers. GCRN models node attribute sequences by parameterizing a factorization of the joint probability distribution as a product of conditional probabilities such that\n\np(X^{(1)}, X^{(2)}, . . . , X^{(T)} | A) = ∏_{t=1}^{T} p(X^{(t)} | X^{(<t)}, A).
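To make the GCRN recurrence h_t = f(A, X^{(t)}, h_{t−1}) concrete, the following is a minimal NumPy sketch of a GRU cell whose dense layers are replaced by graph convolutions. This is an illustrative sketch under standard GCN/GRU conventions, not the authors' implementation: the symmetric normalization of A, the gate layout, and all names (`normalize_adj`, `GraphGRUCell`, etc.) are our own hypothetical choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def normalize_adj(A):
    # Symmetric GCN-style normalization with self-loops:
    # D^{-1/2} (A + I) D^{-1/2}
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gconv(A_norm, X, W):
    # One graph-convolutional layer: propagate over edges, then transform.
    return A_norm @ X @ W

class GraphGRUCell:
    """GRU cell with graph convolutions in place of dense layers,
    in the spirit of GCRN. Shapes: X is (N, M) node attributes,
    h is (N, p) hidden state, A_norm is the (N, N) normalized adjacency."""

    def __init__(self, in_dim, hid_dim, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        # One weight matrix per gate, acting on [X, h] concatenated.
        self.Wz = rng.normal(0, s, (in_dim + hid_dim, hid_dim))  # update gate
        self.Wr = rng.normal(0, s, (in_dim + hid_dim, hid_dim))  # reset gate
        self.Wh = rng.normal(0, s, (in_dim + hid_dim, hid_dim))  # candidate

    def __call__(self, A_norm, X, h):
        xh = np.concatenate([X, h], axis=1)
        z = sigmoid(gconv(A_norm, xh, self.Wz))          # update gate
        r = sigmoid(gconv(A_norm, xh, self.Wr))          # reset gate
        xrh = np.concatenate([X, r * h], axis=1)
        h_tilde = np.tanh(gconv(A_norm, xrh, self.Wh))   # candidate state
        return (1 - z) * h + z * h_tilde                 # h_t

# Usage over a sequence of snapshots with fixed topology A:
#   h = np.zeros((N, p))
#   for X_t in snapshots:
#       h = cell(normalize_adj(A), X_t, h)
```

Because the topology A is shared across time steps here, its normalization can be precomputed once; the same cell then realizes the autoregressive factorization above by conditioning each step on the hidden state summarizing X^{(<t)}.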