{"title": "GNNExplainer: Generating Explanations for Graph Neural Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 9244, "page_last": 9255, "abstract": "Graph Neural Networks (GNNs) are a powerful tool for machine learning on graphs.GNNs combine node feature information with the graph structure by \nrecursively passing neural messages along edges of the input graph. However, incorporating both graph structure and feature information leads to complex models,\nand explaining predictions made by GNNs remains unsolved. Here\nwe propose GNNExplainer, the first general, model-agnostic approach for providing interpretable explanations for predictions of any GNN-based model on any graph-based machine learning task. Given an instance, GNNExplainer identifies a compact subgraph structure and a small subset of node features that have a crucial role in GNN's prediction. \nFurther, GNNExplainer can generate consistent and concise explanations for an entire class of instances.\nWe formulate GNNExplainer as an optimization task that maximizes the mutual information between a GNN's prediction and distribution of possible subgraph structures. Experiments on synthetic and real-world graphs show that our approach can identify important graph structures as well as node features, and outperforms baselines by 17.1% on average. GNNExplainer provides a variety of benefits, from the ability to visualize semantically relevant structures to interpretability, to giving insights into errors of faulty GNNs.", "full_text": "GNNExplainer: Generating Explanations\n\nfor Graph Neural Networks\n\nRex Ying\u2020\n\nDylan Bourgeois\u2020,\u2021\n\nJure Leskovec\u2020\n\nMarinka Zitnik\u2020\n\u2020Department of Computer Science, Stanford University\n\nJiaxuan You\u2020\n\n\u2021Robust.AI\n\n{rexying, dtsbourg, jiaxuan, marinka, jure}@cs.stanford.edu\n\nAbstract\n\nGraph Neural Networks (GNNs) are a powerful tool for machine learning on\ngraphs. 
GNNs combine node feature information with the graph structure by\nrecursively passing neural messages along edges of the input graph. However, in-\ncorporating both graph structure and feature information leads to complex models\nand explaining predictions made by GNNs remains unsolved. Here we propose\nGNNEXPLAINER, the \ufb01rst general, model-agnostic approach for providing inter-\npretable explanations for predictions of any GNN-based model on any graph-based\nmachine learning task. Given an instance, GNNEXPLAINER identi\ufb01es a compact\nsubgraph structure and a small subset of node features that have a crucial role in\nGNN\u2019s prediction. Further, GNNEXPLAINER can generate consistent and concise\nexplanations for an entire class of instances. We formulate GNNEXPLAINER as an\noptimization task that maximizes the mutual information between a GNN\u2019s predic-\ntion and distribution of possible subgraph structures. Experiments on synthetic and\nreal-world graphs show that our approach can identify important graph structures\nas well as node features, and outperforms alternative baseline approaches by up to\n43.0% in explanation accuracy. GNNEXPLAINER provides a variety of bene\ufb01ts,\nfrom the ability to visualize semantically relevant structures to interpretability, to\ngiving insights into errors of faulty GNNs.\n\n1\n\nIntroduction\n\nIn many real-world applications, including social, information, chemical, and biological domains,\ndata can be naturally modeled as graphs [9, 41, 49]. Graphs are powerful data representations but\nare challenging to work with because they require modeling of rich relational information as well\nas node feature information [45, 46]. 
To address this challenge, Graph Neural Networks (GNNs)\nhave emerged as state-of-the-art for machine learning on graphs, due to their ability to recursively\nincorporate information from neighboring nodes in the graph, naturally capturing both graph structure\nand node features [16, 21, 40, 44].\nDespite their strengths, GNNs lack transparency as they do not easily allow for a human-intelligible\nexplanation of their predictions. Yet, the ability to understand GNN\u2019s predictions is important and\nuseful for several reasons: (i) it can increase trust in the GNN model, (ii) it improves model\u2019s\ntransparency in a growing number of decision-critical applications pertaining to fairness, privacy and\nother safety challenges [11], and (iii) it allows practitioners to get an understanding of the network\ncharacteristics, identify and correct systematic patterns of mistakes made by models before deploying\nthem in the real world.\nWhile currently there are no methods for explaining GNNs, recent approaches for explaining other\ntypes of neural networks have taken one of two main routes. One line of work locally approximates\n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.\n\n\fFigure 1: GNNEXPLAINER provides interpretable explanations for predictions made by any GNN model on any\ngraph-based machine learning task. Shown is a hypothetical node classi\ufb01cation task where a GNN model \u03a6 is\ntrained on a social interaction graph to predict future sport activities. 
Given a trained GNN \u03a6 and a prediction \u02c6yi\n= \u201cBasketball\u201d for person vi, GNNEXPLAINER generates an explanation by identifying a small subgraph of the\ninput graph together with a small subset of node features (shown on the right) that are most in\ufb02uential for \u02c6yi.\nExamining explanation for \u02c6yi, we see that many friends in one part of vi\u2019s social circle enjoy ball games, and so\nthe GNN predicts that vi will like basketball. Similarly, examining explanation for \u02c6yj, we see that vj\u2019s friends\nand friends of his friends enjoy water and beach sports, and so the GNN predicts \u02c6yj = \u201cSailing.\u201d\nmodels with simpler surrogate models, which are then probed for explanations [25, 29, 30]. Other\nmethods carefully examine models for relevant features and \ufb01nd good qualitative interpretations of\nhigh level features [6, 13, 27, 32] or identify in\ufb02uential input instances [23, 38]. However, these\napproaches fall short in their ability to incorporate relational information, the essence of graphs.\nSince this aspect is crucial for the success of machine learning on graphs, any explanation of GNN\u2019s\npredictions should leverage rich relational information provided by the graph as well as node features.\nHere we propose GNNEXPLAINER, an approach for explaining predictions made by GNNs. GNNEX-\nPLAINER takes a trained GNN and its prediction(s), and it returns an explanation in the form of a small\nsubgraph of the input graph together with a small subset of node features that are most in\ufb02uential for\nthe prediction(s) (Figure 1). The approach is model-agnostic and can explain predictions of any GNN\non any machine learning task for graphs, including node classi\ufb01cation, link prediction, and graph\nclassi\ufb01cation. It handles single- as well as multi-instance explanations. 
In the case of single-instance\nexplanations, GNNEXPLAINER explains a GNN\u2019s prediction for one particular instance (i.e., a node\nlabel, a new link, a graph-level label). In the case of multi-instance explanations, GNNEXPLAINER\nprovides an explanation that consistently explains a set of instances (e.g., nodes of a given class).\nGNNEXPLAINER speci\ufb01es an explanation as a rich subgraph of the entire graph the GNN was\ntrained on, such that the subgraph maximizes the mutual information with GNN\u2019s prediction(s).\nThis is achieved by formulating a mean \ufb01eld variational approximation and learning a real-valued\ngraph mask which selects the important subgraph of the GNN\u2019s computation graph. Simultaneously,\nGNNEXPLAINER also learns a feature mask that masks out unimportant node features (Figure 1).\nWe evaluate GNNEXPLAINER on synthetic as well as real-world graphs. Experiments show that\nGNNEXPLAINER provides consistent and concise explanations of GNN\u2019s predictions. On synthetic\ngraphs with planted network motifs, which play a role in determining node labels, we show that\nGNNEXPLAINER accurately identi\ufb01es the subgraphs/motifs as well as node features that determine\nnode labels outperforming alternative baseline approaches by up to 43.0% in explanation accuracy.\nFurther, using two real-world datasets we show how GNNEXPLAINER can provide important domain\ninsights by robustly identifying important graph structures and node features that in\ufb02uence a GNN\u2019s\npredictions. Speci\ufb01cally, using molecular graphs and social interaction networks, we show that\nGNNEXPLAINER can identify important domain-speci\ufb01c graph structures, such as N O2 chemical\ngroups or ring structures in molecules, and star structures in Reddit threads. 
Overall, experiments demonstrate that GNNEXPLAINER provides consistent and concise explanations for GNN-based models for different machine learning tasks on graphs.\n\n2 Related work\n\nAlthough the problem of explaining GNNs is not well-studied, the related problems of interpretability and neural debugging received substantial attention in machine learning. At a high level, we can group those interpretability methods for non-graph neural networks into two main families.\n\nFigure 2: A. GNN computation graph G_c (green and orange) for making prediction ŷ at node v. Some edges in G_c form important neural message-passing pathways (green), which allow useful node information to be propagated across G_c and aggregated at v for prediction, while other edges do not (orange). However, GNN needs to aggregate important as well as unimportant messages to form a prediction at node v, which can dilute the signal accumulated from v's neighborhood. The goal of GNNEXPLAINER is to identify a small set of important features and pathways (green) that are crucial for prediction. B. In addition to G_S (green), GNNEXPLAINER identifies what feature dimensions of G_S's nodes are important for prediction by learning a node feature mask.\n\nMethods in the first family formulate simple proxy models of full neural networks. This can be done in a model-agnostic way, usually by learning a locally faithful approximation around the prediction, for example through linear models [29] or sets of rules, representing sufficient conditions on the prediction [3, 25, 47]. Methods in the second family identify important aspects of the computation, for example, through feature gradients [13, 43], backpropagation of neurons' contributions to the input features [6, 31, 32], and counterfactual reasoning [19]. 
However, the saliency maps [43] produced\nby these methods have been shown to be misleading in some instances [2] and prone to issues like\ngradient saturation [31, 32]. These issues are exacerbated on discrete inputs such as graph adjacency\nmatrices since the gradient values can be very large but only on very small intervals. Because of that,\nsuch approaches are not suitable for explaining predictions made by neural networks on graphs.\nInstead of creating new, inherently interpretable models, post-hoc interpretability methods [1, 14, 15,\n17, 23, 38] consider models as black boxes and then probe them for relevant information. However, no\nwork has been done to leverage relational structures like graphs. The lack of methods for explaining\npredictions on graph-structured data is problematic, as in many cases, predictions on graphs are\ninduced by a complex combination of nodes and paths of edges between them. For example, in some\ntasks, an edge is important only when another alternative path exists in the graph to form a cycle, and\nthose two features, only when considered together, can accurately predict node labels [10, 12]. Their\njoint contribution thus cannot be modeled as a simple linear combinations of individual contributions.\nFinally, recent GNN models augment interpretability via attention mechanisms [28, 33, 34]. However,\nalthough the learned edge attention values can indicate important graph structure, the values are the\nsame for predictions across all nodes. Thus, this contradicts with many applications where an edge is\nessential for predicting the label of one node but not the label of another node. 
Furthermore, these approaches are either limited to specific GNN architectures or cannot explain predictions by jointly considering both graph structure and node feature information.\n\n3 Formulating explanations for graph neural networks\n\nLet G denote a graph on edges E and nodes V that are associated with d-dimensional node features X = {x_1, ..., x_n}, x_i ∈ R^d. Without loss of generality, we consider the problem of explaining a node classification task (see Section 4.4 for other tasks). Let f denote a label function on nodes f : V → {1, ..., C} that maps every node in V to one of C classes. The GNN model Φ is optimized on all nodes in the training set and is then used for prediction, i.e., to approximate f on new nodes.\n\n3.1 Background on graph neural networks\n\nAt layer l, the update of GNN model Φ involves three key computations [4, 45, 46]. (1) First, the model computes neural messages between every pair of nodes. The message for node pair (v_i, v_j) is a function MSG of v_i's and v_j's representations h_i^{l-1} and h_j^{l-1} in the previous layer and of the relation r_{ij} between the nodes: m_{ij}^l = MSG(h_i^{l-1}, h_j^{l-1}, r_{ij}). (2) Second, for each node v_i, GNN aggregates messages from v_i's neighborhood N_{v_i} and calculates an aggregated message M_i via an aggregation method AGG [16, 35]: M_i^l = AGG({m_{ij}^l | v_j ∈ N_{v_i}}), where N_{v_i} is the neighborhood of node v_i whose definition depends on a particular GNN variant. (3) Finally, GNN takes the aggregated message M_i^l along with v_i's representation h_i^{l-1} from the previous layer, and it non-linearly transforms them to obtain v_i's representation h_i^l at layer l: h_i^l = UPDATE(M_i^l, h_i^{l-1}). The final embedding for node v_i after L layers of computation is z_i = h_i^L. 
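As a concrete illustration of the three computations above, one message-passing layer can be sketched as follows. This is a minimal NumPy sketch under simplifying assumptions that are not part of the paper: MSG depends only on the neighbor's previous-layer representation (no relation r_{ij}), AGG is a mean, and UPDATE is a ReLU of a linear map of the concatenated aggregated message and the node's own representation.

```python
import numpy as np

def gnn_layer(A, H, W_msg, W_upd):
    """One illustrative message-passing layer.

    A: (n, n) binary adjacency matrix; H: (n, d) previous-layer representations.
    """
    n = A.shape[0]
    M = np.zeros((n, W_msg.shape[1]))
    for i in range(n):
        neighbors = np.nonzero(A[i])[0]
        if len(neighbors) == 0:
            continue
        msgs = H[neighbors] @ W_msg      # m_{ij}^l = MSG(h_j^{l-1})
        M[i] = msgs.mean(axis=0)         # M_i^l = AGG({m_{ij}^l | v_j in N_{v_i}})
    Z = np.concatenate([M, H], axis=1) @ W_upd
    return np.maximum(Z, 0.0)            # h_i^l = UPDATE(M_i^l, h_i^{l-1})

# Toy 3-node path graph with 2-dimensional node features.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.eye(3, 2)
rng = np.random.default_rng(0)
W_msg = rng.normal(size=(2, 4))
W_upd = rng.normal(size=(6, 2))
H1 = gnn_layer(A, H, W_msg, W_upd)
print(H1.shape)
```

Stacking L such layers and reading off the last output yields the final embeddings z_i = h_i^L used above; any GNN expressible in this MSG/AGG/UPDATE form is in scope for GNNEXPLAINER.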
Our GNNEXPLAINER provides explanations for any GNN\nthat can be formulated in terms of MSG, AGG, and UPDATE computations.\n\ni = UPDATE(M l\n\ni\n\ni , hl\u22121\n\ni\n\n3.2 GNNEXPLAINER: Problem formulation\n\nOur key insight is the observation that the computation graph of node v, which is de\ufb01ned by the\nGNN\u2019s neighborhood-based aggregation (Figure 2), fully determines all the information the GNN\nuses to generate prediction \u02c6y at node v. In particular, v\u2019s computation graph tells the GNN how to\ngenerate v\u2019s embedding z. Let us denote that computation graph by Gc(v), the associated binary\nadjacency matrix by Ac(v) \u2208 {0, 1}n\u00d7n, and the associated feature set by Xc(v) = {xj|vj \u2208 Gc(v)}.\nThe GNN model \u03a6 learns a conditional distribution P\u03a6(Y |Gc, Xc), where Y is a random variable\nrepresenting labels {1, . . . , C}, indicating the probability of nodes belonging to each of C classes.\nA GNN\u2019s prediction is given by \u02c6y = \u03a6(Gc(v), Xc(v)), meaning that it is fully determined by the\nmodel \u03a6, graph structural information Gc(v), and node feature information Xc(v). In effect, this\nobservation implies that we only need to consider graph structure Gc(v) and node features Xc(v)\nto explain \u02c6y (Figure 2A). Formally, GNNEXPLAINER generates explanation for prediction \u02c6y as\nS ), where GS is a small subgraph of the computation graph. XS is the associated feature of\n(GS, X F\nj |vj \u2208\nGS, and X F\nGS}) that are most important for explaining \u02c6y (Figure 2B).\n\nS is a small subset of node features (masked out by the mask F , i.e., X F\n\nS = {xF\n\n4 GNNEXPLAINER\n\nNext we describe our approach GNNEXPLAINER. 
Given a trained GNN model Φ and a prediction (i.e., single-instance explanation, Sections 4.1 and 4.2) or a set of predictions (i.e., multi-instance explanations, Section 4.3), GNNEXPLAINER will generate an explanation by identifying a subgraph of the computation graph and a subset of node features that are most influential for the model Φ's prediction. In the case of explaining a set of predictions, GNNEXPLAINER will aggregate individual explanations in the set and automatically summarize them with a prototype. We conclude this section with a discussion on how GNNEXPLAINER can be used for any machine learning task on graphs, including link prediction and graph classification (Section 4.4).\n\n4.1 Single-instance explanations\n\nGiven a node v, our goal is to identify a subgraph G_S ⊆ G_c and the associated features X_S = {x_j | v_j ∈ G_S} that are important for the GNN's prediction ŷ. For now, we assume that X_S is a small subset of d-dimensional node features; we will later discuss how to automatically determine which dimensions of node features need to be included in explanations (Section 4.2). We formalize the notion of importance using mutual information MI and formulate GNNEXPLAINER as the following optimization framework:\n\nmax_{G_S} MI(Y, (G_S, X_S)) = H(Y) − H(Y | G = G_S, X = X_S).     (1)\n\nFor node v, MI quantifies the change in the probability of prediction ŷ = Φ(G_c, X_c) when v's computation graph is limited to explanation subgraph G_S and its node features are limited to X_S. For example, consider the situation where v_j ∈ G_c(v_i), v_j ≠ v_i. Then, if removing v_j from G_c(v_i) strongly decreases the probability of prediction ŷ_i, the node v_j is a good counterfactual explanation for the prediction at v_i. Similarly, consider the situation where (v_j, v_k) ∈ G_c(v_i), v_j, v_k ≠ v_i. 
Then, if removing an edge between v_j and v_k strongly decreases the probability of prediction ŷ_i, then the absence of that edge is a good counterfactual explanation for the prediction at v_i.\n\nExamining Eq. (1), we see that the entropy term H(Y) is constant because Φ is fixed for a trained GNN. As a result, maximizing mutual information between the predicted label distribution Y and explanation (G_S, X_S) is equivalent to minimizing conditional entropy H(Y | G = G_S, X = X_S), which can be expressed as follows:\n\nH(Y | G = G_S, X = X_S) = −E_{Y|G_S,X_S}[log P_Φ(Y | G = G_S, X = X_S)].     (2)\n\nExplanation for prediction ŷ is thus a subgraph G_S that minimizes uncertainty of Φ when the GNN computation is limited to G_S. In effect, G_S maximizes the probability of ŷ (Figure 2). To obtain a compact explanation, we impose a constraint on G_S's size as |G_S| ≤ K_M, so that G_S has at most K_M nodes. In effect, this implies that GNNEXPLAINER aims to denoise G_c by taking the K_M edges that give the highest mutual information with the prediction.\n\nGNNEXPLAINER's optimization framework. Direct optimization of GNNEXPLAINER's objective is not tractable as G_c has exponentially many subgraphs G_S that are candidate explanations for ŷ. We thus consider a fractional adjacency matrix¹ for subgraphs G_S, i.e., A_S ∈ [0, 1]^{n×n}, and enforce the subgraph constraint as A_S[j, k] ≤ A_c[j, k] for all j, k. This continuous relaxation can be interpreted as a variational approximation of the distribution of subgraphs of G_c. In particular, if we treat G_S ∼ G as a random graph variable, the objective in Eq. 
(2) becomes:\n\nmin_G E_{G_S ∼ G} H(Y | G = G_S, X = X_S).     (3)\n\nWith a convexity assumption, Jensen's inequality gives the following upper bound:\n\nmin_G H(Y | G = E_G[G_S], X = X_S).     (4)\n\nIn practice, due to the complexity of neural networks, the convexity assumption does not hold. However, experimentally, we found that minimizing this objective with regularization often leads to a local minimum corresponding to high-quality explanations. To tractably estimate E_G, we use a mean-field variational approximation and decompose G into a multivariate Bernoulli distribution as: P_G(G_S) = ∏_{(j,k)∈G_c} A_S[j, k]. This allows us to estimate the expectation with respect to the mean-field approximation, thereby obtaining A_S in which the (j, k)-th entry represents the expectation on whether edge (v_j, v_k) exists. We observed empirically that this approximation together with a regularizer for promoting discreteness [40] converges to good local minima despite the non-convexity of GNNs. The conditional entropy in Equation 4 can be optimized by replacing E_G[G_S] with a masked adjacency matrix of the computation graph, A_c ⊙ σ(M), where M ∈ R^{n×n} denotes the mask that we need to learn, ⊙ denotes element-wise multiplication, and σ denotes the sigmoid that maps the mask to [0, 1]^{n×n}.\n\nIn some applications, instead of finding an explanation in terms of the model's confidence, the users care more about "why does the trained model predict a certain class label", or "how to make the trained model predict a desired class label". We can modify the conditional entropy objective in Equation 4 with a cross entropy objective between the label class and the model prediction². 
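The masked-adjacency relaxation A_c ⊙ σ(M) can be sketched end to end as follows. This is a toy NumPy sketch, not the paper's implementation: the one-layer softmax model standing in for the trained GNN Φ, the finite-difference gradients (used here instead of backpropagation for brevity), and all sizes, learning rates, and the ring-graph computation graph are illustrative assumptions.

```python
import numpy as np

def sigmoid(m):
    return 1.0 / (1.0 + np.exp(-m))

def phi(A, X, W):
    """Toy stand-in for a trained GNN: one aggregation step + softmax."""
    h = A @ X @ W
    e = np.exp(h - h.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def loss(M, Ac, X, W, v, label):
    """Cross-entropy of node v's prediction under the masked graph Ac * sigma(M)."""
    p = phi(Ac * sigmoid(M), X, W)
    return -np.log(p[v, label] + 1e-12)

# Toy computation graph: a 6-node ring with one chord, random features.
n, d, C = 6, 4, 2
Ac = np.zeros((n, n))
for j in range(n):
    Ac[j, (j + 1) % n] = Ac[(j + 1) % n, j] = 1.0
Ac[0, 3] = Ac[3, 0] = 1.0
rng = np.random.default_rng(0)
X = rng.normal(size=(n, d))
W = rng.normal(size=(d, C))
v = 0
label = int(phi(Ac, X, W)[v].argmax())   # explain the model's own predicted class

# Gradient descent on the mask M.
M = np.zeros((n, n))
lr, eps = 0.5, 1e-5
losses = [loss(M, Ac, X, W, v, label)]
for _ in range(50):
    G = np.zeros_like(M)
    base = losses[-1]
    for j in range(n):
        for k in range(n):
            if Ac[j, k] == 0:
                continue                  # only edges present in Gc are masked
            Mp = M.copy()
            Mp[j, k] += eps
            G[j, k] = (loss(Mp, Ac, X, W, v, label) - base) / eps
    M -= lr * G
    losses.append(loss(M, Ac, X, W, v, label))

edge_importance = Ac * sigmoid(M)        # threshold this to read off G_S
```

In a real setting M is trained by backpropagating through the GNN itself, together with the size and discreteness regularizers described later in this section; thresholding the learned edge-importance values then yields the explanation subgraph G_S.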
To answer these queries, a computationally efficient version of GNNEXPLAINER's objective, which we optimize using gradient descent, is as follows:\n\nmin_M − ∑_{c=1}^{C} 1[y = c] log P_Φ(Y = y | G = A_c ⊙ σ(M), X = X_c).     (5)\n\nThe masking approach is also found in Neural Relational Inference [22], albeit with a different motivation and objective. Lastly, we compute the element-wise multiplication of σ(M) and A_c and remove low values in M through thresholding to arrive at the explanation G_S for the GNN model's prediction ŷ at node v.\n\n¹For typed edges, we define G_S ∈ [0, 1]^{C_e×n×n}, where C_e is the number of edge types.\n²The label class is the predicted label class by the GNN model to be explained, when answering "why does the trained model predict a certain class label"; "how to make the trained model predict a desired class label" can be answered by using the ground-truth label class.\n\n4.2 Joint learning of graph structural and node feature information\n\nTo identify what node features are most important for prediction ŷ, GNNEXPLAINER learns a feature selector F for nodes in explanation G_S. Instead of defining X_S to consist of all node features, i.e., X_S = {x_j | v_j ∈ G_S}, GNNEXPLAINER considers X_S^F as a subset of features of nodes in G_S, which are defined through a binary feature selector F ∈ {0, 1}^d (Figure 2B):\n\nX_S^F = {x_j^F | v_j ∈ G_S},  x_j^F = [x_{j,t_1}, ..., x_{j,t_k}] for F_{t_i} = 1,     (6)\n\nwhere x_j^F has the node features that are not masked out by F. Explanation (G_S, X_S) is then jointly optimized for maximizing the mutual information objective:\n\nmax_{G_S,F} MI(Y, (G_S, F)) = H(Y) − H(Y | G = G_S, X = X_S^F),     (7)\n\nwhich represents a modified objective function from Eq. (1) that considers structural and node feature information to generate an explanation for prediction ŷ.\n\nLearning binary feature selector F. We specify X_S^F as X_S ⊙ F, where F acts as a feature mask that we need to learn. Intuitively, if a particular feature is not important, the corresponding weights in GNN's weight matrix take values close to zero. In effect, this implies that masking the feature out does not decrease the predicted probability for ŷ. Conversely, if the feature is important, then masking it out would decrease the predicted probability. However, in some cases this approach ignores features that are important for prediction but take values close to zero. To address this issue we marginalize over all feature subsets and use a Monte Carlo estimate to sample from the empirical marginal distribution for nodes in X_S during training [48]. Further, we use a reparametrization trick [20] to backpropagate gradients in Eq. (7) to the feature mask F. In particular, to backpropagate through a d-dimensional random variable X we reparametrize X as X = Z + (X_S − Z) ⊙ F, s.t. ∑_j F_j ≤ K_F, where Z is a d-dimensional random variable sampled from the empirical distribution and K_F is a parameter representing the maximum number of features to be kept in the explanation.\n\nIntegrating additional constraints into explanations. To impose further properties on the explanation we can extend GNNEXPLAINER's objective function in Eq. (7) with regularization terms. For example, we use element-wise entropy to encourage structural and node feature masks to be discrete. Further, GNNEXPLAINER can encode domain-specific constraints through techniques like Lagrange multipliers of constraints or additional regularization terms. We include a number of regularization terms to produce explanations with desired properties. 
We penalize large size of the explanation by adding the sum of all elements of the mask parameters as the regularization term.\n\nFinally, it is important to note that each explanation must be a valid computation graph. In particular, explanation (G_S, X_S) needs to allow GNN's neural messages to flow towards node v such that GNN can make prediction ŷ. Importantly, GNNEXPLAINER automatically provides explanations that represent valid computation graphs because it optimizes structural masks across entire computation graphs. Even if a disconnected edge is important for neural message-passing, it will not be selected for explanation as it cannot influence GNN's prediction. In effect, this implies that the explanation G_S tends to be a small connected subgraph.\n\n4.3 Multi-instance explanations through graph prototypes\n\nThe output of a single-instance explanation (Sections 4.1 and 4.2) is a small subgraph of the input graph and a small subset of associated node features that are most influential for a single prediction. To answer questions like "How did a GNN predict that a given set of nodes all have label c?", we need to obtain a global explanation of class c. Our goal here is to provide insight into how the identified subgraph for a particular node relates to a graph structure that explains an entire class. GNNEXPLAINER can provide multi-instance explanations based on graph alignments and prototypes. Our approach has two stages:\n\nFirst, for a given class c (or, any set of predictions that we want to explain), we first choose a reference node v_c, for example, by computing the mean embedding of all nodes assigned to c. We then take explanation G_S(v_c) for reference v_c and align it to explanations of other nodes assigned to class c. Finding optimal matching of large graphs is challenging in practice. 
However, the single-instance GNNEXPLAINER generates small graphs (Section 4.2) and thus near-optimal pairwise graph matchings can be efficiently computed.\n\nSecond, we aggregate aligned adjacency matrices into a graph prototype A_proto using, for example, a robust median-based approach. Prototype A_proto gives insights into graph patterns shared between nodes that belong to the same class. One can then study prediction for a particular node by comparing the explanation for that node's prediction (i.e., returned by the single-instance explanation approach) to the prototype (see Appendix for more information).\n\n4.4 GNNEXPLAINER model extensions\n\nAny machine learning task on graphs. In addition to explaining node classification, GNNEXPLAINER provides explanations for link prediction and graph classification with no change to its optimization algorithm. When predicting a link (v_j, v_k), GNNEXPLAINER learns two masks X_S(v_j) and X_S(v_k) for both endpoints of the link. When classifying a graph, the adjacency matrix in Eq. (5) is the union of adjacency matrices for all nodes in the graph whose label we want to explain. However, note that in graph classification, unlike node classification, due to the aggregation of node embeddings, it is no longer true that the explanation G_S is necessarily a connected subgraph. Depending on the application, in some scenarios such as chemistry where the explanation is a functional group and should be connected, one can extract the largest connected component as the explanation.\n\nAny GNN model. Modern GNNs are based on message passing architectures on the input graph. The message passing computation graphs can be composed in many different ways and GNNEXPLAINER can account for all of them. 
Thus, GNNEXPLAINER can be applied to: Graph Convolutional\nNetworks [21], Gated Graph Sequence Neural Networks [26], Jumping Knowledge Networks [36],\nAttention Networks [33], Graph Networks [4], GNNs with various node aggregation schemes [7, 5, 18,\n16, 40, 39, 35], Line-Graph NNs [8], position-aware GNN [42], and many other GNN architectures.\nComputational complexity. The number of parameters in GNNEXPLAINER\u2019s optimization depends\non the size of computation graph Gc for node v whose prediction we aim to explain. In particular,\nGc(v)\u2019s adjacency matrix Ac(v) is equal to the size of the mask M, which needs to be learned\nby GNNEXPLAINER. However, since computation graphs are typically relatively small, compared\nto the size of exhaustive L-hop neighborhoods (e.g., 2-3 hop neighborhoods [21], sampling-based\nneighborhoods [39], neighborhoods with attention [33]), GNNEXPLAINER can effectively generate\nexplanations even when input graphs are large.\n\n5 Experiments\n\nWe begin by describing the graphs, alternative baseline approaches, and experimental setup. We then\npresent experiments on explaining GNNs for node classi\ufb01cation and graph classi\ufb01cation tasks. Our\nqualitative and quantitative analysis demonstrates that GNNEXPLAINER is accurate and effective in\nidentifying explanations, both in terms of graph structure and node features.\nSynthetic datasets. We construct four kinds of node classi\ufb01cation datasets (Table 1). (1) In BA-\nSHAPES, we start with a base Barab\u00b4asi-Albert (BA) graph on 300 nodes and a set of 80 \ufb01ve-node\n\u201chouse\u201d-structured network motifs, which are attached to randomly selected nodes of the base graph.\nThe resulting graph is further perturbed by adding 0.1N random edges. Nodes are assigned to 4\nclasses based on their structural roles. In a house-structured motif, there are 3 types of roles: the top,\nmiddle and bottom node of the house. 
Therefore there are 4 different classes, corresponding to nodes\nat the top, middle, bottom of houses, and nodes that do not belong to a house. (2) BA-COMMUNITY\ndataset is a union of two BA-SHAPES graphs. Nodes have normally distributed feature vectors and\nare assigned to one of 8 classes based on their structural roles and community memberships. (3)\nIn TREE-CYCLES, we start with a base 8-level balanced binary tree and 80 six-node cycle motifs,\nwhich are attached to random nodes of the base graph. (4) TREE-GRID is the same as TREE-CYCLES\nexcept that 3-by-3 grid motifs are attached to the base tree graph in place of cycle motifs.\nReal-world datasets. We consider two graph classi\ufb01cation datasets: (1) MUTAG is a dataset of\n4,337 molecule graphs labeled according to their mutagenic effect on the Gram-negative bacterium S.\ntyphimurium [10]. (2) REDDIT-BINARY is a dataset of 2,000 graphs, each representing an online\ndiscussion thread on Reddit. In each graph, nodes are users participating in a thread, and edges\nindicate that one user replied to another user\u2019s comment. Graphs are labeled according to the type of\nuser interactions in the thread: r/IAmA and r/AskReddit contain Question-Answer interactions, while\nr/TrollXChromosomes and r/atheism contain Online-Discussion interactions [37].\nAlternative baseline approaches. Many explainability methods cannot be directly applied to graphs\n(Section 2). Nevertheless, we here consider the following alternative approaches that can provide\ninsights into predictions made by GNNs: (1) GRAD is a gradient-based method. We compute gradient\nof the GNN\u2019s loss function with respect to the adjacency matrix and the associated node features,\nsimilar to a saliency map approach. 
(2) ATT is a graph attention GNN (GAT) [33] that learns attention weights for edges in the computation graph, which we use as a proxy measure of edge importance. While ATT does consider graph structure, it does not explain using node features and can only explain GAT models. Furthermore, in ATT it is not obvious which attention weights should be used for edge importance, since a 1-hop neighbor of a node can also be a 2-hop neighbor of the same node due to cycles. Each edge's importance is thus computed as the average attention weight across all layers.

Table 1: Illustration of synthetic datasets (refer to "Synthetic datasets" for details) together with performance evaluation of GNNEXPLAINER and alternative baseline explainability approaches.

Figure 3: Evaluation of single-instance explanations. A-B. Shown are exemplar explanation subgraphs for the node classification task on four synthetic datasets. Each method provides an explanation for the red node's prediction.

Setup and implementation details. For each dataset, we first train a single GNN and use GRAD and GNNEXPLAINER to explain the predictions made by the GNN. Note that the ATT baseline requires a graph attention architecture like GAT [33]. We thus train a separate GAT model on the same dataset and use the learned edge attention weights for explanation. Hyperparameters KM and KF control the size of subgraph and feature explanations, respectively, and are informed by prior knowledge about the dataset. For synthetic datasets, we set KM to be the size of the ground truth. On real-world datasets, we set KM = 10. We set KF = 5 for all datasets. We further fix our weight regularization hyperparameters across all node and graph classification experiments. We refer readers to the Appendix for more training details (code and datasets are available at https://github.com/RexYing/gnn-model-explainer).

Results.
We investigate the following questions: Does GNNEXPLAINER provide sensible explanations? How do explanations compare to the ground-truth knowledge? How does GNNEXPLAINER perform on various graph-based prediction tasks? Can it explain predictions made by different GNNs?

1) Quantitative analyses. Results on node classification datasets are shown in Table 1. We have ground-truth explanations for the synthetic datasets and use them to calculate explanation accuracy for all explanation methods. Specifically, we formalize the explanation problem as a binary classification task, where edges in the ground-truth explanation are treated as labels and importance weights given by an explainability method are viewed as prediction scores. A better explainability method predicts high scores for edges that are in the ground-truth explanation, and thus achieves higher explanation accuracy. Results show that GNNEXPLAINER outperforms alternative approaches by 17.1% on average. Further, GNNEXPLAINER achieves up to 43.0% higher accuracy on the hardest TREE-GRID dataset.

[Table 1 reports explanation accuracy (GNNEXPLAINER / GRAD / ATT): BA-SHAPES 0.925 / 0.882 / 0.815; BA-COMMUNITY 0.836 / 0.750 / 0.739; TREE-CYCLES 0.948 / 0.905 / 0.824; TREE-GRID 0.875 / 0.667 / 0.612. Figures 3-4 compare, per dataset, the computation graph with explanations from GRAD, ATT, and GNNEXPLAINER against the ground truth.]

Figure 4: Evaluation of single-instance explanations. A-B. Shown are exemplar explanation subgraphs for the graph classification task on two datasets, MUTAG and REDDIT-BINARY.

Figure 5: Visualization of features that are important for a GNN's prediction. A. Shown is a representative molecular graph from the MUTAG dataset (top). Importance of the associated graph features is visualized with a heatmap (bottom). In contrast with baselines, GNNEXPLAINER correctly identifies features that are important for predicting the molecule's mutagenicity, i.e., C, O, H, and N atoms. B. Shown is a computation graph of a red node from the BA-COMMUNITY dataset (top). Again, GNNEXPLAINER successfully identifies the node feature that is important for predicting the structural role of the node, but baseline methods fail.

2) Qualitative analyses. Results are shown in Figures 3–5. In a topology-based prediction task with no node features, e.g., BA-SHAPES and TREE-CYCLES, GNNEXPLAINER correctly identifies the network motifs that explain node labels, i.e., structural labels (Figure 3). As illustrated in the figures, house, cycle, and tree motifs are identified by GNNEXPLAINER but not by the baseline methods. In Figure 4, we investigate explanations for the graph classification task. In the MUTAG example, colors indicate node features, which represent atoms (hydrogen H, carbon C, etc.). GNNEXPLAINER correctly identifies the carbon ring as well as the chemical groups NH2 and NO2, which are known to be mutagenic [10]. Further, in the REDDIT-BINARY example, we see that Question-Answer graphs (2nd row in Figure 4B) have 2-3 high-degree nodes that simultaneously connect to many low-degree nodes, which makes sense because in QA threads on Reddit we typically have 2-3 experts who all answer many different questions [24]. Conversely, we observe that discussion patterns commonly exhibit tree-like structure (2nd row in Figure 4A), since a thread on Reddit is usually a reaction to a single topic [24]. On the other hand, the GRAD and ATT methods give incorrect or incomplete explanations.
For example, both baseline methods miss cycle motifs in the MUTAG dataset and the more complex grid motifs in the TREE-GRID dataset. Furthermore, although edge attention weights in ATT can be interpreted as importance scores for message passing, the weights are shared across all nodes in the input graph, and as such ATT fails to provide high-quality single-instance explanations.

An essential criterion for explanations is that they must be interpretable, i.e., provide a qualitative understanding of the relationship between the input nodes and the prediction. Such a requirement implies that explanations should be easy to understand while remaining exhaustive. This means that a GNN explainer should take into account both the structure of the underlying graph and the associated features when they are available. Figure 5 shows results of an experiment in which GNNEXPLAINER jointly considers structural information as well as information from a small number of feature dimensions³. While GNNEXPLAINER indeed highlights a compact feature representation in Figure 5, gradient-based approaches struggle to cope with the added noise, giving high importance scores to irrelevant feature dimensions.

6 Conclusion

We present GNNEXPLAINER, a novel method for explaining predictions of any GNN on any graph-based machine learning task without requiring modification of the underlying GNN architecture or re-training. We show how GNNEXPLAINER can leverage the recursive neighborhood-aggregation scheme of graph neural networks to identify important graph pathways as well as highlight relevant node feature information that is passed along edges of the pathways.
While the problem of explainability of machine-learning predictions has received substantial attention in recent literature, our work is unique in the sense that it presents an approach that operates on relational structures (graphs with rich node features) and provides a straightforward interface for making sense of GNN predictions, debugging GNN models, and identifying systematic patterns of mistakes.

³Feature explanations are shown for the two datasets with node features, i.e., MUTAG and BA-COMMUNITY.

Acknowledgments

Jure Leskovec is a Chan Zuckerberg Biohub investigator. We gratefully acknowledge the support of DARPA under FA865018C7880 (ASED) and MSC; NIH under No. U54EB020405 (Mobilize); ARO under No. 38796-Z8424103 (MURI); IARPA under No. 2017-17071900005 (HFC); NSF under No. OAC-1835598 (CINES) and HDR; Stanford Data Science Initiative, Chan Zuckerberg Biohub, JD.com, Amazon, Boeing, Docomo, Huawei, Hitachi, Observe, Siemens, UST Global. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views, policies, or endorsements, either expressed or implied, of DARPA, NIH, ONR, or the U.S. Government.

References

[1] A. Adadi and M. Berrada. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access, 6:52138–52160, 2018.

[2] J. Adebayo, J. Gilmer, M. Muelly, I. Goodfellow, M. Hardt, and B. Kim. Sanity checks for saliency maps.
In NeurIPS, 2018.

[3] M. G. Augasta and T. Kathirvalavakumar. Reverse engineering the neural networks for rule extraction in classification problems. Neural Processing Letters, 35(2):131–150, 2012.

[4] P. W. Battaglia, J. B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V. Zambaldi, M. Malinowski, A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner, et al. Relational inductive biases, deep learning, and graph networks. arXiv:1806.01261, 2018.

[5] J. Chen, J. Zhu, and L. Song. Stochastic training of graph convolutional networks with variance reduction. In ICML, 2018.

[6] J. Chen, L. Song, M. J. Wainwright, and M. I. Jordan. Learning to explain: An information-theoretic perspective on model interpretation. arXiv:1802.07814, 2018.

[7] J. Chen, T. Ma, and C. Xiao. FastGCN: Fast learning with graph convolutional networks via importance sampling. In ICLR, 2018.

[8] Z. Chen, L. Li, and J. Bruna. Supervised community detection with line graph neural networks. In ICLR, 2019.

[9] E. Cho, S. Myers, and J. Leskovec. Friendship and mobility: User movement in location-based social networks. In KDD, 2011.

[10] A. Debnath et al. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity. Journal of Medicinal Chemistry, 34(2):786–797, 1991.

[11] F. Doshi-Velez and B. Kim. Towards a rigorous science of interpretable machine learning. arXiv:1702.08608, 2017.

[12] D. Duvenaud et al. Convolutional networks on graphs for learning molecular fingerprints. In NIPS, 2015.

[13] D. Erhan, Y. Bengio, A. Courville, and P. Vincent. Visualizing higher-layer features of a deep network. Technical Report 1341, University of Montreal, 2009.

[14] A. Fisher, C. Rudin, and F. Dominici.
All models are wrong but many are useful: Variable importance for black-box, proprietary, or misspecified prediction models, using model class reliance. arXiv:1801.01489, 2018.

[15] R. Guidotti et al. A survey of methods for explaining black box models. ACM Computing Surveys, 51(5):93:1–93:42, 2018.

[16] W. Hamilton, Z. Ying, and J. Leskovec. Inductive representation learning on large graphs. In NIPS, 2017.

[17] G. Hooker. Discovering additive structure in black box functions. In KDD, 2004.

[18] W. B. Huang, T. Zhang, Y. Rong, and J. Huang. Adaptive sampling towards fast graph representation learning. In NeurIPS, 2018.

[19] B. Kang, J. Lijffijt, and T. De Bie. ExplaiNE: An approach for explaining network embedding-based link predictions. arXiv:1904.12694, 2019.

[20] D. P. Kingma and M. Welling. Auto-encoding variational Bayes. In ICLR, 2014.

[21] T. N. Kipf and M. Welling. Semi-supervised classification with graph convolutional networks. In ICLR, 2016.

[22] T. Kipf, E. Fetaya, K.-C. Wang, M. Welling, and R. Zemel. Neural relational inference for interacting systems. In ICML, 2018.

[23] P. W. Koh and P. Liang. Understanding black-box predictions via influence functions. In ICML, 2017.

[24] S. Kumar, W. L. Hamilton, J. Leskovec, and D. Jurafsky. Community interaction and conflict on the web. In WWW, pages 933–943, 2018.

[25] H. Lakkaraju, E. Kamar, R. Caruana, and J. Leskovec. Interpretable & explorable approximations of black box models, 2017.

[26] Y. Li, D. Tarlow, M. Brockschmidt, and R. Zemel. Gated graph sequence neural networks. arXiv:1511.05493, 2015.

[27] S. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions. In NIPS, 2017.

[28] D. Neil et al. Interpretable graph convolutional neural networks for inference on noisy knowledge graphs.
In ML4H Workshop at NeurIPS, 2018.

[29] M. Ribeiro, S. Singh, and C. Guestrin. Why should I trust you?: Explaining the predictions of any classifier. In KDD, 2016.

[30] G. J. Schmitz, C. Aldrich, and F. S. Gouws. ANN-DT: An algorithm for extraction of decision trees from artificial neural networks. IEEE Transactions on Neural Networks, 1999.

[31] A. Shrikumar, P. Greenside, and A. Kundaje. Learning important features through propagating activation differences. In ICML, 2017.

[32] M. Sundararajan, A. Taly, and Q. Yan. Axiomatic attribution for deep networks. In ICML, 2017.

[33] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio. Graph attention networks. In ICLR, 2018.

[34] T. Xie and J. Grossman. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Physical Review Letters, 2018.

[35] K. Xu, W. Hu, J. Leskovec, and S. Jegelka. How powerful are graph neural networks? In ICLR, 2019.

[36] K. Xu, C. Li, Y. Tian, T. Sonobe, K. Kawarabayashi, and S. Jegelka. Representation learning on graphs with jumping knowledge networks. In ICML, 2018.

[37] P. Yanardag and S. V. N. Vishwanathan. Deep graph kernels. In KDD, pages 1365–1374, 2015.

[38] C. Yeh, J. Kim, I. Yen, and P. Ravikumar. Representer point selection for explaining deep neural networks. In NeurIPS, 2018.

[39] R. Ying, R. He, K. Chen, P. Eksombatchai, W. Hamilton, and J. Leskovec. Graph convolutional neural networks for web-scale recommender systems. In KDD, 2018.

[40] Z. Ying, J. You, C. Morris, X. Ren, W. Hamilton, and J. Leskovec. Hierarchical graph representation learning with differentiable pooling. In NeurIPS, 2018.

[41] J. You, B. Liu, R. Ying, V. Pande, and J. Leskovec. Graph convolutional policy network for goal-directed molecular graph generation. In NeurIPS, 2018.

[42] J. You, R. Ying, and J. Leskovec.
Position-aware graph neural networks. In ICML, 2019.

[43] M. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. In ECCV, 2014.

[44] M. Zhang and Y. Chen. Link prediction based on graph neural networks. In NIPS, 2018.

[45] Z. Zhang, P. Cui, and W. Zhu. Deep learning on graphs: A survey. arXiv:1812.04202, 2018.

[46] J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu, and M. Sun. Graph neural networks: A review of methods and applications. arXiv:1812.08434, 2018.

[47] J. Zilke, E. Loza Mencia, and F. Janssen. DeepRED: Rule extraction from deep neural networks. In Discovery Science, Springer International Publishing, 2016.

[48] L. Zintgraf, T. Cohen, T. Adel, and M. Welling. Visualizing deep neural network decisions: Prediction difference analysis. In ICLR, 2017.

[49] M. Zitnik, M. Agrawal, and J. Leskovec. Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics, 34, 2018.