{"title": "Inferring Networks From Random Walk-Based Node Similarities", "book": "Advances in Neural Information Processing Systems", "page_first": 3704, "page_last": 3715, "abstract": "Digital presence in the world of online social media entails significant privacy risks. In this work we consider a privacy threat to a social network in which an attacker has access to a subset of random walk-based node similarities, such as effective resistances (i.e., commute times) or personalized PageRank scores. Using these similarities, the attacker seeks to infer as much information as possible about the network, including unknown pairwise node similarities and edges.\n\nFor the effective resistance metric, we show that with just a small subset of measurements, one can learn a large fraction of edges in a social network. We also show that it is possible to learn a graph which accurately matches the underlying network on all other effective resistances. This second observation is interesting from a data mining perspective, since it can be expensive to compute all effective resistances or other random walk-based similarities. As an alternative, our graphs learned from just a subset of effective resistances can be used as surrogates in a range of applications that use effective resistances to probe graph structure, including for graph clustering, node centrality evaluation, and anomaly detection. \n\nWe obtain our results by formalizing the graph learning objective mathematically, using two optimization problems. One formulation is convex and can be solved provably in polynomial time. The other is not, but we solve it efficiently with projected gradient and coordinate descent. We demonstrate the effectiveness of these methods on a number of social networks obtained from Facebook. We also discuss how our methods can be generalized to other random walk-based similarities, such as personalized PageRank scores. 
Our code is available at https://github.com/cnmusco/graph-similarity-learning.", "full_text": "Inferring Networks From Random Walk-Based Node Similarities\n\nJeremy G. Hoskins\n\nDepartment of Mathematics\n\nYale University\nNew Haven, CT\n\njeremy.hoskins@yale.edu\n\nChristopher Musco\n\nDepartment of Computer Science\n\nPrinceton University\n\nPrinceton, NJ\n\ncmusco@cs.princeton.edu\n\nCameron Musco\nMicrosoft Research\n\nCambridge, MA\n\ncamusco@microsoft.com\n\nCharalampos E. Tsourakakis\nDepartment of Computer Science\n\nBoston University & Harvard University\n\nBoston, MA\n\nctsourak@bu.edu\n\nAbstract\n\nDigital presence in the world of online social media entails significant privacy risks\n[31, 56]. In this work we consider a privacy threat to a social network in which an\nattacker has access to a subset of random walk-based node similarities, such as\neffective resistances (i.e., commute times) or personalized PageRank scores. Using\nthese similarities, the attacker seeks to infer as much information as possible about\nthe network, including unknown pairwise node similarities and edges.\nFor the effective resistance metric, we show that with just a small subset of\nmeasurements, one can learn a large fraction of edges in a social network. We also show\nthat it is possible to learn a graph which accurately matches the underlying network\non all other effective resistances. This second observation is interesting from a data\nmining perspective, since it can be expensive to compute all effective resistances\nor other random walk-based similarities. As an alternative, our graphs learned\nfrom just a subset of effective resistances can be used as surrogates in a range of\napplications that use effective resistances to probe graph structure, including for\ngraph clustering, node centrality evaluation, and anomaly detection.\nWe obtain our results by formalizing the graph learning objective mathematically,\nusing two optimization problems. 
One formulation is convex and can be solved\nprovably in polynomial time. The other is not, but we solve it efficiently with\nprojected gradient and coordinate descent. We demonstrate the effectiveness of\nthese methods on a number of social networks obtained from Facebook. We\nalso discuss how our methods can be generalized to other random walk-based\nsimilarities, such as personalized PageRank scores. Our code is available at\nhttps://github.com/cnmusco/graph-similarity-learning.\n\n1 Introduction\n\nIn graph mining and social network science, a variety of measures are used to quantify the similarity\nbetween nodes in a graph, including the shortest path distance, Jaccard\u2019s coefficient between node\nneighborhoods, the Adamic-Adar coefficient [2], and hub-authority-based metrics [30, 9].\nAn important family of similarity measures is based on random walks, including SimRank [23],\nrandom walks with restarts [50], commute times [18], personalized PageRank [39, 24], and DeepWalk\nembeddings [40]. These measures capture both local and global graph structure and hence are widely\nused in graph clustering and community detection [4, 44], anomaly detection [42], collaborative\nfiltering [18, 45, 55], link prediction [35], computer vision [20], and many other applications.\n\n32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montr\u00e9al, Canada.\n\nIn this work we focus on these random walk-based similarity metrics. We initiate the study of a\nfundamental question:\n\nHow much information about a network can be learned given access to a subset of\npotentially noisy estimates of pairwise node similarities?\n\nThis question is important from a privacy perspective. A common privacy breach is social link\ndisclosure [6, 56], in which an attacker attempts to learn potentially sensitive links between nodes in\na network. 
Such attacks are very common; fake accounts are used to infiltrate social groups, potential\nemployers may want to inspect a job candidate\u2019s social network, and advertisers may wish to probe\na user\u2019s information to offer targeted ads. Thus, studying the ability of an attacker to reveal link\ninformation using node similarities is important in understanding the privacy implications of releasing\nsimilarities, or information that can be used to compute them.\nThere are many scenarios in which node similarities may be released, either directly or indirectly,\nwith the potential to reveal private link information. For example, when searching for users on a\nsocial network platform, node similarity is indirectly revealed since similar users (in terms of social\nconnections) are often displayed together in search results. As a second example, random walk-based\ngraph embeddings (e.g., PageRank or DeepWalk embeddings) may be released publicly for research\npurposes since, naively, they appear to contain no identifying information.\nFrom a data mining perspective, computing all pairwise node similarities can be infeasible for large\nnetworks since the number of similarities grows quadratically in the number of nodes. Additionally,\nwhen the network cannot be accessed in full but can only be probed via crawling [28], we may\nonly have access to similarity estimates rather than their exact values. Thus, understanding what\ninformation can be learned from a partial, potentially noisy, set of node similarities is important when\nusing these metrics in large scale graph mining.\nFinally, in some scenarios, it may be possible to measure node similarities for an underlying graph,\nwhich we cannot directly access but wish to recover. For example, in evolutionary ecology, effective\nresistance distances in planar \u201cenvironment graphs\u201d have been shown to correlate with genetic\ndifferentiation in geographically distributed populations [36, 37, 41]. 
In this context, measurements\nof geographic genetic variation give incomplete and noisy measurements of effective resistances.\nRecovering an underlying graph from these measurements corresponds to recovering plausible\nlimitations on migration and movement that could have led to the observed genetic variations.\n\n1.1 Learning from Effective Resistances\n\nIn this paper, we focus on commute times, which are one of the most widely used random walk-based\nsimilarities. Commute times are a scaled version of effective resistances, they form a metric, and\nhave many algorithmic applications [47]. Our ideas can be extended to related similarity measures,\nsuch as personalized PageRank, which we discuss in Appendix E. It was shown in the seminal work\nof Liben-Nowell and Kleinberg that effective resistances can be used to predict a significant fraction\nof future links appearing in networks from existing links, typically ranging from 5% up to 33% [35].\nA difficulty associated with this task is that, in contrast to local similarity measures such as the\nnumber of common neighbors or the Adamic-Adar coefficient [2], node similarity under the effective\nresistance metric does not necessarily imply local connectivity. For example, two nodes connected by\nmany long paths may be more similar than two nodes directly connected by a single edge.\nFurthermore, in certain cases, the effective resistance between two nodes u, v tends to correlate\nwell with a simple function of the degree sequence (specifically, $\\frac{1}{d(u)} + \\frac{1}{d(v)}$) [52, 53], and it is\nknown that there are many graphs with the same degree sequence but very different global structures.\nNevertheless, considered in aggregate, effective resistances encode global structure in a very strong\nway. For any graph, given all pairs effective resistances, it is possible to provably recover the full\ngraph in polynomial time [46, 54]! 
This contrasts with purely local similarity metrics, which can be\nused heuristically for link prediction, but do not give network reconstruction in general. For instance,\nall-pairwise counts of common neighbors in any triangle-free graph equal 0, and thus they reveal no\ninformation about graph structure.\nWhile the full information case is well understood, when not all exact effective resistances are\navailable, little is known about what graph information can be learned. Some work considers\nreconstruction of trees based on a subset of effective resistances [15, 7, 48]. However, outside of this\nspecial case, essentially nothing is known.\n\nRelated Work. Our work is closely related to work on link prediction, graph reconstruction, and\nphylogenetic tree reconstruction from pairwise distances. We give an overview in Appendix A.\n\n1.2 Our Contributions\n\nWe study theoretically and empirically what can be learned about a graph given a subset of potentially\nnoisy effective resistance estimates. Our main contributions are:\n\nMathematical formulation. We provide an optimization-based formulation of the problem of\nlearning a graph from effective resistances. Specifically, given a set of effective resistance measurements,\nwe seek a graph whose effective resistances match the given resistances as closely as possible.\nIn general, there may be many different graphs which match any subset of all pairs effective\nresistances, and hence many minimizers to our optimization problem. If the resistances additionally have\nsome noise, there may be no graph which matches them exactly but many which match them\napproximately. Nevertheless, as we show empirically, the graph obtained via our optimization approach\ntypically recovers significant information about the underlying graph, including a large fraction of its\nedges, its global structure, and good approximations to all of its effective resistances.\n\nAlgorithms. 
We prove that, in some cases, the optimization problem we present can be solved\nexactly, in polynomial time. However, in general, the problem is non-convex and does not admit an\nobvious polynomial time solution. We show that it can be solved via iterative methods along with a\npowerful initialization strategy that allows us to find high quality solutions in most instances.\nWe also show that the problem can be relaxed to a convex formulation. Instead of searching for a\ngraph that matches all given effective resistance measurements, we just find a graph whose effective\nresistances are upper bounded by those given and which has minimum total edge weight. This\nmodified problem is convex and can be solved via an SDP.\n\nExperimental Results. We evaluate our algorithms on synthetic graphs and real Facebook ego\nnetworks, which contain all nodes in the social circle of a user. Ego networks are important in many\napplications and allow us to effectively test our ability to recover local graph structure. We show that,\ngiven a small randomly selected fraction of all effective resistance pairs (10%-25%), we can learn a\nlarge fraction of a network \u2013 typically between 20% and 60% of edges, even after adding noise to the\ngiven effective resistances.\nWe also show that by finding a graph which closely matches the given set of effective resistances (via\nour optimization approach), we in fact find a graph which closely matches the underlying network on\nall effective resistance pairs. This indicates that significant information contained in all pairs effective\nresistances can be learned from just a small subset of these pairs, even when corrupted by noise.\n\n2 Proposed Method\n\n2.1 Notation and Preliminaries\nFor an undirected, weighted graph G = (V, E, w) with n nodes, we let A be the $n \\times n$ adjacency\nmatrix. L denotes the graph Laplacian: $L = D - A$, where D is a diagonal matrix with $D_{i,i}$ equal to\nthe weighted degree of node i. 
For an integer n > 0, [n] denotes the set {1, 2, ..., n}. $e_i$ denotes the\nith standard basis vector. For a matrix M, $M_{i,j}$ denotes the entry in its ith row and jth column.\nCommute time and effective resistance. For two nodes $u, v \\in V$, the hitting time $h_G(u, v)$ is the\nexpected time it takes a random walk to travel from u to v. The commute time is its symmetrized\nversion $c_G(u, v) = h_G(u, v) + h_G(v, u)$, i.e., the time to move from u to v and then back to\nu. For connected graphs, the effective resistance between u, v is a scaling of the commute time:\n$r_G(u, v) = \\frac{c_G(u, v)}{\\mathrm{vol}(G)}$, where $\\mathrm{vol}(G) = 2 \\sum_{e \\in E} w_e$. Effective resistance has a natural electrical\ninterpretation. When G is viewed as an electrical network on n nodes where each edge e corresponds\nto a link of conductance $w_e$ (equivalently, to a resistor of resistance $\\frac{1}{w_e}$), the effective resistance is the\nvoltage difference that appears across u, v when a unit current source is applied to them. Effective\nresistances (and hence commute times) always form a metric [29].\nLet $\\chi_{u,v} = e_u - e_v$. The effective resistance between nodes u and v in a graph G with Laplacian L is\n\n$r_G(u, v) = \\chi_{u,v}^T L^+ \\chi_{u,v}.$ (1)\n\nHere $L^+$ denotes the Moore-Penrose pseudoinverse of L.\n\n2.2 Problem Definition\nWe begin by providing a mathematical formulation of the problem introduced in Section 1 \u2013 that of\nlearning the structure of a graph from partial and possibly noisy measurements of pairwise effective\nresistances. An analogous problem can be defined for other random walk-based similarities, such as\npersonalized PageRank. We discuss initial results in this direction in supplementary Appendix E.\nProblem 1 (Graph Reconstruction From Effective Resistances). 
Reconstruct an unknown graph\nG given a set of noisy effective resistance measurements,\n\n$\\bar{r}(u, v) = r_G(u, v) + n_{uv}$\n\nfor each $(u, v) \\in S$, where $S \\subseteq [n] \\times [n]$ is a set of node pairs and $n_{uv}$ is a random noise term.\n\nWe focus on three interesting cases of Problem 1:\nProblem 1.1 $S = [n] \\times [n]$ and $n_{uv} = 0$ for all $(u, v) \\in S$. This is the full information setting.\nProblem 1.2 S is a subset of $[n] \\times [n]$ and $n_{uv} = 0$ for all $(u, v) \\in S$. In this setting we must learn\nG from a limited number of exact effective resistances.\nProblem 1.3 S is a subset of $[n] \\times [n]$ and $n_{uv}$ is a random term, e.g., a mean 0 normal random\nvariable with variance $\\sigma^2$: $n_{uv} \\sim \\mathcal{N}(0, \\sigma^2)$.\n\nIt is known that there is a unique graph consistent with any full set of effective resistance\nmeasurements (see e.g., [46, 54]). Additionally, this graph can be computed by solving a fully determined\nlinear system. So, we can solve Problem 1.1 exactly in polynomial time (see Section 3.1).\nFrom a privacy and data mining perspective, the limited information settings of Problems 1.2 and 1.3\nare more interesting. In Section 3.1 we show that, when G is a tree, exact recovery is possible for\nProblem 1.2 when S is a superset of G\u2019s edges. However, in general, there is no closed form solution\nto these problems, and exact recovery of G is typically impossible \u2013 several graphs may be consistent\nwith the measurements given. We address these cases by reposing Problem 1 as an optimization\nproblem, in which we attempt to recover a graph matching the given resistances as well as possible.\n\n2.3 Optimization Formulation\nA natural formalization of Problem 1 is as a least squares problem.\n\nProblem 2. 
Given a set of vertex pairs $S \\subseteq [n] \\times [n]$ and a target effective resistance $\\bar{r}(u, v)$\nfor each $(u, v) \\in S$:\n\n$\\underset{\\text{graph } H}{\\text{minimize}} \\quad F(H) \\overset{\\text{def}}{=} \\sum_{(u,v) \\in S} [r_H(u, v) - \\bar{r}(u, v)]^2.$ (2)\n\nUsing formula (1) for effective resistances, Problem 2 can equivalently be viewed as an optimization\nproblem over the set of graph Laplacians: minimize $\\sum_{(u,v) \\in S} [\\chi_{u,v}^T L^+ \\chi_{u,v} - \\bar{r}(u, v)]^2$. While\nthis set is convex, the objective function is not and it is unclear if it can be minimized provably\nin polynomial time. Nevertheless, we show in Section 3.2 that it is possible to solve the problem\napproximately by combining projected gradient and coordinate descent algorithms with a powerful\ninitialization heuristic. This approach quickly converges to near-global minima for many networks.\nFor Problem 1.2, where $\\bar{r}(u, v)$ comprise a subset of the exact effective resistances for some graph G,\n$\\min_H F(H) = 0$. This minimum may be achieved by multiple graphs (including G) if S does not\ncontain all effective resistance pairs. Nevertheless, we show experimentally in Section 4 that even\nwhen S contains a small fraction of these pairs, an approximate solution to Problem 2 often recovers\nsignificant information about G, including a large fraction of its edges. Interestingly, while Problem 2\nonly minimizes over the subset S, the recovered graph typically matches G on all effective resistances,\nexplaining why it contains so much structural information. For Problem 1.3, if $S = [n] \\times [n]$ and\nthe noise terms $n_{uv}$ are distributed as i.i.d. Gaussians, Problem 2 gives the maximum likelihood\nestimator for G. 
We again show that an approximate solution can recover many of G\u2019s edges.\nWe note that while we can solve Problem 2 quickly via iterative methods, we leave open provable\npolynomial time algorithms in the settings of both Problems 1.2 and 1.3.\nConvex relaxation. 
As an alternative to Problem 2, we give an optimization formulation of Problem\n1 that is convex. Here we optimize over the convex set of graph Laplacians.\n\nProblem 3. Let $\\mathcal{L}$ be the convex set of $n \\times n$ graph Laplacians. Given a set of vertex pairs\n$S \\subseteq [n] \\times [n]$ and a target effective resistance $\\bar{r}(u, v)$ for every $(u, v) \\in S$,\n\n$\\underset{L \\in \\mathcal{L}}{\\text{minimize}} \\quad \\mathrm{Tr}(L) \\quad \\text{subject to} \\quad \\chi_{u,v}^T L^+ \\chi_{u,v} \\le \\bar{r}(u, v) \\quad \\forall (u, v) \\in S.$\n\nBy Rayleigh\u2019s monotonicity law, decreasing the weight on edges in L increases effective resistances.\nTr(L) is equal to the total degree of the graph corresponding to L, so the problem asks us to find a\ngraph with as little total edge weight as possible that still satisfies the effective resistance constraints.\nThe disadvantage of Problem 3 is that it only encodes the target resistances $\\bar{r}(u, v)$ as upper bounds\non the resistances of L. The advantage is that we can solve it provably in polynomial time via\nsemidefinite programming (see supplemental Appendix C). In practice, we find that it can sometimes\neffectively learn graph edges and structure from limited measurements.\nProblem 3 is related to work on convex methods for minimizing total effective resistance or, relatedly,\nmixing time in graphs [10, 49, 19]. However, prior work does not consider pairwise resistance\nconstraints and so is not suited to graph learning.\n\n3 Analytical Results and Algorithms\n\n3.1 Full Graph Reconstruction \u2013 Problem 1\nProblem 1 can be solved exactly in polynomial time when S contains all resistance pairs of some\ngraph G (i.e., Problem 1.1). In this case, there is a closed form solution for G\u2019s Laplacian L and\nthe solution is unique. This was pointed out in [46]; however, we include our own proof in the\nsupplementary Appendix B for completeness.\nTheorem 1 (Lemma 9.4.1. of [46]). 
If there is a feasible solution to Problem 1.1, it is unique and can\nbe found in $O(n^3)$ time. 
Let R be a matrix with $R_{u,v} = r_G(u, v)$ for all $u, v \\in [n]$. The Laplacian L\nof the solution G is\n\n$-2 \\cdot \\left[ \\left( I - \\frac{J}{n} \\right) R \\left( I - \\frac{J}{n} \\right) \\right]^+.$ (3)\n\nHere I is the $n \\times n$ identity matrix and J is the $n \\times n$ all ones matrix.\nReconstruction from hitting times. The above immediately generalizes to graph reconstruction\nfrom hitting times since, as discussed, for connected G, the effective resistance between u, v can be\nwritten as $r_G(u, v) = \\frac{c_G(u, v)}{\\mathrm{vol}(G)} = \\frac{h_G(u, v) + h_G(v, u)}{\\mathrm{vol}(G)}$. Thus, by Theorem 1, we can recover G up to a\nscaling from all pairs hitting times. This recovers a result in [54].\nReconstruction from other similarity measures. An analogous result to Theorem 1 holds for\ngraph recovery from all pairs personalized PageRank scores, and for related measures such as Katz\nsimilarity scores [27]. We discuss this direction in supplementary Appendix E.\nAre all pairs always necessary for perfect reconstruction? For general graphs, Problem 1 can\nonly be solved exactly when S contains all $\\binom{n}{2}$ true effective resistances. However, given additional\nconstraints on G, recovery is possible with much less information. In particular, we show in Appendix\nB that when G is a tree, we can recover it (i.e., solve Problem 1.2) if S is a superset of its edge set.\nThe problem of recovering trees from pairwise distances is a central problem in phylogenetics.\n\n3.2 Graph Learning via Least Squares Minimization \u2013 Problem 2\nWhen Problem 1 cannot be solved exactly, e.g., in the settings of Problems 1.2 and 1.3, an effective\nsurrogate is to solve Problem 2 to find a graph with effective resistances close to the given target\nresistances. As we demonstrate experimentally in Section 4, this yields good solutions to Problems\n1.2 and 1.3 in many cases. 
Problem 2 is non-convex; however, we show that a good solution can often\nbe found efficiently via projected gradient descent.\n\nOptimizing over edge weights. 
Let $m = \\binom{n}{2}$. We write the Laplacian of the graph H as\n$L(w) \\overset{\\text{def}}{=} B^T \\mathrm{diag}(w) B$, where $w \\in \\mathbb{R}^m$ is a non-negative vector whose entries correspond to the edge weights\nin H, $\\mathrm{diag}(w)$ is the $m \\times m$ matrix with w as its diagonal, and $B \\in \\mathbb{R}^{m \\times n}$ is the vertex-edge\nincidence matrix with a row equal to $\\chi_{u,v} = e_u - e_v$ for every possible edge $(u, v) \\in [n] \\times [n]$.\nOptimizing the objective function F(H) in Problem 2 is equivalent to optimizing F(w) over the edge\nweight vector w, where we define F(w) = F(H) for the unique H with Laplacian equal to L(w).\nWe restrict $w_i \\ge 0$ for all i and project to this constraint after each gradient step by setting\n$w_i := \\max(w_i, 0)$. The gradient of F(w) can be computed in closed form. We first define the variable\n$R(w) \\in \\mathbb{R}^{m \\times m}$, whose diagonal contains all effective resistances of H with weight vector w:\nDefinition 1. For $w \\in \\mathbb{R}^m$ with $w_i \\ge 0$ for all i, define $R(w) = B L(w)^+ B^T$.\nUsing R(w) we can compute the gradient of F(w) by:\nProposition 1. Let $\\circ$ denote the Hadamard (entrywise) product for matrices. Define the error vector\n$\\Delta(w) \\in \\mathbb{R}^m$ as having $\\Delta(w)_i = \\bar{r}(i) - [R(w)]_{i,i}$ for all $i \\in S$ and 0s elsewhere. We have:\n\n$\\nabla F(w) = 2 (R(w) \\circ R(w)) \\Delta(w).$\n\nWe give a proof in Appendix B, along with a formula for the Hessian of F(w).\nAcceleration via coordinate descent. Naively computing the gradient $\\nabla F(w)$ via Proposition 1\nrequires computing the full $m \\times m$ matrix R(w), which can be prohibitively expensive for large\ngraphs \u2013 recall that $m = \\binom{n}{2} = O(n^2)$. Note, however, that the error vector $\\Delta(w)$ only has nonzero\nentries at positions corresponding to the node pairs in S. 
Thus, it suffices to compute just |S| columns\nof R(w) corresponding to these pairs, which can give significant savings. We obtain further savings\nusing block coordinate descent. 
At each step we restrict our updates to a random subset of edges\n$\\mathcal{B} \\subseteq [n] \\times [n]$, and so only form the rows of R(w) corresponding to these edges.\nInitialization. A good initialization can significantly accelerate the solution of Problem 2. We use a\nstrategy based on the exact solution to Problem 1.1 in Theorem 1.\nSince effective resistances form a metric, by the triangle inequality, for any $u, v, w \\in [n]$,\n$r_H(u, v) \\le r_H(u, w) + r_H(w, v)$. Guided by this fact, given targets $\\bar{r}(u, v)$ for $(u, v) \\in S$, we first \u201cfill in\u201d the\nconstraint set. For $(w, z) \\notin S$, we set $\\bar{r}(w, z)$ equal to the shortest path distance in the graph $\\bar{G}$ which\nhas an edge for each pair in S with length $\\bar{r}(u, v)$.\nWe thus obtain a full set of target effective resistances. We can form R with $R_{u,v} = \\bar{r}(u, v)$ and\ninitialize the Laplacian of H using the formula given in (3) in Theorem 1. However, this formula\nis quite unstable and generally yields an output which is far from a Laplacian even when R is\ncorrupted by a small amount of noise. So we instead compute, for some $\\lambda > 0$, a regularized estimate,\n\n$\\tilde{L} = -2 \\cdot \\left[ \\left( I - \\frac{J}{n} \\right) R \\left( I - \\frac{J}{n} \\right) + \\lambda I \\right]^+.$\n\nGenerally, $\\tilde{L}$ will not be a valid graph Laplacian, so we\nremove any negative edge weights to obtain our initialization.\n\n4 Empirical results\n\nWe next present an experimental study of how well our methods can learn a graph given a set of\n(noisy) effective resistance measurements. We focus on two key questions:\n\n1. Given a set of effective resistance measurements, can we find a graph matching these measurements\nvia the optimization formulations of Problems 2 and 3 and the algorithms of Section 3.2?\n2. What structure does the graph we learn share with the underlying network that produced the\nresistance measurements? Can it be used to uncover links? 
Does it approximately match the network\non effective resistances outside the measurement set, or share other global structure?\n\n4.1 Experimental Setup\n\nWe address these questions by examining a variety of graphs. We study two synthetic examples:\nan $8 \\times 8$ grid graph and a k-nearest neighbor graph constructed for vectors drawn from a Gaussian\nmixture model with two clusters. We also consider Facebook \u2018ego networks\u2019 obtained from the\nStanford Network Analysis Project (SNAP) collection [33, 34]. Each of these networks is formed by\ntaking the largest connected component in the social circle of a specific user. Details on the networks\nstudied are shown in Table 2 in supplemental Appendix D.\nFor all experiments, we provide our algorithms with effective resistances uniformly sampled from\nthe set of all $\\binom{n}{2}$ effective resistances. We sample a fixed fraction $f \\overset{\\text{def}}{=} |S| / \\binom{n}{2} \\times 100\\%$ of all\npossible measurements. We typically use $f \\in \\{10, 25, 50, 100\\}\\%$. In some cases, these resistances\nare corrupted with i.i.d. Gaussian noise $\\eta \\sim \\mathcal{N}(0, \\sigma^2)$. We experiment with different values of $\\sigma^2$.\nFor Problem 2 we implemented gradient descent based on the closed-form gradient calculation in\nProposition 1. Line search was used to optimize step size at each iteration. For larger problems, block\ncoordinate descent was used as described in Section 3.2, with the coordinate set chosen uniformly at\nrandom in each iteration. We set the block size $|\\mathcal{B}| = 5000$. For Problem 3 we used the MOSEK\nconvex optimization software, accessed through the CVX interface [38, 21]. 
All experiments were\nrun on a laptop with a 2.6 GHz Intel Core i7 processor and 16 GB of main memory.\n\n4.2 Learning Synthetic Graphs\n\nWe first evaluate our algorithms on GRID and K-NN, which are simple graphs with clear structure.\nLeast squares formulation. 
We first observe that gradient descent effectively minimizes the objective\nfunction of Problem 2 on the GRID and K-NN graphs. We consider the normalized objective for\nconstraints S and output graph H:\n\n$\\hat{F}(H) = \\frac{\\sum_{(u,v) \\in S} [r_H(u, v) - \\bar{r}(u, v)]^2}{\\sum_{(u,v) \\in S} \\bar{r}(u, v)^2}.$ (4)\n\nFor noise variance 0, $\\min_H \\hat{F}(H) = 0$ and in Figure 2 we see that for GRID we in fact find H with\n$\\hat{F}(H) \\approx 0$ for varying sizes of S. Convergence is faster when 100% of effective resistances are\nincluded in S, but otherwise does not correlate strongly with the number of constraints.\n\nFigure 1: Graphs learned by solving Problem 2 with gradient descent for uniformly sampled effective\nresistances with varying levels of Gaussian noise. Edge width in plots is proportional to edge weight.\n\nIn Figure 2 we also plot the generalization error:\n\n$F_{gen}(H) = \\frac{\\sum_{(u,v) \\in [n] \\times [n]} [r_H(u, v) - r_G(u, v)]^2}{\\sum_{(u,v) \\in [n] \\times [n]} r_G^2(u, v)},$ (5)\n\nwhere $r_G(u, v)$ is the true effective resistance, uncorrupted by noise. $F_{gen}(H)$ measures how well\nthe graph obtained by solving Problem 2 matches all effective resistances of the original network. 
We\nconfirm that generalization error decreases with improved objective function performance, indicating that\noptimizing Problem 2 effectively extracts network structure from a small set of effective resistances.\nWe observe that the generalization error is small even when f = 10%, and becomes negligible as we\nincrease the fraction f of measurements, even in the presence of noise.\n\nFigure 2: Objective and generalization error for Problem 2 \u2013 see (5). For details, see Section 4.2.\n\nWe repeat the same experiments with Gaussian noise added to each resistance measurement. The\nvariance of the noise, $\\sigma^2$, is scaled relative to the mean effective resistance in the graph, i.e., we\nset $\\bar{r}(u, v) = r_G(u, v) + \\mathcal{N}(0, \\bar{\\sigma}^2)$ where $\\bar{\\sigma}^2 = \\frac{\\sigma^2}{\\binom{n}{2}} \\cdot \\sum_{(u,v) \\in [n] \\times [n]} r_G(u, v)$. While generally\n$\\min_H \\hat{F}(H) > 0$ when $\\bar{r}(u, v)$ is noisy (since there is no graph consistent with these noisy\nmeasurements), the objective value and generalization error still decrease steadily with more iterations.\nWe obtain similar results by applying Problem 2 to the K-NN graph (see Figure 4 in supplemental\nAppendix D). Again, gradient descent converges for a variety of noise levels and constraint sets.\nConvergence leads to improved generalization error.\nFigure 1 shows the graphs obtained from solving the problem for varying $\\sigma^2$ and f. For both graphs,\nwhen $\\sigma^2 = 0$ and f = 100%, the original network is recovered exactly. 
Reconstruction accuracy\ndecreases with increasing noise and a decreasing number of constraints. For GRID, even with 25% of\nconstraints, nearly full recovery is possible for $\\sigma^2 = 0$ and recovery of approximately half of true\nedges is possible for $\\sigma^2 = 0.1$. For K-NN, for $\\sigma^2 = 0$ and $\\sigma^2 = 0.1$ we observe that cluster structure\nis recovered. Detailed quantitative results for both networks are given in Table 1.\n\nConvex formulation. We next evaluate the performance of the convex Problem 3. In this case, we do\nnot focus on convergence as we solve the problem directly using semidefinite programming. Unlike\nfor Problem 2, solving Problem 3 does not recover the exact input graph, even in the noiseless all\npairs effective resistance case. This is because the input graph does not necessarily minimize the\nobjective \u2013 there can be graphs with smaller total edge weight and lower effective resistances.\nHowever, the learned graphs do capture information about edges in the original: their heaviest edges\ntypically align with true edges in the target graph. We show quantitative results in Table 1 and\nqualitative results for GRID in Figure 3. We mark the 224 heaviest edges in the learned graph in red\nand see that this set converges exactly on the grid.\nThe convex formulation never significantly outperforms the least squares formulation of Problem 2,\nand often significantly underperforms. However, we believe there is further opportunity for exploring\nProblem 3, especially given its provable runtime guarantees.\n\n4.3 Learning Social Network Graphs\n\nWe conclude by demonstrating the effectiveness of the least squares formulation of Problem 2 in\nlearning Facebook ego networks from limited effective resistance measurements. 
We consider three\nmetrics of performance, shown in Table 1 for a number of networks learned from randomly sampled\nsubsets of effective resistances, corrupted with noise.\nObjective Function Value: the value of the normalized objective function (4) of Problem 2.\nGeneralization Error: the error in reconstructing all effective resistances of the true graph (5).\nEdges Learned: the rate of recovery for edges in the true graph. We utilize a standard metric from\nthe link prediction literature [30]: given underlying graph G with m edges and learned graph H, we\nconsider the m heaviest edges of H and compute the percentage of G\u2019s edges contained in this set.\n\nFigure 3: Graphs learned from solving the convex program in Problem 3 for uniformly sampled\neffective resistances from GRID with varying f, $\\sigma^2$. Heaviest edges marked in red.\n\nResults. We find that, as for the synthetic GRID and K-NN graphs, we can effectively minimize the\nobjective function of Problem 2 for the Facebook ego networks. Moreover, this minimization leads to\nvery good generalization error in nearly all cases, i.e., we can effectively learn a graph matching our\ninput on all effective resistances, even when we consider just a small subset.\nFor all graphs, we are able to recover a significant fraction of edges (typically over 20%), even when\njust considering 10% or 25% of effective resistance pairs. We obtain the best recovery for small\ngraphs, learning over half of the true edges in FB SMALL A and FB SMALL C.\nTypically, the number of edges learned increases as we increase the number of constraints and\ndecrease the noise variance. However, occasionally, considering fewer effective resistances in fact\nimproves learning, possibly because we more effectively solve the underlying optimization problem.\n\nTable 1: Graph recovery results. All results use a randomly sampled subset of f = 10% or 25% of all\neffective resistances. 
For \u201cAlgorithm\u201d, GD denotes gradient descent. CD denotes block coordinate\ndescent with random batches of size 5000. SDP denotes the convex Problem 3, solved with semidefinite programming. \u201cNoise level, $\\sigma^2$\u201d indicates that the target resistances were\nset to $\\bar{r}(u, v) = r_G(u, v) + \\mathcal{N}(0, \\sigma^2 \\cdot \\mathrm{mean}_{u,v}(r_G(u, v)))$. \u201c% Edges baseline\u201d is the edge density\nof the network, equivalent to the expected edge prediction accuracy achieved with random guessing.\n\nNetwork | Algorithm | $\\sigma^2$ | Objective function error, f = 10% | Objective function error, f = 25% | Effect. resistance generalization error, f = 10% | Effect. resistance generalization error, f = 25% | % Edges learned baseline | % Edges learned, f = 10% | % Edges learned, f = 25%\nGRID | GD | 0 | .00001 | .00001 | .06559 | .00099 | 5.56 | 20.54 | 88.39\nGRID | GD | .1 | .00090 | .00514 | .08129 | .01336 | 5.56 | 25.89 | 50.00\nGRID | SDP | 0 | na | na | .08758 | .07422 | 5.56 | 16.07 | 25.00\nGRID | SDP | .1 | na | na | .09549 | .09343 | 5.56 | 12.50 | 26.79\nK-NN | GD | 0 | .00001 | .00002 | .01122 | .00117 | 11.58 | 44.54 | 72.68\nK-NN | GD | .1 | .00197 | .00447 | .05536 | .00709 | 11.58 | 25.96 | 41.53\nK-NN | SDP | 0 | na | na | .09314 | .10399 | 11.58 | 27.05 | 48.36\nK-NN | SDP | .1 | na | na | .11899 | .14097 | 11.58 | 24.32 | 39.89\nFB SMALL A | GD | 0 | .01345 | .00001 | .21097 | .00984 | 28.20 | 44.54 | 75.00\nFB SMALL A | GD | .1 | .00017 | .00204 | .07964 | .01687 | 28.20 | 53.64 | 60.00\nFB SMALL B | GD | 0 | .00002 | .00003 | .01515 | .00623 | 14.59 | 42.75 | 48.55\nFB SMALL B | GD | .1 | .00032 | .00206 | .02229 | .01291 | 14.59 | 36.23 | 43.48\nFB SMALL C | GD | 0 | .00162 | .00166 | .00217 | .00203 | 15.55 | 57.03 | 59.51\nFB SMALL C | GD | .1 | .00532 | .00644 | .01542 | .00218 | 15.55 | 52.66 | 57.51\nFB SMALL D | GD | 0 | .00335 | .00434 | .00821 | .00830 | 11.80 | 21.92 | 24.52\nFB SMALL D | GD | .1 | .00610 | .18384 | .00923 | .21426 | 11.80 | 21.38 | 21.20\nFB MEDIUM A | GD | 0 | .00447 | .00665 | .02910 | .01713 | 12.78 | 23.50 | 25.59\nFB MEDIUM B | CD | 0 | .00224 | .01255 | .00862 | .01471 | 4.80 | 18.97 | 22.15\nFB MEDIUM B | CD | .1 | .01174 | .03182 | .01687 | .03295 | 4.80 | 17.50 | 16.03\nFB LARGE A | CD | 0 | .00516 | .00796 | .00682 | .00862 | 3.41 | 10.52 | 12.45\nFB LARGE B | CD | 0 | .00524 | .00440 | .00635 | .00580 | 9.51 | 20.26 | 24.95\nFB LARGE B | CD | .1 | .12745 | .34646 | .14532 | .36095 | 9.51 | 19.43 | 16.97\n\nReferences\n[1] Rediet Abebe and Vasileios Nakos. Private link prediction in social networks. 2014.\n\n[2] Lada A. Adamic and Eytan Adar. Friends and neighbors on the web. Social Networks, 25(3):211\u2013230, 2003.\n\n[3] Mohammad Al Hasan and Mohammed J. Zaki. A survey of link prediction in social networks. In Social Network Data Analytics, pages 243\u2013275. Springer, 2011.\n\n[4] Reid Andersen, Fan Chung, and Kevin Lang. Local graph partitioning using PageRank vectors. In 47th Annual Symposium on Foundations of Computer Science, 2006.\n\n[5] Dana Angluin and Jiang Chen. Learning a hidden graph using O(log n) queries per edge. Journal of Computer and System Sciences, 74(4):546\u2013556, 2008.\n\n[6] Lars Backstrom, Cynthia Dwork, and Jon Kleinberg. Wherefore art thou?: Anonymized social networks, hidden patterns, and structural steganography. In Proceedings of the 16th International Conference on World Wide Web, pages 181\u2013190. ACM, 2007.\n\n[7] Vladimir Batagelj, Toma\u017e Pisanski, and J. M. S. Sim\u00f5es Pereira. An algorithm for tree-realizability of distance matrices. 
International Journal of Computer Mathematics, 34(3-4):171\u2013176, 1990.\n\n[8] Anna Ben-Hamou, Roberto I. Oliveira, and Yuval Peres. Estimating graph parameters via random walks with restarts. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 2018.\n\n[9] Vincent D Blondel, Anah\u00ed Gajardo, Maureen Heymans, Pierre Senellart, and Paul Van Dooren. A measure of similarity between graph vertices: Applications to synonym extraction and web searching. SIAM Review, 46(4):647\u2013666, 2004.\n\n[10] Stephen Boyd, Persi Diaconis, and Lin Xiao. Fastest mixing Markov chain on a graph. SIAM Review, 2004.\n\n[11] Rui Castro, Mark Coates, Gang Liang, Robert Nowak, and Bin Yu. Network tomography: Recent developments. Statistical Science, pages 499\u2013517, 2004.\n\n[12] Ashok K Chandra, Prabhakar Raghavan, Walter L Ruzzo, Roman Smolensky, and Prasoon Tiwari. The electrical resistance of a graph captures its commute and cover times. Computational Complexity, 6(4):312\u2013340, 1996.\n\n[13] Daniel Chen, Leonidas J. Guibas, John Hershberger, and Jian Sun. Road network reconstruction for organizing paths. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1309\u20131320. Society for Industrial and Applied Mathematics, 2010.\n\n[14] Colin Cooper, Tomasz Radzik, and Yiannis Siantos. Estimating network parameters using random walks. Social Network Analysis and Mining, 4(1):168, 2014.\n\n[15] Joseph C. Culberson and Piotr Rudnicki. A fast algorithm for constructing trees from distance matrices. Information Processing Letters, 30(4):215\u2013220, 1989.\n\n[16] Richard Desper, Feng Jiang, Olli-P Kallioniemi, Holger Moch, Christos H Papadimitriou, and Alejandro A Sch\u00e4ffer. Inferring tree models for oncogenesis from comparative genome hybridization data. Journal of Computational Biology, 6(1):37\u201351, 1999.\n\n[17] Joseph Felsenstein. 
Confidence limits on phylogenies: an approach using the bootstrap. Evolution, 39(4), 1985.\n\n[18] Francois Fouss, Alain Pirotte, Jean-Michel Renders, and Marco Saerens. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Transactions on Knowledge and Data Engineering, 19(3):355\u2013369, 2007.\n\n[19] Arpita Ghosh, Stephen Boyd, and Amin Saberi. Minimizing effective resistance of a graph. SIAM Review, 50(1):37\u201366, 2008.\n\n[20] Leo Grady. Random walks for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(11):1768\u20131783, 2006.\n\n[21] Michael Grant and Stephen Boyd. CVX: Matlab software for disciplined convex programming, Version 2.1, 2014.\n\n[22] Taher H. Haveliwala. Topic-sensitive PageRank: A context-sensitive ranking algorithm for web search. IEEE Transactions on Knowledge and Data Engineering, 2003.\n\n[23] Glen Jeh and Jennifer Widom. Simrank: a measure of structural-context similarity. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 538\u2013543. ACM, 2002.\n\n[24] Glen Jeh and Jennifer Widom. Scaling personalized web search. In Proceedings of the 12th International Conference on World Wide Web, pages 271\u2013279. ACM, 2003.\n\n[25] Vassilis Kalofolias. How to learn a graph from smooth signals. In The 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). Journal of Machine Learning Research (JMLR), 2016.\n\n[26] Sampath Kannan, Claire Mathieu, and Hang Zhou. Near-linear query complexity for graph inference. In International Colloquium on Automata, Languages, and Programming. Springer, 2015.\n\n[27] Leo Katz. A new status index derived from sociometric analysis. Psychometrika, 18(1):39\u201343, 1953.\n\n[28] Liran Katzir, Edo Liberty, and Oren Somekh. 
Estimating sizes of social networks via biased sampling. In Proceedings of the 20th International Conference on World Wide Web. ACM, 2011.\n\n[29] Douglas J. Klein and Milan Randi\u0107. Resistance distance. Journal of Mathematical Chemistry, 12(1):81\u201395, 1993.\n\n[30] Jon Kleinberg, Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, and Andrew Tomkins. The web as a graph: measurements, models, and methods. Computing and Combinatorics, pages 1\u201317, 1999.\n\n[31] Aleksandra Korolova, Rajeev Motwani, Shubha U Nabar, and Ying Xu. Link privacy in social networks. In Proceedings of the 17th ACM Conference on Information and Knowledge Management, pages 289\u2013298. ACM, 2008.\n\n[32] Jure Leskovec, Daniel Huttenlocher, and Jon Kleinberg. Predicting positive and negative links in online social networks. In Proceedings of the 19th International Conference on World Wide Web (WWW). ACM, 2010.\n\n[33] Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, June 2014.\n\n[34] Jure Leskovec and Julian J McAuley. Learning to discover social circles in ego networks. In Advances in Neural Information Processing Systems, pages 539\u2013547, 2012.\n\n[35] David Liben-Nowell and Jon Kleinberg. The link-prediction problem for social networks. Journal of the Association for Information Science and Technology, 58(7):1019\u20131031, 2007.\n\n[36] Brad H McRae. Isolation by resistance. Evolution, 60(8):1551\u20131561, 2006.\n\n[37] Brad H McRae, Brett G Dickson, Timothy H Keitt, and Viral B Shah. Using circuit theory to model connectivity in ecology, evolution, and conservation. Ecology, 89(10):2712\u20132724, 2008.\n\n[38] MOSEK ApS. The MOSEK Optimization Suite, 2017.\n\n[39] Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The PageRank citation ranking: Bringing order to the web. 
Technical report, Stanford InfoLab, 1999.\n\n[40] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM International Conference on Knowledge Discovery and Data Mining, 2014.\n\n[41] David Pilliod, Robert Arkle, Jeanne Robertson, Melanie Murphy, and W. Chris Funk. Effects of changing climate on aquatic habitat and connectivity for remnant populations of a wide-ranging frog species in an arid landscape. 5, 08 2015.\n\n[42] Matthew J Rattigan and David Jensen. The case for anomalous link discovery. ACM SIGKDD Explorations Newsletter, 7(2):41\u201347, 2005.\n\n[43] Lev Reyzin and Nikhil Srivastava. On the longest path algorithm for reconstructing trees from distance matrices. Information Processing Letters, 101(3):98\u2013100, 2007.\n\n[44] Marco Saerens, Francois Fouss, Luh Yen, and Pierre Dupont. The principal components analysis of a graph, and its relationships to spectral clustering. In ECML, volume 3201, pages 371\u2013383, 2004.\n\n[45] Purnamrita Sarkar and Andrew Moore. A tractable approach to finding closest truncated-commute-time neighbors in large graphs. In Proceedings of the Twenty-Third Conference Annual Conference on Uncertainty in Artificial Intelligence (UAI-07), pages 335\u2013343, 2007.\n\n[46] Daniel A. Spielman. Trees and distances. University Lecture, 2012.\n\n[47] Daniel A. Spielman and Nikhil Srivastava. Graph sparsification by effective resistances. SIAM Journal on Computing, 40(6):1913\u20131926, 2011.\n\n[48] Eric A. Stone and Alexander R. Griffing. On the Fiedler vectors of graphs that arise from trees by Schur complementation of the Laplacian. Linear Algebra and its Applications, 431(10):1869\u20131880, 2009.\n\n[49] Jun Sun, Stephen Boyd, Lin Xiao, and Persi Diaconis. The fastest mixing Markov process on a graph and a connection to a maximum variance unfolding problem. 
SIAM Review, 48(4):681\u2013699, 2006.\n\n[50] Hanghang Tong, Christos Faloutsos, and Jia-Yu Pan. Fast random walk with restart and its applications. 2006.\n\n[51] Charalampos E Tsourakakis, Michael Mitzenmacher, Jaros\u0142aw B\u0142asiok, Ben Lawson, Preetum Nakkiran, and Vasileios Nakos. Predicting positive and negative links with noisy queries: Theory & practice. arXiv preprint arXiv:1709.07308, 2017.\n\n[52] Ulrike Von Luxburg, Agnes Radl, and Matthias Hein. Getting lost in space: Large sample analysis of the resistance distance. In Advances in Neural Information Processing Systems, 2010.\n\n[53] Ulrike Von Luxburg, Agnes Radl, and Matthias Hein. Hitting and commute times in large random neighborhood graphs. Journal of Machine Learning Research, 15(1):1751\u20131798, 2014.\n\n[54] Dominik M. Wittmann, Daniel Schmidl, Florian Bl\u00f6chl, and Fabian J. Theis. Reconstruction of graphs based on random walks. Theoretical Computer Science, 2009.\n\n[55] Luh Yen, Francois Fouss, Christine Decaestecker, Pascal Francq, and Marco Saerens. Graph nodes clustering based on the commute-time kernel. Advances in Knowledge Discovery and Data Mining, 2007.\n\n[56] Elena Zheleva, Evimaria Terzi, and Lise Getoor. Privacy in social networks. Synthesis Lectures on Data Mining and Knowledge Discovery, 3(1):1\u201385, 2012.\n", "award": [], "sourceid": 1872, "authors": [{"given_name": "Jeremy", "family_name": "Hoskins", "institution": "Yale University"}, {"given_name": "Cameron", "family_name": "Musco", "institution": "Massachusetts Institute of Technology"}, {"given_name": "Christopher", "family_name": "Musco", "institution": "Mass. Institute of Technology"}, {"given_name": "Babis", "family_name": "Tsourakakis", "institution": "Boston University"}]}