Denoising and Untangling Graphs Using Degree Priors

Advances in Neural Information Processing Systems, pages 385-392.

Quaid D. Morris, Brendan J. Frey, and Christopher J. Paige
University of Toronto
Electrical and Computer Engineering
10 King's College Road, Toronto, Ontario, M5S 3G4, Canada
{quaid, frey}@psi.utoronto.ca, paige@uhnres.utoronto.ca

Abstract

This paper addresses the problem of untangling hidden graphs from a set of noisy detections of undirected edges. We present a model of the generation of the observed graph that includes degree-based structure priors on the hidden graphs. Exact inference in the model is intractable; we present an efficient approximate inference algorithm to compute edge appearance posteriors. We evaluate our model and algorithm on a biological graph inference problem.

1 Introduction and motivation

The inference of hidden graphs from noisy edge appearance data is an important problem with obvious practical application. For example, biologists are currently building networks of all the physical protein-protein interactions (PPI) that occur in particular organisms. The importance of this enterprise is commensurate with its scale: a completed network would be as valuable as a completed genome sequence, and because each organism contains thousands of different types of proteins, there are millions of possible types of interactions. However, scalable experimental methods for detecting interactions are noisy, generating many false detections.
Motivated by this application, we formulate the general problem of inferring hidden graphs as probabilistic inference in a graphical model, and we introduce an efficient algorithm that approximates the posterior probability that an edge is present.

In our model, a set of hidden, constituent graphs are combined to generate the observed graph. Each hidden graph is independently sampled from a prior on graph structure. The combination mechanism acts independently on each edge but can be either stochastic or deterministic. Figure 1 shows an example of our generative model. Typically one of the hidden graphs represents the graph of interest (the true graph); the others represent different types of observation noise. Independent edge noise may also be added by the combination mechanism. We use probabilistic inference to compute a likely decomposition of the observed graph into its constituent parts. This process is deemed "untangling". We use the term "denoising" to refer to the special case where the edge noise is independent. In denoising there is a single hidden graph, the true graph, and all edge noise in the observed graph is due to the combination mechanism.

Figure 1: Illustrative generative model example. The figure shows an example where an observed graph, X, is a noisy composition of two constituent graphs, E^1 and E^2. All graphs share the same vertex set, so each can be represented by a symmetric matrix of random binary variables (i.e., an adjacency matrix). This generative model is designed to solve a toy counter-espionage problem. The vertices represent suspects and each edge in X represents an observed call between two suspects. The graph X reflects zero or more spy rings (represented by E^1), telemarketing calls (represented by E^2), social calls (independent edge noise), and lost call records (more independent edge noise). The task is to locate any spy rings hidden in X. We model the distribution of spy ring graphs using a prior, P(E^1), that has support only on graphs where all vertices have degree of either 2 (i.e., are in the ring) or 0 (i.e., are not). Graphs of telemarketing call patterns are represented using a prior, P(E^2), under which all nodes have degrees of > 3 (i.e., are telemarketers), 1 (i.e., are telemarketees), or 0 (i.e., are neither). The displayed hidden graphs are one likely untangling of X.

Prior distributions over graphs can be specified in various ways, but our choice is motivated by problems we want to solve, and by a view to deriving an efficient inference algorithm. One compact representation of a distribution over graphs consists of specifying a distribution over vertex degrees, and assuming that graphs that have the same vertex degrees are equiprobable. Such a prior can model quite rich distributions over graphs. These degree-based structure priors are natural representations of graph structure; many classes of real-world networks have a characteristic functional form associated with their degree distributions [1], and sometimes this form can be predicted using knowledge about the domain (see, e.g., [2]) or detected empirically (see, e.g., [3, 4]). As such, our model incorporates degree-based structure priors.

Though exact inference in our model is intractable in general, we present an efficient algorithm for approximate inference for arbitrary degree distributions.
We evaluate our model and algorithm using the real-world example of untangling yeast protein-protein interaction networks.

2 A model of noisy and tangled graphs

For degree-based structure priors, inference consists of searching over vertex degrees and edge instantiations, while comparing each edge with its noisy observation and enforcing the constraint that the number of edges connected to every vertex must equal the degree of the vertex. Our formulation of the problem in this way is inspired by the success of the sum-product algorithm (loopy belief propagation) for solving similar formulations of problems in error-correcting decoding [6, 7], phase unwrapping [8], and random satisfiability [9]. For example, in error-correcting decoding, inference consists of searching over configurations of codeword bits, while comparing each bit with its noisy observation and enforcing parity-check constraints on subsets of bits [10].

For a graph on a set of N vertices, e_{ij} is a variable that indicates the presence of an edge connecting vertices i and j: e_{ij} = 1 if there is an edge, and e_{ij} = 0 otherwise. We assume the vertex set is fixed, so each graph is specified by an adjacency matrix, E = {e_{ij}}_{i,j=1}^N. The degree of vertex i is denoted by d_i and the degree set by D = {d_i}_{i=1}^N. The observations are given by a noisy adjacency matrix, X = {x_{ij}}_{i,j=1}^N. Generally, edges can be directed, but in this paper we focus on undirected graphs, so e_{ij} = e_{ji} and x_{ij} = x_{ji}.

Assuming the observation noise is independent for different edges, the joint distribution is

P(X, E, D) = P(X | E) P(E, D) = ( \prod_{j \ge i} P(x_{ij} | e_{ij}) ) P(E, D).

P(x_{ij} | e_{ij}) models the edge observation noise. We use an undirected model for the joint distribution over edges and degrees, P(E, D), where the prior distribution over d_i is determined by a non-negative potential f_i(d_i).
Assuming graphs that have the same vertex degrees are equiprobable, we have

P(E, D) \propto \prod_i ( f_i(d_i) \, I(d_i, \sum_{j=1}^N e_{ij}) ),

where I(a, b) = 1 if a = b, and I(a, b) = 0 if a \ne b. The term I(d_i, \sum_j e_{ij}) ensures that the number of edges connected to vertex i is equal to d_i. It is straightforward to show that the marginal distribution over d_i is P(d_i) \propto f_i(d_i) \sum_{D \setminus d_i} ( n_D \prod_{j \ne i} f_j(d_j) ), where n_D is the number of graphs with degrees D and the sum is over all degree variables except d_i. The potentials, f_i, can be estimated from a given degree prior using Markov chain Monte Carlo; or, as an approximation, they can be set to an empirical degree distribution obtained from noise-free graphs.

Fig. 2a shows the factor graph [11] for the above model. Each filled square corresponds to a term in the factorization of the joint distribution and the square is connected to all variables on which the term depends. Factor graphs are graphical models that unify the properties of Bayesian networks and Markov random fields [12]. Many inference algorithms, including the sum-product algorithm (a.k.a. loopy belief propagation), are more easily derived using factor graphs than Bayesian networks or Markov random fields.
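As a concrete illustration, the unnormalized degree prior above can be scored directly from an adjacency matrix: reading the degrees off E satisfies every indicator term by construction, leaving only the sum of log potentials. This is a minimal sketch (not the paper's code), assuming a single potential f shared across vertices and stored as an array over degrees 0..dmax:

```python
import numpy as np

def log_degree_prior(E, log_f):
    """Unnormalized log P(E, D) under the degree-based prior
    P(E, D) propto prod_i f_i(d_i) I(d_i, sum_j e_ij).  Reading d_i off
    the adjacency matrix satisfies every indicator, so the score reduces
    to sum_i log f_i(d_i)."""
    degrees = E.sum(axis=1)              # d_i = sum_j e_ij
    return float(log_f[degrees].sum())

# Toy spy-ring potential from Figure 1: support only on degrees 0 and 2
# (near-zero mass on degrees 1 and 3 stands in for zero support).
log_f = np.log(np.array([0.7, 1e-12, 0.3, 1e-12]))
ring = np.array([[0, 1, 0, 1],           # a 4-cycle: all degrees equal 2
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]])
path = np.array([[0, 1, 0, 0],           # a path: its endpoints have degree 1
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
assert log_degree_prior(ring, log_f) > log_degree_prior(path, log_f)
```

Under the spy-ring prior the 4-cycle, whose vertices all have degree 2, scores far higher than the path, whose endpoints have unsupported degree 1.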
We describe the sum-product algorithm for our model in section 3.

Figure 2: (a) A factor graph that describes a distribution over graphs with vertex degrees d_i, binary edge indicator variables e_{ij}, and noisy edge observations x_{ij}. The indicator function I(d_i, \sum_j e_{ij}) enforces the constraint that the sum of the binary edge indicator variables for vertex i must equal the degree of vertex i. (b) A factor graph that explains noisy observed edges as a combination of two constituent graphs, with edge indicator variables e^1_{ij} and e^2_{ij}. (c) The constraint I(d_i, \sum_j e_{ij}) can be implemented using a chain with state variables, which leads to an exponentially faster message-passing algorithm.

2.1 Combining multiple graphs

The above model is suitable when we want to infer a graph that matches a degree prior, assuming the edge observation noise is independent.
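To make the combination mechanism of Figure 1 and Figure 2b concrete, here is a toy sketch (our own illustration, with assumed noise rates, not the likelihood used in the experiments) in which the observed edge is the OR of the hidden edges, followed by independent flips that play the role of the social calls and lost records in the counter-espionage example:

```python
import numpy as np

def combine(E1, E2, p_spurious=0.05, p_lost=0.10, seed=0):
    """Sample an observed graph X from two hidden graphs: x_ij is the OR
    of e1_ij and e2_ij, then flipped independently.  Spurious edges model
    independent edge noise; lost edges model dropped records.  Inputs are
    symmetric 0/1 adjacency matrices; the output is symmetrized with a
    zero diagonal."""
    rng = np.random.default_rng(seed)
    union = np.maximum(E1, E2)                       # deterministic OR
    flip_prob = np.where(union == 1, p_lost, p_spurious)
    flipped = rng.random(union.shape) < flip_prob
    X = np.where(flipped, 1 - union, union)
    X = np.triu(X, 1)                                # enforce x_ij = x_ji
    return X + X.T
```

With both noise rates set to zero the combination reduces to the deterministic union of the two hidden edge sets, the simplest H = 2 instance of an edge-wise combination mechanism.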
A more challenging goal, with practical application, is to infer multiple hidden graphs that combine to explain the observed edge data. In section 4, we show how priors over multiple hidden graphs can be used to infer protein-protein interactions.

When there are H hidden graphs, each constituent graph is specified by a set of edges on the same set of N common vertices. For the degree variables and edge variables, we use a superscript to indicate which hidden graph the variable is used to describe. Assuming the graphs are independent, the joint distribution over the observed edge data X, and the edge variables and degree variables for the hidden graphs, E^1, D^1, ..., E^H, D^H, is

P(X, E^1, D^1, \ldots, E^H, D^H) = ( \prod_{j \ge i} P(x_{ij} | e^1_{ij}, \ldots, e^H_{ij}) ) \prod_{h=1}^H P(E^h, D^h),    (1)

where for each hidden graph, P(E^h, D^h) is modeled as described above. Here, the likelihood P(x_{ij} | e^1_{ij}, \ldots, e^H_{ij}) describes how the edges in the hidden graphs combine to model the observed edge. Figure 2b shows the factor graph for this model.

3 Probabilistic inference of constituent graphs

Exact probabilistic inference in the above models is intractable; here we introduce an approximate inference algorithm that consists of applying the sum-product algorithm, while ignoring cycles in the factor graph. Although the sum-product algorithm has been used to obtain excellent results on several problems [6, 7, 13, 14, 8, 9], we have found that the algorithm works best when the model consists of uncertain observations of variables that are subject to a large number of hard constraints; hence the formulation of the model described above.

Conceptually, our inference algorithm is a straightforward application of the sum-product algorithm, cf. [15], where messages are passed along edges in the factor graph iteratively, and then combined at variables to obtain estimates of posterior probabilities.
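The "combined at variables" step is just an elementwise product of incoming messages: in the single-graph model, edge variable e_{ij} receives one message from its likelihood factor and one from each of the two degree constraints it participates in. A minimal sketch, with hypothetical message values:

```python
import numpy as np

def edge_posterior(mu_likelihood, mu_from_Ii, mu_from_Ij):
    """Approximate posterior at edge variable e_ij: the normalized
    elementwise product of all incoming sum-product messages.  Each
    message is a length-2 array [value at e_ij = 0, value at e_ij = 1]."""
    belief = mu_likelihood * mu_from_Ii * mu_from_Ij
    return belief / belief.sum()

# A likelihood that favors the edge can be overruled when both degree
# constraints vote against it (all message values here are made up):
p = edge_posterior(np.array([0.2, 0.8]),
                   np.array([0.9, 0.1]),
                   np.array([0.7, 0.3]))
print(p[1])   # posterior probability the edge is present: about 0.16
```

Note that on a loopy factor graph these products yield approximate, not exact, marginals.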
However, direct implementation of the message-passing updates will lead to an intractable algorithm. In particular, direct implementation of the update for the message sent from function I(d_i, \sum_j e_{ij}) to edge variable e_{ik} takes a number of scalar operations that is exponential in the number of vertices. Fortunately there exists a more efficient way to compute these messages.

3.1 Efficiently summing over edge configurations

The function I(d_i, \sum_j e_{ij}) ensures that the number of edges connected to vertex i is equal to d_i. Passing messages through this function requires summing over all edge configurations that correspond to each possible degree, d_i, and summing over d_i. Specifically, the message, \mu_{I_i \to e_{ik}}(e_{ik}), sent from function I(d_i, \sum_j e_{ij}) to edge variable e_{ik}, is given by

\mu_{I_i \to e_{ik}}(e_{ik}) = \sum_{d_i} \sum_{\{e_{ij} | j = 1, \ldots, N, j \ne k\}} ( I(d_i, \sum_j e_{ij}) \prod_{j \ne k} \mu_{e_{ij} \to I_i}(e_{ij}) ),

where \mu_{e_{ij} \to I_i}(e_{ij}) is the message sent from e_{ij} to function I(d_i, \sum_j e_{ij}). The sum over \{e_{ij} | j = 1, \ldots, N, j \ne k\} contains 2^{N-1} terms, so direct computation is intractable. However, for a maximum degree of d_{max}, all messages departing from the function I(d_i, \sum_j e_{ij}) can be computed using order d_{max} N binary scalar operations, by introducing integer state variables s_{ij}. We define s_{ij} = \sum_{n \le j} e_{in} and note that, by recursion, s_{ij} = s_{i,j-1} + e_{ij}, where s_{i0} = 0 and 0 \le s_{ij} \le d_{max}. This recursive expression enables us to write the high-complexity constraint as the sum of a product of low-complexity constraints,

I(d_i, \sum_j e_{ij}) = \sum_{\{s_{ij} | j = 1, \ldots, N\}} I(s_{i1}, e_{i1}) ( \prod_{j=2}^N I(s_{ij}, s_{i,j-1} + e_{ij}) ) I(d_i, s_{iN}).

This summation can be performed using the forward-backward algorithm. In the factor graph, the summation can be implemented by replacing the function I(d_i, \sum_j e_{ij}) with a chain of lower-complexity functions, connected as shown in Fig. 2c. The function vertex (filled square) on the far left corresponds to I(s_{i1}, e_{i1}) and the function vertex in the upper right corresponds to I(d_i, s_{iN}). So, messages can be passed through each constraint function I(d_i, \sum_j e_{ij}) in an efficient manner, by performing a single forward-backward pass in the corresponding chain.

4 Results

We evaluate our model using yeast protein-protein interaction (PPI) data compiled by [16]. These data include eight sets of putative, but noisy, interactions derived from various sources, and one gold-standard set of interactions detected by reliable experiments.

Using the approximately 6300 yeast proteins as vertices, we represent the eight sets of putative interactions using adjacency matrices {Y^m}_{m=1}^8, where y^m_{ij} = 1 if and only if putative interaction dataset m contains an interaction between proteins i and j. We similarly use Y^{gold} to represent the gold-standard interactions.

We construct an observed graph, X, by setting x_{ij} = \max_m y^m_{ij} for all i and j; thus the observed edge set is the union of all the putative edge sets.

Figure 3: Protein-protein interaction network untangling results. (a) ROC curves measuring performance of predicting e^1_{ij} when x_{ij} = 1, for untangling, a baseline method, and random guessing. (b) Degree distributions, comparing the empirical degree distribution of the test set subgraph of E^1 to the degree potential f^1 estimated on the training set subgraph of E^1 and to the distribution of d_i = \sum_j p_{ij}, where p_{ij} = \hat{P}(e^1_{ij} = 1 | X) is estimated by untangling.

We test our model on the task of discerning which of the edges in X are also in Y^{gold}. We formalize this problem as that of decomposing X into two constituent graphs, E^1 and E^2, the true and the noise graphs respectively, such that e^1_{ij} = x_{ij} y^{gold}_{ij} and e^2_{ij} = x_{ij} - e^1_{ij}.

We use a training set to fit our model parameters and then measure task performance on a test set. The training set contains a randomly selected half of the approximately 6300 yeast proteins, and the subgraphs of E^1, E^2, and X restricted to those vertices. The test set contains the other half of the proteins and the corresponding subgraphs. Note that interactions connecting test set proteins to training set proteins (and vice versa) are ignored.

We fit three sets of parameters: a set of naive Bayes parameters that define a set of edge-specific likelihood functions, P_{ij}(x_{ij} | e^1_{ij}, e^2_{ij}); one degree potential, f^1, which is the same for every vertex in E^1 and defines the prior P(E^1); and a second, f^2, that similarly defines the prior P(E^2).

The likelihood functions, P_{ij}, are used to both assign likelihoods and enforce problem constraints. Given our problem definition, if x_{ij} = 0 then e^1_{ij} = e^2_{ij} = 0; otherwise x_{ij} = 1 and e^1_{ij} = 1 - e^2_{ij}. We enforce the former constraint by setting P_{ij}(x_{ij} = 0 | e^1_{ij}, e^2_{ij}) = (1 - e^1_{ij})(1 - e^2_{ij}), and the latter by setting P_{ij}(x_{ij} = 1 | e^1_{ij}, e^2_{ij}) = 0 whenever e^1_{ij} = e^2_{ij}. This construction of P_{ij} simplifies the calculation of the \mu_{P_{ij} \to e^h_{ij}} messages and improves the computational efficiency of inference because when x_{ij} = 0, we need never update messages to and from variables e^1_{ij} and e^2_{ij}. We complete the specification of P_{ij}(x_{ij} = 1 | e^1_{ij}, e^2_{ij}) as follows:

P_{ij}(x_{ij} = 1 | e^1_{ij}, e^2_{ij}) = \prod_m \theta_m^{y^m_{ij}} (1 - \theta_m)^{1 - y^m_{ij}},  if e^1_{ij} = 1 and e^2_{ij} = 0;
P_{ij}(x_{ij} = 1 | e^1_{ij}, e^2_{ij}) = \prod_m \phi_m^{y^m_{ij}} (1 - \phi_m)^{1 - y^m_{ij}},  if e^1_{ij} = 0 and e^2_{ij} = 1,

where {\theta_m} and {\phi_m} are naive Bayes parameters, \theta_m = \sum_{i,j} y^m_{ij} e^1_{ij} / \sum_{i,j} e^1_{ij} and \phi_m = \sum_{i,j} y^m_{ij} e^2_{ij} / \sum_{i,j} e^2_{ij}, respectively.

The degree potentials f^1(d) and f^2(d) are kernel density estimates fit to the degree distributions of the training set subgraphs of E^1 and E^2, respectively. We use Gaussian kernels and set the width parameter (standard deviation) \sigma using leave-one-out cross-validation to maximize the total log density of the held-out datapoints. Each datapoint is the degree of a single vertex. Both degree potentials closely followed the training set empirical degree distributions.

Untangling was done on the test set subgraph of X. We initially set the \mu_{P_{ij} \to e^1_{ij}} messages equal to the likelihood function P_{ij} and we randomly initialized the \mu_{I^1_j \to e^1_{ij}} messages with samples from a normal distribution with mean 0 and variance 0.01. We then performed 40 iterations of the following message update order: \mu_{e^1_{ij} \to I^1_j}, \mu_{I^1_j \to e^1_{ij}}, \mu_{e^1_{ij} \to P_{ij}}, \mu_{P_{ij} \to e^2_{ij}}, \mu_{e^2_{ij} \to I^2_j}, \mu_{I^2_j \to e^2_{ij}}, \mu_{e^2_{ij} \to P_{ij}}, \mu_{P_{ij} \to e^1_{ij}}.

We evaluated our untangling algorithm using an ROC curve by comparing the actual test set subgraph of E^1 to posterior marginal probabilities, \hat{P}(e^1_{ij} = 1 | X), estimated by our sum-product algorithm.
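The constraint messages \mu_{I \to e} appearing in this schedule are exactly the ones that section 3.1 computes with a forward-backward pass along the chain of state variables s_{ij} (Fig. 2c). A minimal sketch of that pass (our own, not the authors' implementation), assuming a single potential f over degrees 0..dmax shared by all vertices and incoming messages stored as rows of an (N, 2) array:

```python
import numpy as np

def constraint_messages(msgs_in, f):
    """All N outgoing messages mu_{I_i -> e_ik} from the degree factor
    I(d_i, sum_j e_ij), via one forward-backward pass over the cumulative
    sums s_ij = s_i,j-1 + e_ij of section 3.1.  msgs_in[j] holds the
    incoming message [mu(e_ij = 0), mu(e_ij = 1)]; f[d] is the degree
    potential for d = 0..dmax.  Cost is O(N * dmax) rather than O(2^N).
    Partial sums above dmax are dropped, which is exact because f has no
    support there and edges can only increase the sum."""
    N, dmax = len(msgs_in), len(f) - 1
    # fwd[j][s]: mass of configurations of edges 0..j-1 summing to s
    fwd = np.zeros((N + 1, dmax + 1))
    fwd[0, 0] = 1.0
    for j in range(N):
        fwd[j + 1] = fwd[j] * msgs_in[j, 0]
        fwd[j + 1, 1:] += fwd[j, :-1] * msgs_in[j, 1]
    # bwd[j][s]: mass of edges j..N-1, weighted by f(s + their sum)
    bwd = np.zeros((N + 1, dmax + 1))
    bwd[N] = f
    for j in range(N - 1, -1, -1):
        bwd[j] = bwd[j + 1] * msgs_in[j, 0]
        bwd[j, :-1] += bwd[j + 1, 1:] * msgs_in[j, 1]
    # the message to edge k excludes msgs_in[k]: join the two half-chains
    out = np.empty((N, 2))
    for k in range(N):
        out[k, 0] = fwd[k] @ bwd[k + 1]
        out[k, 1] = fwd[k, :-1] @ bwd[k + 1, 1:]
    return out / out.sum(axis=1, keepdims=True)
```

For small N the result can be checked against the brute-force sum over all 2^{N-1} configurations of the other edges.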
Note that because the true interaction network is sparse (less than 0.2% of the 1.8 \times 10^7 possible interactions are likely present [16]) and, in this case, true positive predictions are of greater biological interest than true negative predictions, we focus on low false positive rate portions of the ROC curve.

Figure 3a compares the performance of a classifier for e^1_{ij} based on thresholding \hat{P}(e^1_{ij} = 1 | X) to a baseline method based on thresholding the likelihood functions, P_{ij}(x_{ij} = 1 | e^1_{ij} = 1, e^2_{ij} = 0). Note that because e^1_{ij} = 0 whenever x_{ij} = 0, we exclude the x_{ij} = 0 cases from our performance evaluation. The ROC curve shows that for the same low false positive rate, untangling produces 50-100% more true positives than the baseline method.

Figure 3b shows that the degree potential, the true degree distribution, and the predicted degree distribution are all comparable. The slight overprediction of the true degree distribution may result because the degree potential f^1 that defines P(E^1) is not equal to the expected degree distribution of graphs sampled from the distribution P(E^1).

5 Summary and Related Work

Related work includes other algorithms for structure-based graph denoising [17, 18]. These algorithms use structural properties of the observed graph to score edges and rely on the true graph having a surprisingly large number of three- (or four-) edge cycles compared to the noise graph. In contrast, we place graph generation in a probabilistic framework; our algorithm computes structural fit in the hidden graph, where this computation is not affected by the noise graph(s); and we allow for multiple sources of observation noise, each with its own structural properties.

After submitting this paper to the NIPS conference, we discovered [19], in which a degree-based graph structure prior is used to denoise (but not untangle) observed graphs.
That paper addresses denoising in directed graphs as well as undirected graphs; however, the prior that they use is not amenable to deriving an efficient sum-product algorithm. Instead, they use Markov chain Monte Carlo to do approximate inference in a hidden graph containing 40 vertices. It is not clear how well this approach scales to the approximately 3000-vertex graphs that we are using.

In summary, the contributions of the work described in this paper include: a general formulation of the problem of graph untangling as inference in a factor graph; an efficient approximate inference algorithm for a rich class of degree-based structure priors; and a set of reliability scores (i.e., edge posteriors) for interactions from a current version of the yeast protein-protein interaction network.

References

[1] A. L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286(5439), October 1999.

[2] A. Rzhetsky and S. M. Gomez. Birth of scale-free molecular networks and the number of distinct DNA and protein domains per genome. Bioinformatics, pages 988-996, 2001.

[3] M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the Internet topology. Computer Communications Review, 29, 1999.

[4] Hawoong Jeong, B. Tombor, Réka Albert, Z. N. Oltvai, and Albert-László Barabási. The large-scale organization of metabolic networks. Nature, 407, October 2000.

[5] J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, San Mateo, CA, 1988.

[6] D. J. C. MacKay and R. M. Neal. Near Shannon limit performance of low density parity check codes. Electronics Letters, 32(18):1645-1646, August 1996. Reprinted in Electronics Letters, vol. 33, March 1997, 457-458.

[7] B. J. Frey and F. R. Kschischang. Probability propagation and iterative decoding. In Proceedings of the 1996 Allerton Conference on Communication, Control and Computing, 1996.

[8] B. J. Frey, R.
Koetter, and N. Petrovic. Very loopy belief propagation for unwrapping phase images. In 2001 Conference on Advances in Neural Information Processing Systems, Volume 14. MIT Press, 2002.

[9] M. Mézard, G. Parisi, and R. Zecchina. Analytic and algorithmic solution of random satisfiability problems. Science, 297:812-815, 2002.

[10] B. J. Frey and D. J. C. MacKay. Trellis-constrained codes. In Proceedings of the 35th Allerton Conference on Communication, Control and Computing 1997, 1998.

[11] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, Special Issue on Codes on Graphs and Iterative Algorithms, 47(2):498-519, February 2001.

[12] B. J. Frey. Factor graphs: A unification of directed and undirected graphical models. University of Toronto Technical Report PSI-2003-02, 2003.

[13] Kevin P. Murphy, Yair Weiss, and Michael I. Jordan. Loopy belief propagation for approximate inference: An empirical study. In Uncertainty in Artificial Intelligence 1999, Stockholm, Sweden, 1999.

[14] W. Freeman and E. Pasztor. Learning low-level vision. In Proceedings of the International Conference on Computer Vision, pages 1182-1189, 1999.

[15] M. I. Jordan. An Introduction to Learning in Graphical Models. 2004. In preparation.

[16] C. von Mering et al. Comparative assessment of large-scale data sets of protein-protein interactions. Nature, 2002.

[17] R. Saito, H. Suzuki, and Y. Hayashizaki. Construction of reliable protein-protein interaction networks with a new interaction generality measure. Bioinformatics, pages 756-763, 2003.

[18] D. S. Goldberg and F. P. Roth. Assessing experimentally derived interactions in a small world. Proceedings of the National Academy of Sciences, 2003.

[19] S. M. Gomez and A. Rzhetsky. Towards the prediction of complete protein-protein interaction networks.
In Pacific Symposium on Biocomputing, pages 413-424, 2002.