{"title": "Half-Lives of EigenFlows for Spectral Clustering", "book": "Advances in Neural Information Processing Systems", "page_first": 705, "page_last": 712, "abstract": null, "full_text": "Half-Lives of EigenFlows for Spectral Clustering\n\nChakra Chennubhotla & Allan D. Jepson\n\nDepartment of Computer Science, University of Toronto, Canada M5S 3H5\n\n chakra,jepson\n\n@cs.toronto.edu\n\nAbstract\n\nUsing a Markov chain perspective of spectral clustering we present an\nalgorithm to automatically \ufb01nd the number of stable clusters in a dataset.\nThe Markov chain\u2019s behaviour is characterized by the spectral properties\nof the matrix of transition probabilities, from which we derive eigen\ufb02ows\nalong with their hal\ufb02ives. An eigen\ufb02ow describes the \ufb02ow of probabil-\nity mass due to the Markov chain, and it is characterized by its eigen-\nvalue, or equivalently, by the hal\ufb02ife of its decay as the Markov chain\nis iterated. A ideal stable cluster is one with zero eigen\ufb02ow and in\ufb01-\nnite half-life. The key insight in this paper is that bottlenecks between\nweakly coupled clusters can be identi\ufb01ed by computing the sensitivity\nof the eigen\ufb02ow\u2019s hal\ufb02ife to variations in the edge weights. We propose\na novel EIGENCUTS algorithm to perform clustering that removes these\nidenti\ufb01ed bottlenecks in an iterative fashion.\n\n1 Introduction\nWe consider partitioning a weighted undirected graph\u2014 corresponding to a given dataset\u2014\ninto a set of discrete clusters. Ideally, the vertices (i.e. datapoints) in each cluster should\nbe connected with high-af\ufb01nity edges, while different clusters are either not connected or\nare connnected only by a few edges with low af\ufb01nity. 
The practical problem is to identify these tightly coupled clusters, and cut the inter-cluster edges.

Many techniques have been proposed for this problem, with some recent success being obtained through the use of spectral methods (see, for example, [2, 4, 5, 11, 12]). Here we use the random walk formulation of [4], where the edge weights are used to construct a Markov transition probability matrix M. This matrix defines a random walk on the graph to be partitioned. The eigenvalues and eigenvectors of M provide the basis for deciding on a particular segmentation. In particular, it has been shown that for weakly coupled clusters, the leading eigenvectors of M will be roughly piecewise constant [4, 13, 5]. This result motivates many of the current spectral clustering algorithms. For example in [5], the number of clusters K must be known a priori, and the K-means algorithm is used on the leading K eigenvectors of M in an attempt to identify the appropriate piecewise constant regions.

In this paper we investigate the form of the leading eigenvectors of the Markov matrix M. Using some simple image segmentation examples we confirm that the leading eigenvectors of M are roughly piecewise constant for problems with well separated clusters. However, we observe that for several segmentation problems that we might wish to solve, the coupling between the clusters is significantly stronger and, as a result, the piecewise constant approximation breaks down.

Unlike the piecewise constant approximation, a perfectly general view is that the eigenvectors of M determine particular flows of probability along the edges in the graph. We refer to these as eigenflows since they are characterized by their associated eigenvalue λ, which specifies the flow's overall rate of decay.
Instead of measuring the decay rate in terms of the eigenvalue λ, we find it more convenient to use the flow's half-life β, which is simply defined by λ^β = 1/2. Here β is the number of Markov chain steps needed to reduce the particular eigenflow to half its initial value. Note that as λ approaches 1, the half-life β approaches infinity.

From the perspective of eigenflows, a graph representing a set of weakly coupled clusters produces eigenflows between the various clusters which decay with long half-lives. In contrast, the eigenflows within each cluster decay much more rapidly. In order to identify clusters we therefore consider the eigenflows with long half-lives. Given such a slowly decaying eigenflow, we identify particular bottleneck regions in the graph which critically restrict the flow (cf. [12]). To identify these bottlenecks we propose computing the sensitivity of the flow's half-life with respect to perturbations in the edge weights.

We implement a simple spectral graph partitioning algorithm which is based on these ideas. We first compute the eigenvectors for the Markov transition matrix, and select those with long half-lives. For each such eigenvector, we identify bottlenecks by computing the sensitivity of the flow's half-life with respect to perturbations in the edge weights. In the current algorithm, we simply select one of these eigenvectors in which a bottleneck has been identified, and cut edges within the bottleneck.
The algorithm recomputes the eigenvectors and eigenvalues for the modified graph, and continues this iterative process until no further edges are cut.

2 From Affinities to Markov Chains

Following the formulation in [4], we consider an undirected graph G = (V, E) with vertices v_i, for i = 1, ..., n, and edges e_{i,j} with non-negative weights a_{i,j}. Here the weight a_{i,j} represents the affinity of vertices v_i and v_j. The edge affinities are assumed to be symmetric, that is, a_{i,j} = a_{j,i}. A Markov chain is defined using these affinities by setting the transition probability m_{i,j} from vertex v_j to vertex v_i to be proportional to the edge affinity, a_{i,j}. That is, m_{i,j} = a_{i,j}/d_j, where d_j = sum_i a_{i,j} gives the normalizing factor which ensures sum_i m_{i,j} = 1. In matrix notation, the affinities are represented by a symmetric n x n matrix A, with elements a_{i,j}, and the transition probability matrix is given by

M = A D^{-1}, where D = diag(d_1, ..., d_n).    (1)

Notice that this transition probability matrix M is not in general symmetric. It defines the random walk of a particle on the graph G. Suppose the initial probability of the particle being at vertex v_j is p_j^0, for j = 1, ..., n.
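To make Eq. (1) concrete, here is a minimal numerical sketch (the 4-vertex affinity values are invented for illustration; this is not the paper's implementation):

```python
import numpy as np

# Toy symmetric affinity matrix A: two tightly coupled pairs of
# vertices, {0,1} and {2,3}, joined by a single weak edge (0,2).
A = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])

d = A.sum(axis=0)             # degrees d_j = sum_i a_{ij}
M = A / d                     # M = A D^{-1}: divide column j by d_j

# Each column of M is a conditional distribution over the next vertex.
assert np.allclose(M.sum(axis=0), 1.0)

# One step of the random walk from a uniform initial distribution.
p0 = np.full(4, 0.25)
p1 = M @ p0
assert np.isclose(p1.sum(), 1.0)
```

The column-stochastic convention here matches Eq. (1); probability mass is conserved at every step.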
In matrix notation, the probability of the particle ending up at any of the vertices after one step is given by the distribution p^1 = M p^0, where p^0 = (p_1^0, ..., p_n^0)^T.

For analysis it is convenient to consider the matrix L = D^{-1/2} M D^{1/2}, which is similar to M and therefore has the same spectrum (M is as given in Eq. (1)). Moreover, any eigenvector u of L with eigenvalue λ must correspond to an eigenvector D^{1/2} u of M with the same eigenvalue. Note that L = D^{-1/2} A D^{-1/2} is a symmetric n x n matrix, since A is symmetric while D is diagonal. The advantage of considering the matrix L is that the symmetric eigenvalue problem is more stable to small perturbations, and is computationally much more tractable. Since the matrix L is symmetric, it has an orthogonal decomposition of the form:

L = U Λ U^T,    (2)

Figure 1: (a-c) Three random images each having an occluder in front of a textured background.
(d-e) A pair of eye images.

where U = [u_1, u_2, ..., u_n] are the eigenvectors and Λ = diag(λ_1, ..., λ_n) is a diagonal matrix of eigenvalues sorted in decreasing order. While the eigenvectors have unit length, ||u_k|| = 1, the eigenvalues are real and have an absolute value bounded by 1, |λ_k| <= 1.

The eigenvector representation provides a simple way to capture the Markovian relaxation process [12]. For example, consider propagating the Markov chain for β iterations. The transition matrix after β iterations, namely M^β, can be represented as:

M^β = D^{1/2} U Λ^β U^T D^{-1/2}.    (3)

Therefore the probability distribution for the particle being at any of the vertices after β steps of the random walk, given that the initial probability distribution was p^0, is p^β = D^{1/2} U Λ^β r^0, where r^0 = U^T D^{-1/2} p^0 provides the expansion coefficients of the initial distribution p^0 in terms of the eigenvectors of L. As β approaches infinity, the Markov chain approaches the stationary distribution π, since the eigenvalue λ_1 = 1 is associated with the stationary distribution. Assuming the graph is connected with edges having non-zero weights, it is convenient to interpret the Markovian relaxation process as perturbations to the stationary distribution.

3 EigenFlows

Let p^0 be an initial probability distribution for a random particle to be at the vertices of the graph G.
By the definition of the Markov chain, recall that the probability of making the transition from vertex v_j to vertex v_i is the probability p_j of being at vertex v_j, times the conditional probability of taking edge e_{i,j} given that the particle is at vertex v_j, namely m_{i,j} p_j. Similarly, the probability of making the transition in the reverse direction is m_{j,i} p_i. The net flow of probability mass along edge e_{i,j} from vertex v_j to v_i is therefore the difference

F_{i,j} = m_{i,j} p_j - m_{j,i} p_i,    (4)

which is the (i,j)-element of the n x n matrix F(p) = M diag(p) - (M diag(p))^T. Notice that F(p) is antisymmetric (i.e. F_{i,j} = -F_{j,i}). This expresses the fact that the flow from v_j to v_i is just the opposite sign of the flow in the reverse direction. Furthermore, it can be shown that F(π) = 0 for the stationary distribution π. Therefore the flow is caused by the eigenvectors with eigenvalues smaller than 1, and hence we analyze the rate of decay of these eigenflows.

For illustration purposes we begin by considering an ensemble of random test images formed from two independent samples of 2D Gaussian filtered white noise (see Fig. 1a-c). One sample is used to form the background image, and a cropped fragment of the second sample is used for the foreground region. A small constant bias is added to the foreground region.
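Equations (2)-(4) can be sketched numerically. The following (an illustrative sketch with invented affinities, not the paper's code) computes the second eigenmode of L, its half-life, and its net eigenflow:

```python
import numpy as np

# Toy graph: two strongly coupled pairs joined by one weak edge (0,2).
A = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
d = A.sum(axis=0)
M = A / d                                  # M = A D^{-1}
L = A / np.sqrt(np.outer(d, d))            # L = D^{-1/2} A D^{-1/2}

lam, U = np.linalg.eigh(L)                 # eigh: ascending eigenvalues
lam, U = lam[::-1], U[:, ::-1]             # sort in decreasing order

# lam[0] = 1 is the stationary mode; its flow F(pi) vanishes.
pi = d / d.sum()
F_pi = M * pi[None, :] - (M * pi[None, :]).T
assert np.allclose(F_pi, 0.0)

# Second mode: eigenvector of M is D^{1/2} u_2; half-life from lam[1].
p = np.sqrt(d) * U[:, 1]
beta = np.log(0.5) / np.log(abs(lam[1]))   # lam[1]**beta = 1/2

# Net flow F_ij = m_ij p_j - m_ji p_i (Eq. 4) is antisymmetric.
F = M * p[None, :] - (M * p[None, :]).T
assert np.allclose(F, -F.T)
```

For this graph lam[1] = 10/11 exactly, so the slow mode's half-life is beta = log(1/2)/log(10/11), roughly 7.3 steps; its flow crosses the weak bottleneck edge.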
Figure 2: (a) Eigenmode (b) corresponding eigenflow (c) gray value at each pixel corresponds to the maximum of the absolute sensitivities of all the weights on edges connected to a pixel (not including itself). Dark pixels indicate high absolute sensitivities.

A graph clustering problem is formed where each pixel in a test image is associated with a vertex of the graph G. The edges in E are defined by the standard 8-neighbourhood of each pixel (with pixels at the edges and corners of the image only having 5 and 3 neighbours, respectively). The edge weight between neighbouring vertices v_i and v_j is given by the affinity a_{i,j} = exp(-(I_i - I_j)^2/(2 σ^2)), where I_i is the test image brightness at pixel i and σ is a grey-level standard deviation. We use a σ proportional to the median absolute difference of gray levels between all neighbouring pixels.

This generative process provides an ensemble of clustering problems which we feel are representative of the structure of typical image segmentation problems. In particular, due to the smooth variation in gray-levels, there is some variability in the affinities within both foreground and background regions. Moreover, due to the use of independent samples for the two regions, there is often a significant step in gray-level across the boundary between the two regions. Finally, due to the small bias used, there is also a significant chance for pixels on opposite sides of the boundary to have similar gray-levels, and thus high affinities. This latter property ensures that there are some edges with significant weights between the two clusters in the graph associated with the foreground and background pixels.

In Figure 2 we plot one eigenvector of the matrix L, along with its eigenflow. Notice that the displayed eigenmode is not in general piecewise constant. Rather, the eigenvector is more like a vibrational mode of a non-uniform membrane (in fact, they can be modeled in precisely that way). Also, for all but the stationary distribution, there is a significant net flow between neighbours, especially in regions where the magnitude of the spatial gradient of the eigenmode is larger.

4 Perturbation Analysis of EigenFlows

As discussed in the introduction, we seek to identify bottlenecks in the eigenflows associated with long half-lives.
This notion of identifying bottlenecks is similar to the well-known max-flow, min-cut theorem. In particular, for a graph whose edge weights represent maximum flow capacities between pairs of vertices, instead of the current conditional transition probabilities, the bottleneck edges can be identified as precisely those edges across which the maximum flow is equal to their maximum capacity. However, in the Markov framework, the flow of probability across an edge is only maximal in the extreme cases for which the initial probability of being at one of the edge's endpoints is equal to one, and zero at the other endpoint. Thus the max-flow criterion is not directly applicable here.

Instead, we show that the desired bottleneck edges can be conveniently identified by considering the sensitivity of the flow's half-life to perturbations of the edge weights (see Fig. 2c). Intuitively, this sensitivity arises because the flow across a bottleneck will have fewer alternative routes to take and therefore will be particularly sensitive to changes in the edge weights within the bottleneck. In comparison, the flow between two vertices in a strongly coupled cluster will have many alternative routes and therefore will not be particularly sensitive to the precise weight of any single edge.

In order to pick out larger half-lives, we will use one parameter, τ, which is a rough estimate of the smallest half-life that one wishes to consider.
Since we are interested in perturbations which significantly change the current half-life of a mode, we choose to use a logarithmic scale in half-life. A simple choice for a function which combines these two effects is H(β) = log(β + τ), where β is the half-life of the current eigenmode.

Suppose we have an eigenvector u of L, with eigenvalue λ. This eigenvector decays with a half-life of β, defined by λ^β = 1/2. Consider the effect on β of perturbing the affinity a_{i,j}, for the (i,j)-edge, to a_{i,j} + α. In particular, we show in the Appendix that the derivative of log(β(λ(α)) + τ) with respect to α, evaluated at α = 0, satisfies

d log(β + τ)/dα = [1/(β + τ)] [-log(1/2)/(λ log^2 λ)] [2 u_i u_j/sqrt(d_i d_j) - λ (u_i^2/d_i + u_j^2/d_j)].    (5)

Here u_i and u_j are the i and j elements of the eigenvector u, and d_i and d_j are the degrees of nodes v_i and v_j (Eq. 1). In Figure 2, for a given eigenvector and its flow, we plot the maximum of the absolute sensitivities of all the weights on edges connected to a pixel (not including itself). Note that the sensitivities are large in the bottlenecks at the border of the foreground and background.

5 EIGENCUTS: A Basic Clustering Algorithm

We select a simple clustering algorithm to test our proposal of using the derivative of the eigenmode's half-life for identifying bottleneck edges.
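The derivative in Eq. (5) can be sanity-checked against a central finite difference. The sketch below (toy affinities, invented for illustration) verifies the inner factor dλ/da_{i,j} for a weak bottleneck edge:

```python
import numpy as np

def second_mode(A):
    """Second-largest eigenpair of L = D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=0)
    lam, U = np.linalg.eigh(A / np.sqrt(np.outer(d, d)))
    return lam[-2], U[:, -2], d

A = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
lam, u, d = second_mode(A)
i, j = 0, 2                       # the weak bottleneck edge

# Analytic inner factor of Eq. (5) (derived in the Appendix).
dlam = (2 * u[i] * u[j] / np.sqrt(d[i] * d[j])
        - lam * (u[i] ** 2 / d[i] + u[j] ** 2 / d[j]))

# Central finite difference: perturb a_ij = a_ji by +/- eps.
eps = 1e-6
Ap, Am = A.copy(), A.copy()
Ap[i, j] += eps; Ap[j, i] += eps
Am[i, j] -= eps; Am[j, i] -= eps
fd = (second_mode(Ap)[0] - second_mode(Am)[0]) / (2 * eps)
assert abs(fd - dlam) < 1e-6

# Chain rule then gives the full sensitivity of Eq. (5); it is
# negative for this bottleneck edge, indicating a cut.
beta = np.log(0.5) / np.log(lam)
tau = 10.0
dH = (-np.log(0.5) / (lam * np.log(lam) ** 2)) * dlam / (beta + tau)
assert dH < 0
```

Note the sign of u_i u_j: the slow mode changes sign across the bottleneck, which is what drives the sensitivity negative there.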
Given a value of β_0, which is roughly the minimum half-life to consider for any eigenmode, we iterate the following:

1. Form the symmetric affinity matrix A, and initialize the current affinities A^c = A.

2. Set δ to be the median of the degrees d_1, ..., d_n, and form the symmetric matrix L = D^{-1/2} A^c D^{-1/2}, where D = diag(d_1, ..., d_n) is built from the current affinities.

3. Compute the eigenvectors u_k of L, with eigenvalues λ_k and half-lives β_k.

4. For each eigenvector u_k with half-life β_k larger than β_0, compute the half-life sensitivities S^k_{i,j} of Eq. (5) for each edge in the graph, and set a scale factor which, together with δ, provides a scale-invariant threshold on these sensitivities.

5. Do non-maximal suppression within each of the computed sensitivities. That is, suppress the sensitivity S^k_{i,j} if there is a strictly more negative value S^k_{m,j} (or S^k_{i,m}) for some vertex v_m in the neighbourhood of v_i (or v_j).

6. Compute the sum of S^k_{i,j} over all non-suppressed edges whose sensitivity falls below the threshold.

7. Select the eigenmode k for which this sum is maximal in magnitude.

8. Cut all edges in A^c (i.e. set their affinities to 0) for which S^k_{i,j} falls below the threshold and for which this sensitivity was not suppressed during non-maximal suppression.

9. If any new edges have been cut, go to 2. Otherwise stop.
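The loop above can be sketched as follows (illustrative only: the scale factor, threshold, and non-maximal suppression of steps 4-6 are reduced to a fixed threshold, so this simplifies the procedure; affinities are invented):

```python
import numpy as np

def eigencuts_sketch(A, tau=10.0, beta0=5.0, thresh=-1.0, max_iter=50):
    """Iteratively cut edges whose half-life sensitivity (Eq. 5) is
    strongly negative. Assumes no vertex becomes fully isolated."""
    A = A.copy()
    for _ in range(max_iter):
        d = A.sum(axis=0)
        L = A / np.sqrt(np.outer(d, d))
        lam, U = np.linalg.eigh(L)             # ascending eigenvalues
        cut_any = False
        for k in range(len(lam) - 2, -1, -1):  # skip lambda_1 = 1
            l = lam[k]
            if not (0.0 < l < 1.0 - 1e-12):    # skip stationary/negative
                continue
            beta = np.log(0.5) / np.log(l)
            if beta < beta0:                   # only long half-lives
                continue
            u = U[:, k]
            # Sensitivity dH/da_ij (Eq. 5) for every edge at once.
            dlam = (2 * np.outer(u, u) / np.sqrt(np.outer(d, d))
                    - l * np.add.outer(u**2 / d, u**2 / d))
            S = -np.log(0.5) / (l * np.log(l)**2) * dlam / (beta + tau)
            cut = (A > 0) & (S < thresh)
            if cut.any():
                A[cut] = 0.0                   # cut bottleneck edges
                cut_any = True
                break                          # one eigenmode per pass
        if not cut_any:
            break
    return A

A = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
A_cut = eigencuts_sketch(A)
# The weak (0,2) bottleneck edge is removed; strong edges survive.
```

On this toy graph the first pass cuts only the weak inter-pair edge; the second pass finds no eigenmode with 0 < λ < 1 and terminates, leaving two well separated clusters.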
Here steps 1-4 are as described previously, other than computing the scaling constants, which are used in step 8 to provide a scale-invariant threshold on the computed sensitivities. In step 4 we only consider eigenmodes with half-lives larger than β_0 because this typically eliminates the need to compute the sensitivities for the many modes with tiny values of β and because, due to the τ term in H(β) = log(β + τ), it is very rare for eigenvectors with half-lives smaller than β_0 to produce any sensitivity below the threshold.

In step 5 we perform a non-maximal suppression on the sensitivities for the k-th eigenvector. We have observed that at strong borders the computed sensitivities can fall below the threshold in a band along the border a few pixels thick. This non-maximal suppression allows us to thin this region. Otherwise, many small isolated fragments can be produced in the neighbourhood of such strong borders.

In step 6 we wish to select one particular eigenmode to base the edge cutting on at this iteration.
The reason for not considering all the modes simultaneously is that we have found the locations of the cuts can vary by a few pixels for different modes. If nearby edges are cut as a result of different eigenmodes, then small isolated fragments can result in the final clustering. Therefore we wish to select just one eigenmode to base cuts on each iteration. The particular eigenmode selected can, of course, vary from one iteration to the next.

The selection strategy in step 6 above picks out the mode which produces the largest linearized increment in H. That is, we compute the sum of (dH/da_{i,j}) Δa_{i,j}, where Δa_{i,j} = -a_{i,j} is the change of affinity for any edge left to be cut, and Δa_{i,j} = 0 otherwise. Other techniques for selecting a particular mode were also tried, and they all produced similar results.

This iterative cutting process must eventually terminate since, except for the last iteration, edges are cut each iteration and any cut edges are never uncut. When the process does terminate, the selected succession of cuts provides a modified affinity matrix A^c which has well separated clusters. For the final clustering result, we can use either a connected components algorithm or the K-means algorithm of [5] with K set to the number of modes having large half-lives.

6 Experiments

We compare the quality of EIGENCUTS with two other methods: a K-means based spectral clustering algorithm of [5] and an efficient segmentation algorithm proposed in [1] based on a pairwise region comparison function. Our strategy was to select thresholds that are likely to generate a small number of stable partitions.
We then varied these thresholds to test the quality of partitions. To allow for comparison with K-means, we needed to determine the number of clusters K a priori. We therefore set K to be the same as the number of clusters that EIGENCUTS generated. The cluster centers were initialized to be as orthogonal as possible [5].

The first two rows in Fig. 3 show results using EIGENCUTS. A crucial observation with EIGENCUTS is that, although the number of clusters changed slightly with a change in thresholds, the regions they defined were qualitatively preserved across the thresholds and corresponded to a naive observer's intuitive segmentation of the image. Notice in the random images the occluder is found as a cluster clearly separated from the background. The performance on the eye images is also interesting in that the largely uniform regions around the center of the eye remain as part of one cluster.

In comparison, both the K-means algorithm and the image segmentation algorithm of [1] (rows 3-6 in Fig. 3) show a tendency to divide uniform regions and give partitions that are neither stable nor intuitive, despite multiple restarts.

7 Discussion

We have demonstrated that the common piecewise constant approximation to eigenvectors arising in spectral clustering problems limits the applicability of previous methods to situations in which the clusters are only relatively weakly coupled. We have proposed a new edge cutting criterion which avoids this piecewise constant approximation. Bottleneck edges between distinct clusters are identified through the observed sensitivity of an eigenflow's half-life to changes in the edges' affinity weights. The basic algorithm we propose is computationally demanding in that the eigenvectors of the Markov matrix must be recomputed after each iteration of edge cutting.
However, the point of this algorithm is to simply demonstrate the partitioning that can be achieved through the computation of the sensitivity of eigenflow half-lives to changes in edge weights. More efficient updates of the eigenvalue computation, taking advantage of low-rank changes in the matrix L from one iteration to the next, or a multi-scale technique, are important areas for further study.

Figure 3: Each column refers to a different image in the dataset shown in Fig. 1. Pairs of rows correspond to results from applying: EIGENCUTS (Rows 1&2), K-means spectral clustering where K, the number of clusters, is determined by the results of EIGENCUTS (Rows 3&4), and Felzenszwalb & Huttenlocher (Rows 5&6).

Acknowledgements

We have benefited from discussions with Sven Dickinson, Sam Roweis, Sageev Oore and Francisco Estrada.

References

[1] P. Felzenszwalb and D. Huttenlocher. Efficiently computing a good segmentation. International Journal of Computer Vision, 1999.

[2] R. Kannan, S. Vempala and A. Vetta. On clusterings: good, bad and spectral. Proc. 41st Annual Symposium on Foundations of Computer Science, 2000.

[3] J. G. Kemeny and J. L. Snell. Finite Markov Chains. Van Nostrand, New York, 1960.

[4] M. Meila and J. Shi. A random walks view of spectral segmentation. Proc. International Workshop on AI and Statistics, 2001.

[5] A. Ng, M. Jordan and Y.
Weiss. On spectral clustering: analysis and an algorithm. NIPS, 2001.

[6] A. Ng, A. Zheng, and M. Jordan. Stable algorithms for link analysis. Proc. 24th Intl. ACM SIGIR Conference, 2001.

[7] A. Ng, A. Zheng, and M. Jordan. Link analysis, eigenvectors and stability. Proc. 17th Intl. IJCAI, 2001.

[8] P. Perona and W. Freeman. A factorization approach to grouping. European Conference on Computer Vision, 1998.

[9] A. Pothen. Graph partitioning algorithms with applications to scientific computing. Parallel Numerical Algorithms, D. E. Keyes et al. (eds.), Kluwer Academic Press, 1996.

[10] G. L. Scott and H. C. Longuet-Higgins. Feature grouping by relocalization of eigenvectors of the proximity matrix. Proc. British Machine Vision Conference, pp. 103-108, 1990.

[11] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000.

[12] N. Tishby and N. Slonim. Data clustering by Markovian relaxation and the information bottleneck method. NIPS, v. 13, MIT Press, 2001.

[13] Y. Weiss. Segmentation using eigenvectors: a unifying view. International Conference on Computer Vision, 1999.

Appendix

We compute the derivative of the log of the half-life β of an eigenvalue λ of L with respect to an element a_{i,j} of the affinity matrix A (Sec. 2). The half-life is defined as the power to which the eigenvalue λ must be raised to reduce it to half, i.e., λ^β = 1/2, so β = log(1/2)/log λ. As discussed in Section 5, we are interested in significant changes only in those half-lives which are relatively large compared to some minimum half-life β_0; eigenvectors with half-lives smaller than β_0 are effectively ignored. It is easy to show that

d log(β + τ)/d a_{i,j} = [1/(β + τ)] (dβ/dλ) (dλ/d a_{i,j}), with dβ/dλ = -log(1/2)/(λ log^2 λ).    (6)

Let u be the corresponding eigenvector of L, such that L u = λ u with u^T u = 1 (see Sec. 2). Then λ = u^T L u, and

dλ/d a_{i,j} = u^T (dL/d a_{i,j}) u,    (7)

where, for L = D^{-1/2} A D^{-1/2},

dL/d a_{i,j} = D^{-1/2} (dA/d a_{i,j}) D^{-1/2} - (1/2) D^{-1} (dD/d a_{i,j}) L - (1/2) L (dD/d a_{i,j}) D^{-1}.    (8)

Here dA/d a_{i,j} is a matrix of all zeros except for a value of 1 at locations (i,j) and (j,i); and dD/d a_{i,j} = E_{i,i} + E_{j,j}, where E_{i,i} denotes the matrix with a single 1 at diagonal location (i,i), having non-zero entries only on the diagonal.
Simplifying the expression further, we get

dλ/d a_{i,j} = u^T D^{-1/2} (dA/d a_{i,j}) D^{-1/2} u - (1/2) u^T [D^{-1} (dD/d a_{i,j}) L + L (dD/d a_{i,j}) D^{-1}] u.

Using the fact that L u = λ u, and that both dD/d a_{i,j} and D are diagonal, the above equation reduces to:

dλ/d a_{i,j} = 2 u_i u_j/sqrt(d_i d_j) - λ (u_i^2/d_i + u_j^2/d_j),    (9)

where d_i and d_j are the degrees of the nodes v_i and v_j (stacked as elements on the diagonal matrix D). The scalar form of this expression is used in Eq. (5).
", "award": [], "sourceid": 2183, "authors": [{"given_name": "Chakra", "family_name": "Chennubhotla", "institution": null}, {"given_name": "Allan", "family_name": "Jepson", "institution": null}]}