{"title": "Improved Moves for Truncated Convex Models", "book": "Advances in Neural Information Processing Systems", "page_first": 889, "page_last": 896, "abstract": "We consider the problem of obtaining the approximate maximum a posteriori estimate of a discrete random field characterized by pairwise potentials that form a truncated convex model. For this problem, we propose an improved st-mincut based move making algorithm. Unlike previous move making approaches, which either provide a loose bound or no bound on the quality of the solution (in terms of the corresponding Gibbs energy), our algorithm achieves the same guarantees as the standard linear programming (LP) relaxation. Compared to previous approaches based on the LP relaxation, e.g. interior-point algorithms or tree-reweighted message passing (TRW), our method is faster as it uses only the efficient st-mincut algorithm in its design. Furthermore, it directly provides us with a primal solution (unlike TRW and other related methods which attempt to solve the dual of the LP). We demonstrate the effectiveness of the proposed approach on both synthetic and standard real data problems. Our analysis also opens up an interesting question regarding the relationship between move making algorithms (such as $\\alpha$-expansion and the algorithms presented in this paper) and the randomized rounding schemes used with convex relaxations. We believe that further explorations in this direction would help design efficient algorithms for more complex relaxations.", "full_text": "Improved Moves for Truncated Convex Models

M. Pawan Kumar, Dept. of Engineering Science, University of Oxford, pawan@robots.ox.ac.uk

P.H.S. Torr, Dept. of Computing, Oxford Brookes University, philiptorr@brookes.ac.uk

Abstract

We consider the problem of obtaining the approximate maximum a posteriori estimate of a discrete random field characterized by pairwise potentials that form a truncated convex model. 
For this problem, we propose an improved st-MINCUT\nbased move making algorithm. Unlike previous move making approaches, which\neither provide a loose bound or no bound on the quality of the solution (in terms\nof the corresponding Gibbs energy), our algorithm achieves the same guaran-\ntees as the standard linear programming (LP) relaxation. Compared to previ-\nous approaches based on the LP relaxation, e.g. interior-point algorithms or tree-\nreweighted message passing (TRW), our method is faster as it uses only the ef\ufb01-\ncient st-MINCUT algorithm in its design. Furthermore, it directly provides us with\na primal solution (unlike TRW and other related methods which solve the dual\nof the LP). We demonstrate the effectiveness of the proposed approach on both\nsynthetic and standard real data problems.\nOur analysis also opens up an interesting question regarding the relationship be-\ntween move making algorithms (such as \u03b1-expansion and the algorithms pre-\nsented in this paper) and the randomized rounding schemes used with convex re-\nlaxations. We believe that further explorations in this direction would help design\nef\ufb01cient algorithms for more complex relaxations.\n\n1 Introduction\nDiscrete random \ufb01elds are a powerful tool for formulating several problems in Computer Vision\nsuch as stereo reconstruction, segmentation, image stitching and image denoising [22]. Given data\nD (e.g. an image or a video), random \ufb01elds model the probability of a set of random variables v,\ni.e. either the joint distribution of v and D as in the case of Markov random \ufb01elds (MRF) [2] or the\nconditional distribution of v given D as in the case of conditional random \ufb01elds (CRF) [18]. The\nword \u2018discrete\u2019 refers to the fact that each of the random variables va \u2208 v = {v0,\u00b7\u00b7\u00b7 , vn\u22121} can\ntake one label from a discrete set l = {l0,\u00b7\u00b7\u00b7 , lh\u22121}. 
Throughout this paper, we will assume an MRF framework while noting that our results are equally applicable to a CRF.
An MRF defines a neighbourhood relationship (denoted by E) over the random variables such that (a, b) ∈ E if, and only if, va and vb are neighbouring random variables. Given an MRF, a labelling refers to a function f such that f : {0, · · · , n − 1} → {0, · · · , h − 1}. In other words, the function f assigns to each random variable va ∈ v a label lf(a) ∈ l. The probability of the labelling is given by the following Gibbs distribution: Pr(f, D|θ) = exp(−Q(f, D; θ))/Z(θ), where θ is the parameter of the MRF and Z(θ) is the normalization constant (i.e. the partition function). Assuming a pairwise MRF, the Gibbs energy is given by:

Q(f, D; θ) = Σ_{va ∈ v} θ1_a;f(a) + Σ_{(a,b) ∈ E} θ2_ab;f(a)f(b),    (1)

where θ1_a;f(a) and θ2_ab;f(a)f(b) are the unary and pairwise potentials respectively. The superscripts '1' and '2' indicate that the unary potential depends on the labelling of one random variable at a time, while the pairwise potential depends on the labelling of two neighbouring random variables. Clearly, the labelling f which maximizes the posterior Pr(f, D|θ) can be obtained by minimizing the Gibbs energy. The problem of obtaining such a labelling f is known as maximum a posteriori (MAP) estimation. In this paper, we consider the problem of MAP estimation of random fields where the pairwise potentials are defined by truncated convex models [4]. 
Formally speaking, the pairwise potentials are of the form

θ2_ab;f(a)f(b) = wab min{d(f(a) − f(b)), M},    (2)

where wab ≥ 0 for all (a, b) ∈ E, d(·) is a convex function and M > 0 is the truncation factor. Recall that, by the definition of Ishikawa [9], a function d(·) defined at discrete points (specifically over integers) is convex if, and only if, d(x + 1) − 2d(x) + d(x − 1) ≥ 0 for all x ∈ Z. It is assumed that d(x) = d(−x); otherwise, it can be replaced by (d(x) + d(−x))/2 without changing the Gibbs energy of any of the possible labellings of the random field [23]. Examples of pairwise potentials of this form include the truncated linear metric and the truncated quadratic semi-metric, i.e.

θ2_ab;f(a)f(b) = wab min{|f(a) − f(b)|, M},   θ2_ab;f(a)f(b) = wab min{(f(a) − f(b))², M}.    (3)

Before proceeding further, we would like to note here that the method presented in this paper can be trivially extended to truncated submodular models (a generalization of truncated convex models). However, we will restrict our discussion to truncated convex models for two reasons: (i) it makes the analysis of our approach easier; and (ii) truncated convex pairwise potentials are commonly used in several problems such as stereo reconstruction, image denoising and inpainting [22]. Note that in the absence of a truncation factor (i.e. when we only have convex pairwise potentials) the exact MAP estimate can be obtained efficiently using the methods of Ishikawa [9] or Veksler [23]. However, minimizing the Gibbs energy in the presence of a truncation factor is well known to be NP-hard. Given their widespread use, it is not surprising that several approximate MAP estimation algorithms have been proposed in the literature for the truncated convex model. 
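To make these definitions concrete, here is a minimal Python sketch (all function names are ours, purely illustrative) of the truncated potentials in equations (2) and (3), together with Ishikawa's discrete convexity test:

```python
# Illustrative sketch (names are ours) of the truncated convex pairwise
# potentials of equations (2)-(3) and Ishikawa's discrete convexity test.

def is_discrete_convex(d, lo=-20, hi=20):
    """d(x + 1) - 2 d(x) + d(x - 1) >= 0 for all integers x in [lo, hi)."""
    return all(d(x + 1) - 2 * d(x) + d(x - 1) >= 0 for x in range(lo, hi))

def truncated_potential(d, w_ab, M):
    """theta2_ab;ij = w_ab * min{d(i - j), M}, as a function of labels i and j."""
    return lambda i, j: w_ab * min(d(i - j), M)

linear = truncated_potential(abs, w_ab=1.0, M=5)                   # truncated linear metric
quadratic = truncated_potential(lambda x: x * x, w_ab=1.0, M=25)   # truncated quadratic semi-metric

assert is_discrete_convex(abs) and is_discrete_convex(lambda x: x * x)
assert not is_discrete_convex(lambda x: min(abs(x), 5))  # truncation breaks convexity
assert linear(3, 0) == 3.0      # below the truncation: d(3) = 3 < M
assert linear(12, 0) == 5.0     # truncated at M = 5
assert quadratic(9, 2) == 25.0  # (9 - 2)^2 = 49, truncated at M = 25
```

The last convexity assertion illustrates the distinction drawn above: the truncated potential itself is not convex in Ishikawa's sense, which is why the truncated case is NP-hard while the purely convex case is exactly solvable [9, 23].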
Below, we review\nsuch algorithms.\n1.1 Related Work\nGiven a random \ufb01eld with truncated convex pairwise potentials, Felzenszwalb and Huttenlocher [6]\nimproved the ef\ufb01ciency of the popular max-product belief propagation (BP) algorithm [19] to obtain\nthe MAP estimate. BP provides the exact MAP estimate when the neighbourhood structure E of the\nMRF de\ufb01nes a tree (i.e. it contains no loops). However, for a general MRF, BP provides no bounds on\nthe quality of the approximate MAP labelling obtained. In fact, it is not even guaranteed to converge.\n\nThe results of [6] can be used directly to speed-up the tree-reweighted message passing algorithm\n(TRW) [24] and its sequential variant TRW-S [10]. Both TRW and TRW-S attempt to optimize the\nLagrangian dual of the standard linear programming (LP) relaxation of the MAP estimation prob-\nlem [5, 15, 21, 24]. Unlike BP and TRW, TRW-S is guaranteed to converge. However, it is well-\nknown that TRW-S and other related algorithms [7, 13, 25] suffer from the following problems: (i)\nthey are slower than algorithms based on ef\ufb01cient graph-cuts [22]; and (ii) they only provide a dual\nsolution [10]. The primal solution (i.e. the labelling f ) is often obtained from the dual solution in an\nunprincipled manner1. Furthermore, it was also observed that, unlike graph-cuts based approaches,\nTRW-S does not work well when the random \ufb01eld models long range interactions (i.e. when the\nneighbourhood relationship E is highly connected) [11]. However, due to the lack of experimental\nresults, it is not clear whether this observation applies to the methods described in [7, 13, 25].\n\nAnother way of solving the LP relaxation is to resort to interior point algorithms [3]. Although\ninterior point algorithms are much slower in practice than TRW-S, they have the advantage of pro-\nviding the primal (possibly fractional) solution of the LP relaxation. Chekuri et al. 
[5] showed that when using certain randomized rounding schemes on the primal solution (to get the final labelling f), the following guarantees hold true: (i) for the Potts model (i.e. d(f(a) − f(b)) = |f(a) − f(b)| and M = 1), we obtain a multiplicative bound2 of 2; (ii) for the truncated linear metric (i.e. d(f(a) − f(b)) = |f(a) − f(b)| and a general M > 0), we obtain a multiplicative bound of 2 + √2; and (iii) for the truncated quadratic semi-metric (i.e. d(f(a) − f(b)) = (f(a) − f(b))² and a general M > 0), we obtain a multiplicative bound of O(√M).

1We note here that the recently proposed algorithm in [20] directly provides the primal solution. However, it is much slower than the methods which solve the dual.

2Let f be the labelling obtained by an algorithm A (e.g. in this case the LP relaxation followed by the rounding scheme) for a class of MAP estimation problems (e.g. in this case when the pairwise potentials form a Potts model). Let f∗ be the optimal labelling. The algorithm A is said to achieve a multiplicative bound of σ if, for every instance in the class of MAP estimation problems, the following holds true:

E[Q(f, D; θ) / Q(f∗, D; θ)] ≤ σ,

where E(·) denotes the expectation of its argument under the rounding scheme.

Initialization
- Initialize the labelling to some function f1. For example, f1(a) = 0 for all va ∈ v.
Iteration
- Choose an interval Im = [im + 1, jm] where (jm − im) = L such that d(L) ≥ M.
- Move from the current labelling fm to a new labelling fm+1 such that
fm+1(a) = fm(a) or fm+1(a) ∈ Im, ∀va ∈ v.
The new labelling is obtained by solving the st-MINCUT problem on a graph described in § 2.1.
Termination
- Stop when there is no further decrease in the Gibbs energy for any interval Im.

Table 1: Our Algorithm. As is typical with move making methods, our approach iteratively goes from one labelling to the next by solving an st-MINCUT problem. It converges when there remain no moves which reduce the Gibbs energy further.

The algorithms most related to our approach are the so-called move making methods, which rely on solving a series of graph-cut (specifically st-MINCUT) problems. Move making algorithms start with an initial labelling f0 and iteratively minimize the Gibbs energy by moving to a better labelling. At each iteration, (a subset of) random variables have the option of either retaining their old label or taking a new label from a subset of the labels l. For example, in the αβ-swap algorithm [4] the variables currently labelled lα or lβ can either retain their labels or swap them (i.e. some variables labelled lα can be relabelled as lβ and vice versa). The recently proposed range move algorithm [23] modifies this approach such that any variable currently labelled li where i ∈ [α, β] can be assigned any label lj where j ∈ [α, β]. Note that the new label lj can be different from the old label li, i.e. i ≠ j. Both these algorithms (i.e. αβ-swap and range move) do not provide any guarantees on the quality of the solution.

In contrast, the α-expansion algorithm [4] (where each variable can either retain its label or get assigned the label lα at an iteration) provides a multiplicative bound of 2 for the Potts model and 2M for the truncated linear metric. Gupta and Tardos [8] generalized the α-expansion algorithm for the truncated linear metric and obtained a multiplicative bound of 4. 
Komodakis and Tziritas [14] designed a primal-dual algorithm which provides a bound of 2M for the truncated quadratic semi-metric. Note that these bounds are inferior to the bounds obtained by the LP relaxation. However, all the above move making algorithms use only a single st-MINCUT at each iteration and are hence much faster than interior point algorithms, TRW, TRW-S and BP.

1.2 Our Results
We further extend the approach of Gupta and Tardos [8] in two ways (section 2). The first extension allows us to handle any truncated convex model (and not just the truncated linear metric). The second extension allows us to consider a potentially larger subset of labels at each iteration compared to [8]. As will be seen in the subsequent analysis (§2.2), these two extensions allow us to solve the MAP estimation problem efficiently using st-MINCUT whilst obtaining the same guarantees as the LP relaxation [5]. Furthermore, our approach does not suffer from the problems of TRW-S mentioned above. In order to demonstrate its practical use, we provide a favourable comparison of our method with several state of the art MAP estimation algorithms (section 3).

2 Description of the Algorithm
Table 1 describes the main steps of our approach. Note that, unlike the methods described in [4, 23], we will not be able to obtain the optimal move at each iteration. In other words, if in the mth iteration we move from labelling fm to fm+1, then it is possible that there exists another labelling f′m+1 such that f′m+1(a) = fm(a) or f′m+1(a) ∈ Im for all va ∈ v, and Q(f′m+1, D; θ) < Q(fm+1, D; θ). However, our analysis in the next section shows that we are still able to reduce the Gibbs energy sufficiently at each iteration so as to obtain the guarantees of the LP relaxation.
We now turn our attention to designing a method of moving from labelling fm to fm+1. 
Our approach relies on constructing a graph such that every st-cut on the graph corresponds to a labelling f′ of the random variables which satisfies: f′(a) = fm(a) or f′(a) ∈ Im, for all va ∈ v. The new labelling fm+1 is obtained in two steps: (i) we obtain a labelling f′ which corresponds to the st-MINCUT on our graph; and (ii) we choose the new labelling fm+1 as

fm+1 = f′ if Q(f′, D; θ) ≤ Q(fm, D; θ), and fm+1 = fm otherwise.    (4)

Below, we provide the details of the graph construction.
2.1 Graph Construction
At each iteration of our algorithm, we are given an interval Im = [im + 1, jm] of L labels (i.e. (jm − im) = L) where d(L) ≥ M. We also have the current labelling fm for all the random variables. We construct a directed weighted graph (with non-negative weights) Gm = {Vm, Em, cm(·,·)} such that for each va ∈ v, we define vertices {aim+1, aim+2, · · · , ajm} ∈ Vm. In addition, as is the case with every st-MINCUT problem, there are two additional vertices called terminals, which we denote by s (the source) and t (the sink). The edges e ∈ Em with capacity (i.e. weight) cm(e) are of two types: (i) those that represent the unary potentials of a labelling corresponding to an st-cut in the graph; and (ii) those that represent the pairwise potentials of the labelling.

Figure 1: Part of the graph Gm containing the terminals and the vertices corresponding to the variable va. 
The edges which represent the unary potential of the new labelling are also shown.
Representing Unary Potentials For all random variables va ∈ v, we define the following edges which belong to the set Em: (i) for all k ∈ [im + 1, jm), edges (ak, ak+1) have capacity cm(ak, ak+1) = θ1_a;k; (ii) for all k ∈ [im + 1, jm), edges (ak+1, ak) have capacity cm(ak+1, ak) = ∞; (iii) edges (ajm, t) have capacity cm(ajm, t) = θ1_a;jm; (iv) edges (t, ajm) have capacity cm(t, ajm) = ∞; (v) edges (s, aim+1) have capacity cm(s, aim+1) = θ1_a;fm(a) if fm(a) ∉ Im and ∞ otherwise; and (vi) edges (aim+1, s) have capacity cm(aim+1, s) = ∞.
Fig. 1 shows the above edges together with their capacities for one random variable va. Note that there are two types of edges in the above set: (i) with finite capacity; and (ii) with infinite capacity. Any st-cut with finite cost3 contains only one of the finite capacity edges for each random variable va. This is because if an st-cut included more than one finite capacity edge, then by construction it must include at least one infinite capacity edge, thereby making its cost infinite [9, 23]. We interpret a finite cost st-cut as a relabelling of the random variables as follows:

f′(a) = k, if the st-cut includes edge (ak, ak+1) where k ∈ [im + 1, jm);
f′(a) = jm, if the st-cut includes edge (ajm, t);
f′(a) = fm(a), if the st-cut includes edge (s, aim+1).    (5)

Note that the sum of the unary potentials for the labelling f′ is exactly equal to the cost of the st-cut over the edges defined above. However, the Gibbs energy of the labelling also includes the sum of the pairwise potentials (as shown in equation (1)). Unlike the unary potentials, we will not be able to model the sum of pairwise potentials exactly. 
However, we will be able to obtain its upper bound using the cost of the st-cut over the following edges.
Representing Pairwise Potentials For all neighbouring random variables va and vb, i.e. (a, b) ∈ E, we define edges (ak, bk′) ∈ Em where either one or both of k and k′ belong to the set (im + 1, jm] (i.e. at least one of them is different from im + 1). The capacity of these edges is given by

cm(ak, bk′) = (wab/2) (d(k − k′ + 1) − 2d(k − k′) + d(k − k′ − 1)).    (6)

The above capacity is non-negative due to the fact that wab ≥ 0 and d(·) is convex. Furthermore, we also add the following edges:

cm(ak, ak+1) = (wab/2) (d(L − k + im) + d(k − im)), ∀(a, b) ∈ E, k ∈ [im + 1, jm),
cm(bk′, bk′+1) = (wab/2) (d(L − k′ + im) + d(k′ − im)), ∀(a, b) ∈ E, k′ ∈ [im + 1, jm),
cm(ajm, t) = cm(bjm, t) = (wab/2) d(L), ∀(a, b) ∈ E.    (7)

3Recall that the cost of an st-cut is the sum of the capacities of the edges whose starting point lies in the set of vertices containing the source s and whose ending point lies in the set of vertices containing the sink t.

Figure 2: (a) Edges that are used to represent the pairwise potentials of two neighbouring random variables va and vb are shown. Undirected edges indicate that there are opposing edges in both directions with equal capacity (as given by equation (6)). Directed dashed edges, with capacities shown in equation (7), are added to ensure that the graph models the convex pairwise potentials correctly. (b) An additional edge is added when fm(a) ∈ Im and fm(b) ∉ Im. The term κab = wab d(L). (c) A similar additional edge is added when fm(a) ∉ Im and fm(b) ∈ Im. 
(d) Five edges, with capacities as shown in equation (8), are added when fm(a) ∉ Im and fm(b) ∉ Im. Undirected edges indicate the presence of opposing edges with equal capacity.

Note that in [23] the graph obtained by the edges in equations (6) and (7) was used to find the exact MAP estimate for convex pairwise potentials. A proof that the above edges exactly model convex pairwise potentials up to an additive constant κab = wab d(L) can be found in [17]. However, we are concerned with the NP-hard case where the pairwise potentials are truncated. In order to model this case, we incorporate some additional edges into the above set. These additional edges are best described by considering the following three cases for all (a, b) ∈ E.
• If fm(a) ∈ Im and fm(b) ∈ Im, then we do not add any more edges to the graph (see Fig. 2(a)).
• If fm(a) ∈ Im and fm(b) ∉ Im, then we add an edge (aim+1, bim+1) with capacity wab M + κab/2, where κab = wab d(L) is a constant for a given pair of neighbouring random variables (a, b) ∈ E (see Fig. 2(b)). Similarly, if fm(a) ∉ Im and fm(b) ∈ Im, then we add an edge (bim+1, aim+1) with capacity wab M + κab/2 (see Fig. 2(c)).
• If fm(a) ∉ Im and fm(b) ∉ Im, we introduce a new vertex pab. Using this vertex pab, five edges are defined with the following capacities (see Fig. 2(d)):

cm(aim+1, pab) = cm(pab, aim+1) = cm(bim+1, pab) = cm(pab, bim+1) = wab M + κab/2,
cm(s, pab) = θ2_ab;fm(a)fm(b) + κab.    (8)

This completes our graph construction. Given the graph Gm, we solve the st-MINCUT problem, which provides us with a labelling f′ as described in equation (5). The new labelling fm+1 is obtained using equation (4). 
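The structure of one move can be made explicit with a minimal Python sketch. For illustration we enumerate the move space f(a) ∈ {fm(a)} ∪ Im by brute force on a toy chain instead of solving the st-MINCUT (all names and the toy instance are ours; exhaustive search is only feasible at this tiny scale, but the acceptance rule of equation (4) is the same):

```python
from itertools import product

# Brute-force illustration (toy scale only; names and instance are ours) of the
# interval moves of Table 1: each variable either keeps its current label or
# takes a label from the interval I_m, and the move is accepted per equation (4).
# A real implementation minimizes over this move space with an st-MINCUT.

def gibbs_energy(f, unary, edges, w, d, M):
    """Equation (1) with truncated convex pairwise potentials w * min{d(.), M}."""
    energy = sum(unary[a][f[a]] for a in range(len(f)))
    energy += sum(w * min(d(f[a] - f[b]), M) for a, b in edges)
    return energy

def interval_move(f_m, interval, unary, edges, w, d, M):
    """Best labelling with f(a) = f_m(a) or f(a) in `interval`, accepted per eq. (4)."""
    choices = [sorted({f_m[a], *interval}) for a in range(len(f_m))]
    best = min(product(*choices),
               key=lambda f: gibbs_energy(f, unary, edges, w, d, M))
    if gibbs_energy(best, unary, edges, w, d, M) <= gibbs_energy(f_m, unary, edges, w, d, M):
        return list(best)
    return f_m

# Toy chain of 3 variables and 6 labels; truncated linear model (d = |.|, M = 2).
unary = [[0, 4, 4, 4, 4, 4],   # variable 0 prefers label 0
         [4, 4, 4, 0, 4, 4],   # variable 1 prefers label 3
         [4, 4, 4, 4, 0, 4]]   # variable 2 prefers label 4
edges = [(0, 1), (1, 2)]
f = [0, 0, 0]
for lo in range(0, 6, 2):      # sweep intervals of length L = 2 over the labels
    f = interval_move(f, range(lo, lo + 2), unary, edges, w=1.0, d=abs, M=2)
print(f)  # [0, 3, 4]
```

On this instance the sweep reduces the Gibbs energy from 8 to 3; the graph of § 2.1 lets the st-MINCUT search the same move space without enumeration, at the cost of only upper-bounding the pairwise terms.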
Note that our graph construction is similar to that of Gupta and Tardos [8] with two notable exceptions: (i) we can handle any general truncated convex model, and not just the truncated linear metric as in the case of [8]. This is achieved in part by using the graph construction of [23]; and (ii) we have the freedom to choose the value of L, while [8] fixed this value to M. A logical choice would be to use that value of L which minimizes the worst case multiplicative bound for a particular class of problems. The following properties provide such a value of L for both the truncated linear and the truncated quadratic models. Our worst case multiplicative bounds are exactly those achieved by the LP relaxation (see [5]).

2.2 Properties of the Algorithm
For the above graph construction, the following properties hold true:
• The cost of the st-MINCUT provides an upper bound on the Gibbs energy of the labelling f′ and hence on the Gibbs energy of fm+1 (see section 2.2 of [17]).
• For the truncated linear metric, our algorithm obtains a multiplicative bound of 2 + √2 using L = √2M (see section 3, Theorem 1, of [17]). Note that this bound is better than those obtained by α-expansion [4] (i.e. 2M) and its generalization [8] (i.e. 4).
• For the truncated quadratic semi-metric, our algorithm obtains a multiplicative bound of O(√M) using L = √M (see section 3, Theorem 2, of [17]). Note that both α-expansion and the approach of Gupta and Tardos provide no bounds for the above case. The primal-dual method of [14] obtains a bound of 2M, which is clearly inferior to our guarantees.

3 Experiments
We tested our approach using both synthetic and standard real data. Below, we describe the experimental setup and the results obtained in detail.
3.1 Synthetic Data
Experimental Setup. We used 100 random fields for both the truncated linear and truncated quadratic models. 
The variables v and neighbourhood relationship E of the random \ufb01elds described\na 4-connected grid graph of size 50 \u00d7 50. Note that 4-connected grid graphs are widely used to\nmodel several problems in Computer Vision [22]. Each variable was allowed to take one of 20 pos-\nsible labels, i.e. l = {l0, l1,\u00b7\u00b7\u00b7 , l19}. The parameters of the random \ufb01eld were generated randomly.\nSpeci\ufb01cally, the unary potentials \u03b81\na;i were sampled uniformly from the interval [0, 10] while the\nweights wab, which determine the pairwise potentials, were sampled uniformly from [0, 5]. The\nparameter M was also chosen randomly while taking care that d(5) \u2264 M \u2264 d(10).\nResults Fig. 3 shows the results obtained by our approach and \ufb01ve other state of the art algorithms:\n\u03b1\u03b2-swap, \u03b1-expansion, BP, TRW-S and the range move algorithm of [23]. We used publicly available\ncode for all previously proposed approaches with the exception of the range move algorithm4. As can\nbe seen from the \ufb01gure, the most accurate approach is the method proposed in this paper, followed\nclosely by the range move algorithm. Recall that, unlike range move, our algorithm is guaranteed to\nprovide the same worst case multiplicative bounds as the LP relaxation. As expected, both the range\nmove algorithm and our method are slower than \u03b1\u03b2-swap and \u03b1-expansion (since each iteration\ncomputes an st-MINCUT on a larger graph). However, they are faster than TRW-S, which attempts to\nminimize the LP relaxation, and BP. We note here that our implementation does not use any clever\ntricks to speed up the max-\ufb02ow algorithm (such as those described in [1]) which can potentially\ndecrease the running time by orders of magnitude.\n3.2 Real Data - Stereo Reconstruction\nGiven two epipolar recti\ufb01ed images D1 and D2 of the same scene, the problem of stereo reconstruc-\ntion is to obtain a correspondence between the pixels of the images. 
This problem can be modelled using a random field whose variables correspond to pixels of one image (say D1) and take labels from a set of disparities l = {0, 1, · · · , h − 1}. A disparity value i for a random variable a denoting pixel (x, y) in D1 indicates that its corresponding pixel lies at (x + i, y) in the second image.
For the above random field formulation, the unary potentials were defined as in [22] and were truncated at 15. As is typically the case, we chose the neighbourhood relationship E to define a 4-neighbourhood grid graph. The number of disparities h was set to 20. We experimented using the following truncated convex potentials:

θ2_ab;ij = 50 min{|i − j|, 10},   θ2_ab;ij = 50 min{(i − j)², 100}.    (9)

The above form of pairwise potentials encourages neighbouring pixels to take similar disparity values, which corresponds to our expectation of finding smooth surfaces in natural images. Truncation of pairwise potentials is essential to avoid oversmoothing, as observed in [4, 23]. Note that using spatially varying weights wab provides better results. However, the main aim of this experiment is to demonstrate the accuracy and speed of our approach and not to design the best possible Gibbs energy.

4When using α-expansion with the truncated quadratic semi-metric, all edges with negative capacities in the graph construction were removed, similar to the experiments in [22].

Figure 3: Results of the synthetic experiment. (a) Truncated linear metric. (b) Truncated quadratic semi-metric. The x-axis shows the time taken in seconds. The y-axis shows the average Gibbs energy obtained over all 100 random fields using the six algorithms. The lower blue curve is the value of the dual obtained by TRW-S. In both cases, our method and the range move algorithm provide the most accurate solution and are faster than TRW-S and BP.

Table 2 provides the value of the Gibbs energy and the total time taken by all the approaches for a standard stereo pair (Teddy). As in the case of the synthetic experiments, the range move algorithm and our method provide the most accurate solutions while taking less time than TRW-S and BP. Additional experiments on other stereo pairs with similar observations about the performances of the various algorithms can be found in [17]. However, we would again like to emphasize that, unlike our method, the range move algorithm provides no theoretical guarantees about the quality of the solution.

Algorithm    | Energy-1 | Time-1(s) | Energy-2 | Time-2(s)
αβ-swap      | 3678200  | 18.48     | 3707268  | 20.25
α-expansion  | 3677950  | 11.73     | 3687874  | 8.79
TRW-S        | 3677578  | 131.65    | 3679563  | 332.94
BP           | 3789486  | 272.06    | 5180705  | 331.36
Range Move   | 3686844  | 97.23     | 3679552  | 141.78
Our Approach | 3613003  | 120.14    | 3679552  | 191.20

Table 2: The energy obtained and the time taken by the algorithms used in the stereo reconstruction experiment with the Teddy image pair. Columns 2 and 3: truncated linear metric. Columns 4 and 5: truncated quadratic semi-metric.

4 Discussion
We have presented an st-MINCUT based algorithm for obtaining the approximate MAP estimate of discrete random fields with truncated convex pairwise potentials. Our method improves the multiplicative bound for the truncated linear metric compared to [4, 8] and provides the best known bound for the truncated quadratic semi-metric. Due to the use of only the st-MINCUT problem in its design, it is faster than previous approaches based on the LP relaxation. In fact, its speed can be further improved by a large factor using clever techniques such as those described in [12] (for convex unary potentials) and/or [1] (for general unary potentials). Furthermore, it overcomes the well-known deficiencies of TRW and its variants. 
Experiments on synthetic and real data problems demonstrate its effectiveness compared to several state of the art algorithms.
The analysis in §2.2 shows that, for the truncated linear and truncated quadratic models, the bound achieved by our move making algorithm over intervals of any length L is equal to that of rounding the LP relaxation's optimal solution using the same intervals [5]. This equivalence also extends to the Potts model (in which case α-expansion provides the same bound as the LP relaxation). A natural question would be to ask about the relationship between move making algorithms and the rounding schemes used in convex relaxations. Note that despite recent efforts [14] which analyze certain move making algorithms in the context of primal-dual approaches for the LP relaxation, not many results are known about their connection with randomized rounding schemes. Although the discussion in §2.2 cannot be trivially generalized to all random fields, it offers a first step towards answering this question. We believe that further exploration in this direction would help improve the understanding of the nature of the MAP estimation problem, e.g. how to derandomize approaches based on convex relaxations. Furthermore, it would also help design efficient move making algorithms for more complex relaxations such as those described in [16].

Acknowledgments The first author was supported by the EU CLASS project and EPSRC grant EP/C006631/1(P). The second author is in receipt of a Royal Society Wolfson Research Merit Award, and would like to acknowledge support from the Royal Society and Wolfson foundation.

References

[1] K. Alahari, P. Kohli, and P. H. S. Torr. Reduce, reuse & recycle: Efficiently solving multi-label MRFs. In CVPR, 2008.
[2] J. Besag. On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society, Series B, 48:259-302, 1986.
[3] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[4] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. PAMI, 23(11):1222-1239, 2001.
[5] C. Chekuri, S. Khanna, J. Naor, and L. Zosin. A linear programming formulation and approximation algorithms for the metric labelling problem. SIAM Journal on Disc. Math., 18(3):606-635, 2005.
[6] P. Felzenszwalb and D. Huttenlocher. Efficient belief propagation for early vision. In CVPR, 2004.
[7] A. Globerson and T. Jaakkola. Fixing max-product: Convergent message passing for MAP LP-relaxations. In NIPS, 2007.
[8] A. Gupta and E. Tardos. A constant factor approximation algorithm for a class of classification problems. In STOC, 2000.
[9] H. Ishikawa. Exact optimization for Markov random fields with convex priors. PAMI, 25(10):1333-1336, October 2003.
[10] V. Kolmogorov. Convergent tree-reweighted message passing for energy minimization. PAMI, 28(10):1568-1583, 2006.
[11] V. Kolmogorov and C. Rother. Comparison of energy minimization algorithms for highly connected graphs. In ECCV, pages II: 1-15, 2006.
[12] V. Kolmogorov and A. Shioura. New algorithms for the dual of the convex cost network flow problem with applications to computer vision. Technical report, University College London, 2007.
[13] N. Komodakis, N. Paragios, and G. Tziritas. MRF optimization via dual decomposition: Message-passing revisited. In ICCV, 2007.
[14] N. Komodakis and G. Tziritas. Approximate labeling via graph-cuts based on linear programming. PAMI, 2007.
[15] A. Koster, C. van Hoesel, and A. Kolen. The partial constraint satisfaction problem: Facets and lifting theorems. Operations Research Letters, 23(3-5):89-97, 1998.
[16] M. P. Kumar, V. Kolmogorov, and P. H. S. Torr. An analysis of convex relaxations for MAP estimation. In NIPS, 2007.
[17] M. P. Kumar and P. H. S. Torr. Improved moves for truncated convex models. Technical report, University of Oxford, 2008.
[18] J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labelling sequence data. In ICML, 2001.
[19] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.
[20] P. Ravikumar, A. Agarwal, and M. Wainwright. Message-passing for graph-structured linear programs: Proximal projections, convergence and rounding schemes. In ICML, 2008.
[21] M. Schlesinger. Sintaksicheskiy analiz dvumernykh zritelnikh singnalov v usloviyakh pomekh (syntactic analysis of two-dimensional visual signals in noisy conditions). Kibernetika, 4:113-130, 1976.
[22] R. Szeliski, R. Zabih, D. Scharstein, O. Veksler, V. Kolmogorov, A. Agarwala, M. Tappen, and C. Rother. A comparative study of energy minimization methods for Markov random fields with smoothness-based priors. PAMI, 2008.
[23] O. Veksler. Graph cut based optimization for MRFs with truncated convex priors. In CVPR, 2007.
[24] M. Wainwright, T. Jaakkola, and A. Willsky. MAP estimation via agreement on trees: Message passing and linear programming. IEEE Trans. on Information Theory, 51(11):3697-3717, 2005.
[25] Y. Weiss, C. Yanover, and T. Meltzer. MAP estimation, linear programming and belief propagation with convex free energies. In UAI, 2007.
", "award": [], "sourceid": 451, "authors": [{"given_name": "Philip", "family_name": "Torr", "institution": null}, {"given_name": "M.", "family_name": "Kumar", "institution": null}]}