{"title": "Analysis of Contour Motions", "book": "Advances in Neural Information Processing Systems", "page_first": 913, "page_last": 920, "abstract": "", "full_text": "Analysis of Contour Motions\n\nCe Liu William T. Freeman Edward H. Adelson\nComputer Science and Arti\ufb01cial Intelligence Laboratory\n\nMassachusetts Institute of Technology\n\nCambridge, MA 02139, USA\n\n{celiu,billf,adelson}@csail.mit.edu\n\nAbstract\n\nA reliable motion estimation algorithm must function under a wide range of con-\nditions. One regime, which we consider here, is the case of moving objects with\ncontours but no visible texture. Tracking distinctive features such as corners can\ndisambiguate the motion of contours, but spurious features such as T-junctions\ncan be badly misleading. It is dif\ufb01cult to determine the reliability of motion from\nlocal measurements, since a full rank covariance matrix can result from both real\nand spurious features. We propose a novel approach that avoids these points al-\ntogether, and derives global motion estimates by utilizing information from three\nlevels of contour analysis: edgelets, boundary fragments and contours. Boundary\nfragment are chains of orientated edgelets, for which we derive motion estimates\nfrom local evidence. The uncertainties of the local estimates are disambiguated\nafter the boundary fragments are properly grouped into contours. The grouping\nis done by constructing a graphical model and marginalizing it using importance\nsampling. We propose two equivalent representations in this graphical model, re-\nversible switch variables attached to the ends of fragments and fragment chains,\nto capture both local and global statistics of boundaries. Our system is success-\nfully applied to both synthetic and real video sequences containing high-contrast\nboundaries and textureless regions. 
The system produces good motion estimates along with properly grouped and completed contours.\n\n1 Introduction\n\nHumans can reliably analyze visual motion under a diverse set of conditions, including textured as well as featureless objects. Computer vision algorithms have focused on conditions of texture, where junction or corner-like image structures are assumed to be reliable features for tracking [5, 4, 17]. But under other conditions, these features can generate spurious motions. T-junctions caused by occlusion can move in an image very differently than either of the objects involved in the occlusion event [11]. Properly analyzing the motions of featureless objects requires a different approach.\n\nThe spurious matching of T-junctions has been explained in [18] and [9]. We briefly restate it using the simple two-bar stimulus in Figure 1 (from [18]). The gray bar is moving rightward in front of the leftward-moving black bar, (a). If we analyze the motion locally, i.e. match to the next frame in a local circular window, the flow vectors of the corner and line points are as displayed in Figure 1 (b). The T-junctions located at the intersections of the two bars move downwards, but there is no such motion of the depicted objects.\n\nOne approach to handling the spurious motions of corners or T-junctions has been to detect such junctions and remove them from the motion analysis [18, 12]. However, T-junctions are often very difficult to detect in a static image from local, bottom-up information [9]. Motion at occluding boundaries has been studied, for example in [1]. The boundary motion is typically analyzed locally,\n\nFigure 1: Illustration of the spurious T-junction motion. (a) The front gray bar is moving to the right and the black bar behind is moving to the left [18]. 
(b) Based on local window matching, the eight corners of the bars show the correct motion, whereas the T-junctions show spurious downward motion. (c) Using the boundary-based representation, our system is able to correctly estimate the motion and generate the illusory boundary as well.\n\nwhich can again lead to spurious junction tracking. We are not aware of an existing algorithm that can properly analyze the motions of featureless objects.\n\nIn this paper, we use a boundary-based approach that does not rely on motion estimates at corners or junctions. We develop a graphical model which integrates local information and assigns probabilities to candidate contour groupings in order to favor motion interpretations corresponding to the motions of the underlying objects. Boundary completion and discounting of the motions of spurious features result from optimizing the graphical model states to explain the contours and their motions. Our system is able to automatically detect and group the boundary fragments, analyze the motion correctly, and exploit both static and dynamic cues to synthesize the illusory boundaries, as in Figure 1 (c).\n\nWe represent the boundaries at three levels of grouping: edgelets, boundary fragments and contours, where a fragment is a chain of edgelets and a contour is a chain of fragments. Each edgelet within a boundary fragment has a position and an orientation and carries local evidence for motion. The main task of our model is then to group the boundary fragments into contours so that the local motion uncertainties associated with the edgelets are disambiguated and occlusion or other spurious feature events are properly explained. 
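The three-level representation just described can be sketched as simple data structures. This is a hypothetical illustration only; the class and field names are ours, not taken from the authors' implementation:

```python
# Sketch of the paper's three-level boundary representation:
# edgelets -> boundary fragments (chains of edgelets) -> contours (chains of fragments).
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Edgelet:
    # One short oriented boundary element: position p_ik and orientation theta_ik.
    position: Tuple[float, float]
    orientation: float  # in [0, 2*pi)

@dataclass
class Fragment:
    # A boundary fragment is an ordered chain of edgelets e_i1 .. e_i,ni.
    edgelets: List[Edgelet] = field(default_factory=list)

@dataclass
class Contour:
    # A contour is an ordered chain of fragments; it may be open or closed.
    fragments: List[Fragment] = field(default_factory=list)
    closed: bool = False
```

Later sections attach motion evidence (a Gaussian flow distribution) to each edgelet and grouping variables to fragment ends; this skeleton only fixes the containment hierarchy.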
The result is a specialized motion tracking algorithm that properly analyzes the motions of textureless objects.\n\nOur system consists of four conceptual steps, discussed over the next three sections (the last two steps happen together while finding the optimal states in the graphical model):\n\n(a) Boundary fragment extraction: Boundary fragments are detected in the first frame.\n(b) Edgelet tracking with uncertainties: Boundary fragments are broken into edgelets, and, based on local evidence, the probability distribution is found for the motion of each edgelet of each boundary fragment.\n(c) Grouping boundary fragments into contours: Boundary fragments are grouped, using both temporal and spatial cues.\n(d) Motion estimation: The final fragment groupings disambiguate motion uncertainties and specify the final inferred motions.\n\nWe restrict the problem to two-frame motion analysis, though the algorithm can easily be extended to multiple frames.\n\n2 Boundary Fragment Extraction\n\nExtracting boundaries from images is a nontrivial task by itself. We use a simple algorithm for boundary extraction, analyzing oriented energy using steerable filters [3] and tracking the boundary in a manner similar to that of the Canny edge detector [2]. A more sophisticated boundary detector can be found in [8]; occluding boundaries can also be detected using special cameras [13]. However, for our motion algorithm designed to handle the special case of textureless objects, we find that our simple boundary detection algorithm works well.\n\nMathematically, given an image I, we seek to obtain a set of fragments B = {bi}, where each fragment bi is a chain of edgelets, bi = {eik}, k = 1, · · · , ni. 
Each edgelet eik = {pik, θik} is a particle which embeds both location pik ∈ R^2 and orientation θik ∈ [0, 2π).\n\nFigure 2: The local motion vector is estimated for each contour in isolation by selectively comparing orientation energies across frames. (a) A T-junction of the two-bar example showing the contour orientation for this motion analysis. (b) The other frame. (c) The relevant orientation energy along the boundary fragment, for the 2nd frame. A Gaussian pdf is fit to estimate flow, weighted by the oriented energy. (d) Visualization of the Gaussian pdf. The possible contour motions are unaffected by the occluding contour at a different orientation and no spurious motion is detected at this junction.\n\nWe use H4 and G4 steerable filters [3] to filter the image and obtain orientation energy per pixel. These filters are selected because they describe the orientation energies well even at corners. For each pixel we find the maximum-energy orientation and check whether it is a local maximum within a slice perpendicular to this orientation. If so, and if the maximum energy is above a threshold T1, we call this point a primary boundary point. We collect a pool of primary boundary points after running this test for all the pixels.\n\nWe find the primary boundary point with the maximum orientation energy in the pool and do bidirectional contour tracking, consisting of prediction and projection steps. In the prediction step, the current edgelet generates a new one by following its orientation with a certain step size. In the projection step, the orientation is locally maximized both in the orientation bands and within a small spatial window. The tracking is stopped if the energy is below a threshold T2 or if the turning angle is above a threshold. The primary boundary points that are close to the tracked trajectory are removed from the pool. 
This process is repeated until the pool is empty. The two thresholds T1 and T2 play the same roles as those in Canny edge detection [2]. While the boundary tracker should stop at sharp corners, it can turn around and continue tracking; we therefore run a postprocess to break the boundaries at points where the local maximum of curvature exceeds a curvature threshold.\n\n3 Edgelet Tracking with Uncertainties\n\nWe next break the boundary contours into very short edgelets and obtain the probabilities, based on the local motion of the boundary fragment, for the motion vector at each edgelet. We cannot use conventional algorithms, such as Lucas-Kanade [5], for local motion estimation since they rely on corners. The orientation θik for each edgelet was obtained during boundary fragment extraction. We obtain the motion vector by finding the spatial offsets of the edgelet which match the orientation energy along the boundary fragment in this orientation. We fit a Gaussian distribution N(μik, Σik) of the flow weighted by the orientation energy in the window. The mean and covariance matrix are added to the edgelet: eik = {pik, θik, μik, Σik}. This procedure is illustrated in Figure 2.\n\nGrouping the boundary fragments allows the motion uncertainties to be resolved. We next discuss the mathematical model of grouping as well as the computational approach.\n\n4 Boundary Fragment Grouping and Motion Estimation\n\n4.1 Two Equivalent Representations for Fragment Grouping\n\nThe essential part of our model is to find the connections between the boundary fragments. There are two possible representations for grouping. One representation is the connection of each end of the boundary fragment. We formulate the probability of this connection to model the local saliency of contours. The other, equivalent, representation is a chain of fragments that forms a contour, on which global statistics are formulated, e.g. structural saliency [16]. 
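The energy-weighted Gaussian fit of Sect. 3 can be sketched as follows. This is a simplified sketch, not the authors' code: we assume the candidate flow offsets and their orientation-energy weights have already been collected from the steerable-filter responses, and the function name is ours:

```python
import numpy as np

def fit_flow_gaussian(offsets, energies):
    """Fit N(mu, Sigma) to candidate flow offsets weighted by orientation energy.

    offsets:  (n, 2) candidate displacement vectors for one edgelet
    energies: (n,) orientation-energy weights (assumed nonnegative, not all zero)
    """
    w = np.asarray(energies, dtype=float)
    w = w / w.sum()                       # normalize weights to sum to 1
    v = np.asarray(offsets, dtype=float)
    mu = w @ v                            # energy-weighted mean flow
    d = v - mu
    sigma = (w[:, None] * d).T @ d        # energy-weighted covariance
    return mu, sigma
```

An edgelet on a straight contour yields a covariance that is elongated along the contour direction, which is exactly the aperture ambiguity the later grouping stage must resolve.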
Similar local and global modeling of contour saliency was proposed in [14]; in [7], both edge saliency and curvilinear continuity were used to extract closed contours from static images. In [15], contour ends are grouped using loopy belief propagation to interpret contours.\n\nThe connections between fragment ends are modeled by switch variables. For each boundary fragment bi, we use a binary variable ti ∈ {0, 1} to denote the two ends of the fragment, i.e. b_i^(0) = e_i1 and b_i^(1) = e_i,ni. Let switch variable S(i, ti) = (j, tj) denote the connection from b_i^(ti) to b_j^(tj).\n\nFigure 3: A simple example illustrating switch variables, reversibility and fragment chains. The colored arrows show the switch variables. The empty circle indicates end 0 and the filled circle indicates end 1. (a) shows three boundary fragments b1, b2, b3. Theoretically b_1^(0) can connect to any of the other ends, including itself, (b). However, the switch variable is exclusive, i.e. there is only one connection to b_1^(0), and reversible, i.e. if b_3^(0) connects to b_1^(0), then b_1^(0) should also connect to b_3^(0), as shown in (c). Figures (d) and (e) show two of the legal contour groupings for the boundary fragments: two open contours, and a closed loop contour.\n\nThis connection is exclusive, i.e. each end of the fragment should either connect to one end of another fragment, or simply have no connection. An exclusive switch is further called reversible, i.e.\n\nif S(i, ti) = (j, tj), then S(j, tj) = (i, ti), or in a more compact form S(S(i, ti)) = (i, ti). (1)\n\nWhen there is no connection to b_i^(ti), we simply set S(i, ti) = (i, ti). We use the binary function δ[S(i, ti) − (j, tj)] to indicate whether there is a connection between b_i^(ti) and b_j^(tj). The set of all the switches is denoted S = {S(i, ti) | i = 1 : N, ti = 0, 1}. We say S is reversible if every switch variable satisfies Eqn. (1). The reversibility of switch variables is shown in Figure 3 (b) and (c).\n\nFrom the values of the switch variables we can obtain contours, which are chains of boundary fragments. A fragment chain is defined as a series of end points c = {(b_i1^(x1), b_i1^(1−x1)), · · · , (b_im^(xm), b_im^(1−xm))}, where 1−x denotes the end opposite to x. The chain is specified by the fragment labels {i1, · · · , im} and end labels {x1, · · · , xm}. It can be either open or closed. The order of the chain is determined by the switch variables. Each end appears in the chain at most once. The notation of a chain is not unique: two open chains are identical if the fragment and end labels are reversed, and two closed chains are identical if one can be rotated to match the other. These identities are guaranteed by the reversibility of the switch variables. A set of chains C = {ci} can be uniquely extracted from the values of the reversible switch variables, as illustrated in Figure 3 (d) and (e).\n\n4.2 The Graphical Model\n\nGiven the observation O (the two images) and the boundary fragments B, we want to estimate the flow vectors V = {vi}, vi = {vik}, where each vik is associated with edgelet eik, together with the grouping variables S (switches) or, equivalently, C (fragment chains). Since the grouping variable S plays an essential role in the problem, we shall first infer S and then infer V based on S.\n\n4.2.1 The Graph for Boundary Fragment Grouping\n\nWe use two equivalent representations for boundary grouping, switch variables and chains. We use δ[S(S(i, ti)) − (i, ti)] for each end to enforce the reversibility. Suppose otherwise that S(i1, ti1) = S(i2, ti2) = (j, tj) for i1 ≠ i2. Let S(j, tj) = (i1, ti1) without loss of generality; then δ[S(S(i2, ti2)) − (i2, ti2)] = 0, which means that the switch variables are not reversible.\n\nWe use a function λ(S(i, ti); B, O) to measure the distribution of S(i, ti), i.e. how likely b_i^(ti) is to connect to the end of another fragment. Intuitively, two ends should be connected if\n\n• Motion similarity: the distributions of the motion of the two end edgelets are similar;\n• Curve smoothness: the illusory boundary connecting the two ends is smooth;\n• Contrast consistency: the brightness contrast at the two ends is consistent.\n\nWe write λ(·) as a product of three terms, one enforcing each criterion. We shall follow the example in Figure 4 to simplify the notation, where the task is to compute λ(S(1, 0) = (2, 0)).\n\nFigure 4: An illustration of local saliency computation. (a) Without loss of generality we assume the two ends to be b_1^(0) and b_2^(0). (b) The KL divergence between the distributions of flow vectors is used to measure the motion similarity. (c) An illusory boundary γ is generated by minimizing the energy of the curve. The sum of squared curvatures is used to measure the curve smoothness. (d) The means of the local patches located at the two ends are extracted, i.e. h11 and h12 from b_1^(0), and h21 and h22 from b_2^(0), to compute contrast consistency.\n\nThe first term is the KL divergence between the two Gaussian distributions of the flow vectors,\n\nexp{−αKL KL(N(μ11, Σ11), N(μ21, Σ21))}, (2)\n\nwhere αKL is a scaling factor. 
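The motion-similarity term of Eq. (2) has a closed form for Gaussians. A minimal sketch (not the authors' code; note that KL divergence is asymmetric, and we evaluate it in the argument order written in Eq. (2)):

```python
import numpy as np

def kl_gaussian(mu0, S0, mu1, S1):
    """KL divergence KL(N(mu0, S0) || N(mu1, S1)) between two 2-D Gaussians."""
    S1inv = np.linalg.inv(S1)
    d = np.asarray(mu1, dtype=float) - np.asarray(mu0, dtype=float)
    k = len(d)  # dimension (2 for flow vectors)
    return 0.5 * (np.trace(S1inv @ S0) + d @ S1inv @ d - k
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

def motion_similarity(mu0, S0, mu1, S1, alpha_kl=1.0):
    """First factor of lambda(.) as in Eq. (2); alpha_kl is the scaling factor."""
    return np.exp(-alpha_kl * kl_gaussian(mu0, S0, mu1, S1))
```

Identical end distributions give similarity 1, and the similarity decays exponentially as the two flow Gaussians diverge, which is the intended affinity behavior.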
The second term is the local saliency measure on the illusory boundary γ that connects the two ends. The illusory boundary is simply generated by minimizing the energy of the curve. The saliency is defined as\n\nexp{−αγ ∫_γ (dθ/ds)^2 ds}, (3)\n\nwhere θ(s) is the slope along the curve, dθ/ds is the local curvature [16], and αγ is a scaling factor. The third term is computed by extracting the means of local patches located at the two ends:\n\nexp{−dmin/(2σ_min^2) − dmax/(2σ_max^2)}, (4)\n\nwhere d1 = (h11 − h21)^2, d2 = (h12 − h22)^2, dmax = max(d1, d2), dmin = min(d1, d2), and σmax > σmin are the scale parameters. h11, h12, h21, h22 are the means of the pixel values of the four patches located at the two end points. For self-connection we simply set a constant value: λ(S(i, ti) = (i, ti)) = τ.\n\nWe use a function ψ(ci; B, O) to model the structural saliency of contours. It was discovered in [10] that convex occluding contours are more salient, and that additional T-junctions along the contour may increase or decrease the occlusion perception. Here we simply enforce that a contour should have no self-intersection: ψ(ci; B, O) = 1 if there is no self-intersection and ψ(ci; B, O) = 0 otherwise.\n\nThus, the (discrete) graphical model favoring the desired fragment grouping is\n\nPr(S; B, O) = (1/Z_S) ∏_{i=1}^{N} ∏_{ti=0}^{1} λ(S(i, ti); B, O) δ[S(S(i, ti)) − (i, ti)] · ∏_{j=1}^{M} ψ(cj; B, O), (5)\n\nwhere Z_S is a normalization constant. Note that this model measures both the switch variables S(i, ti), for local saliency, and the fragment chains cj, to enforce global structural saliency.\n\n4.2.2 Gaussian MRF on Flow Vectors\n\nGiven the fragment grouping, we model the flow vectors V as a Gaussian Markov random field (GMRF). The edgelet displacements within each boundary fragment should be smooth and match the observation along the fragment. The probability density is formulated as\n\nϕ(vi; bi) = ∏_{k=1}^{ni} exp{−(vik − μik)^T Σik^{−1} (vik − μik)} ∏_{k=1}^{ni−1} exp{−(1/(2σ^2)) ‖vik − vi,k+1‖^2}, (6)\n\nwhere μik and Σik are the motion parameters of each edgelet estimated in Sect. 3.\n\nWe use V(i, ti) to denote the flow vector of end ti of fragment bi, and define V(S(i, ti)) = V(j, tj) if S(i, ti) = (j, tj). Intuitively, the flow vectors of two ends should be similar if they are connected, or mathematically\n\nφ(V(i, ti), V(S(i, ti))) = 1 if S(i, ti) = (i, ti), and exp{−(1/(2σ^2)) ‖V(i, ti) − V(S(i, ti))‖^2} otherwise. (7)\n\nThe (continuous) graphical model of the flow vectors is therefore defined as\n\nPr(V|S; B) = (1/Z_V) ∏_{i=1}^{N} ϕ(vi; bi) ∏_{ti=0}^{1} φ(V(i, ti), V(S(i, ti))), (8)\n\nwhere Z_V is a normalization constant. When S is given, this is a GMRF which can be solved by least squares.\n\n4.3 Inference\n\nHaving defined the graphical model to favor the desired motion and grouping interpretations, we need to find the state parameters that best explain the image observations. The natural decomposition of S and V in our graphical model,\n\nPr(V, S; B, O) = Pr(S; B, O) · Pr(V|S; B, O), (9)\n\n(where Pr(S; B, O) and Pr(V|S; B, O) are defined in Eqn. 
(5) and (8) respectively) lends itself to two-step inference. We first infer the boundary grouping S, and then infer V based on S. The second step simply solves a least-squares problem, since Pr(V|S; B, O) is a GMRF. This approach does not globally optimize Eqn. (9), but it results in a reasonable solution because V depends strongly on S. The density function Pr(S; B, O) is not a random field, so we use importance sampling [6] to obtain the marginal distributions Pr(S(i, ti); B, O). The proposal density of each switch variable is set to be\n\nq(S(i, ti) = (j, tj)) ∝ (1/Z_q) λ(S(i, ti) = (j, tj)) λ(S(j, tj) = (i, ti)), (10)\n\nwhere λ(·) has been normalized to sum to 1 for each end. We found that this bidirectional measure is crucial for taking valid samples. To sample the proposal density, we first randomly select a boundary fragment and connect it to other fragments based on q(S(i, ti)) to form a contour (a chain of boundary fragments). Each end is sampled only once, to ensure reversibility. This procedure is repeated until no fragment is left. In the importance step we run the binary function ψ(ci) to check that each contour has no self-intersection; if ψ(ci) = 0 then the sample is rejected. The marginal distributions are estimated from the samples. Lastly, the optimal grouping is obtained by replacing random sampling with selecting the maximum-probability connection over the estimated marginal distributions. The number of samples needed depends on the number of fragments; in practice we find that n^2 samples are sufficient for n fragments.\n\n5 Experimental Results\n\nFigure 6 shows the boundary extraction, grouping, and motion estimation results of our system for both real and synthetic examples1. 
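For reference, the proposal sampling of Sect. 4.3 can be sketched as follows. This is a deliberately simplified sketch, not the authors' implementation: the proposal table q is assumed to be precomputed from the normalized bidirectional weights of Eq. (10), each end's candidate dict is assumed to include the end itself for the self-connection case, and the self-intersection check ψ is omitted:

```python
import numpy as np

def sample_grouping(q, rng):
    """Draw one joint assignment of switch variables from the proposal (sketch).

    q maps each fragment end (i, ti) to a dict of candidate ends and weights.
    Each end is sampled at most once, so the resulting switch assignment is
    reversible by construction: S[S[e]] == e for every end e.
    """
    free = set(q.keys())  # ends not yet assigned
    S = {}
    while free:
        end = next(iter(free))
        cands = [c for c in q[end] if c in free]           # unused ends only
        w = np.array([q[end][c] for c in cands], dtype=float)
        pick = cands[rng.choice(len(cands), p=w / w.sum())]
        S[end], S[pick] = pick, end                        # enforce reversibility
        free.discard(end)
        free.discard(pick)
    return S
```

In the full algorithm each drawn grouping would then be screened with ψ (rejecting self-intersecting contours) and the accepted samples used to estimate the marginals Pr(S(i, ti); B, O).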
All the results are generated using the same parameter settings. The algorithm is implemented in MATLAB, and the running time varies from ten seconds to a few minutes, depending on the number of boundary fragments found in the image.\n\nThe two-bar example in Figure 1(a) yields fourteen detected boundary fragments in Figure 6(a) and two contours in (b). The estimated motion matches the ground truth at the T-junctions. The fragments belonging to the same contour are plotted in the same color and the illusory boundaries are synthesized as shown in (c). The boundaries are warped according to the estimated flow and displayed in (d). The hallucinated illusory boundaries in frames 1 (c) and 2 (d) are plausible amodal completions.\n\nThe second example is the Kanizsa square, where the frontal white square moves to the bottom right. Twelve fragments are detected in (a) and five contours are grouped in (b). The estimated motion and generated illusory boundary also match the ground truth and human perception. Notice that the arcs would tend to connect to one another if we did not impose the structural saliency ψ(·).\n\nWe apply our system to a video of a dancer (Figure 5 (a) and (b)). In this stimulus the right leg moves downwards, but there is only a weak occluding boundary at the intersection of the legs. Eleven\n\n1The results can be viewed online at http://people.csail.mit.edu/celiu/contourmotions/\n\n(a) Dancer frame 1 (b) Dancer frame 2 (c) Chair frame 1 (d) Chair frame 2\n\nFigure 5: Input images for the non-synthetic examples of Figure 6. The dancer's right leg is moving downwards and the chair is rotating (note the changing space between the chair's arms).\n\nboundary fragments are extracted in (a) and five contours are extracted in (b). The estimated motion (b) matches the ground truth. 
The hallucinated illusory boundaries in (c) and (d) correctly connect the occluded boundary of the right leg and the invisible boundary of the left leg.\n\nThe final row shows challenging images of a rotating chair (Figure 5 (c) and (d)), also showing proper contour completion and motion analysis. Thirty-seven boundary fragments are extracted and seven contours are grouped. To complete the occluded contours of this image would be nearly impossible working only from a static image. Exploiting motion as well as static information, our system is able to complete the contours properly.\n\nNote that traditional motion analysis algorithms fail at estimating motion for these examples (see supplementary videos) and would thus also fail at correctly grouping the objects based on motion cues.\n\n6 Conclusion\n\nWe propose a novel boundary-based representation to estimate motion under the challenging visual conditions of moving textureless objects. Ambiguous local motion measurements are resolved through a graphical model relating edgelets, boundary fragments, completed contours, and their motions. Contours are grouped and their motions analyzed simultaneously, leading to the correct handling of otherwise spurious occlusion and T-junction features. The motion cues help the contour completion task, allowing completion of contours that would be difficult or impossible using only low-level information in a static image. A motion analysis algorithm such as this one that correctly handles featureless contour motions is an essential element in a visual system's toolbox of motion analysis methods.\n\nReferences\n\n[1] M. J. Black and D. J. Fleet. Probabilistic detection and tracking of motion boundaries. International Journal of Computer Vision, 38(3):231–245, 2000.\n[2] J. Canny. A computational approach to edge detection. IEEE Trans. Pat. Anal. Mach. Intel., 8(6):679–698, Nov 1986.\n[3] W. T. Freeman and E. H. Adelson. 
The design and use of steerable filters. IEEE Trans. Pat. Anal. Mach. Intel., 13(9):891–906, Sep 1991.\n[4] B. K. P. Horn and B. G. Schunck. Determining optical flow. Artificial Intelligence, 17:185–203, 1981.\n[5] B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 674–679, 1981.\n[6] D. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003.\n[7] S. Mahamud, L. Williams, K. Thornber, and K. Xu. Segmentation of multiple salient closed contours from real images. IEEE Trans. Pat. Anal. Mach. Intel., 25(4):433–444, 2003.\n[8] D. Martin, C. Fowlkes, and J. Malik. Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Trans. Pat. Anal. Mach. Intel., 26(5):530–549, May 2004.\n[9] J. McDermott. Psychophysics with junctions in real images. Perception, 33:1101–1127, 2004.\n[10] J. McDermott and E. H. Adelson. The geometry of the occluding contour and its effect on motion interpretation. Journal of Vision, 4(10):944–954, 2004.\n[11] J. McDermott and E. H. Adelson. Junctions and cost functions in motion interpretation. Journal of Vision, 4(7):552–563, 2004.\n\n(a) Extracted boundaries (b) Estimated flow (c) Frame 1 (d) Frame 2\n\nFigure 6: Experimental results for some synthetic and real examples. The same parameter settings were used for all examples. Column (a): Boundary fragments are extracted using our boundary tracker. The red dots are the edgelets and the green ones are the boundary fragment ends. Column (b): Boundary fragments are grouped into contours and the flow vectors are estimated. Each contour is shown in its own color. Columns (c) and (d): the illusory boundaries are generated for the first and second frames. 
The gaps between the fragments belonging to the same contour are linked by exploiting both static and motion cues in Eq. (5).\n\n[12] S. J. Nowlan and T. J. Sejnowski. A selection model for motion processing in area MT of primates. The Journal of Neuroscience, 15(2):1195–1214, 1995.\n[13] R. Raskar, K.-H. Tan, R. Feris, J. Yu, and M. Turk. Non-photorealistic camera: depth edge detection and stylized rendering using multi-flash imaging. ACM Trans. Graph. (SIGGRAPH), 23(3):679–688, 2004.\n[14] X. Ren, C. Fowlkes, and J. Malik. Scale-invariant contour completion using conditional random fields. In Proceedings of International Conference on Computer Vision, pages 1214–1221, 2005.\n[15] E. Saund. Logic and MRF circuitry for labeling occluding and thinline visual contours. In Advances in Neural Information Processing Systems 18, pages 1153–1160, 2006.\n[16] A. Sha'ashua and S. Ullman. Structural saliency: the detection of globally salient structures using a locally connected network. In Proceedings of International Conference on Computer Vision, pages 321–327, 1988.\n[17] J. Shi and C. Tomasi. Good features to track. In IEEE Conference on Computer Vision and Pattern Recognition, pages 593–600, 1994.\n[18] Y. Weiss and E. H. Adelson. Perceptually organized EM: A framework for motion segmentation that combines information about form and motion. Technical Report 315, M.I.T. Media Lab, 1995.\n", "award": [], "sourceid": 3040, "authors": [{"given_name": "Ce", "family_name": "Liu", "institution": null}, {"given_name": "William", "family_name": "Freeman", "institution": null}, {"given_name": "Edward", "family_name": "Adelson", "institution": null}]}