{"title": "Online Clustering of Moving Hyperplanes", "book": "Advances in Neural Information Processing Systems", "page_first": 1433, "page_last": 1440, "abstract": null, "full_text": "Online Clustering of Moving Hyperplanes\n\n Rene Vidal Center for Imaging Science, Department of Biomedical Engineering, Johns Hopkins University 308B Clark Hall, 3400 N. Charles St., Baltimore, MD 21218, USA rvidal@cis.jhu.edu\n\nAbstract\nWe propose a recursive algorithm for clustering trajectories lying in multiple moving hyperplanes. Starting from a given or random initial condition, we use normalized gradient descent to update the coefficients of a time varying polynomial whose degree is the number of hyperplanes and whose derivatives at a trajectory give an estimate of the vector normal to the hyperplane containing that trajectory. As time proceeds, the estimates of the hyperplane normals are shown to track their true values in a stable fashion. The segmentation of the trajectories is then obtained by clustering their associated normal vectors. The final result is a simple recursive algorithm for segmenting a variable number of moving hyperplanes. We test our algorithm on the segmentation of dynamic scenes containing rigid motions and dynamic textures, e.g., a bird floating on water. Our method not only segments the bird motion from the surrounding water motion, but also determines patterns of motion in the scene (e.g., periodic motion) directly from the temporal evolution of the estimated polynomial coefficients. Our experiments also show that our method can deal with appearing and disappearing motions in the scene.\n\n1 Introduction\nPrincipal Component Analysis (PCA) [1] refers to the problem of fitting a linear subspace S RD of unknown dimension d < D to N sample points X = {xi S }N 1 . 
A natural extension i= of PCA is subspace clustering, which refers to the problem of fitting a union of n 1 linear n subspaces {Sj RD }j =1 of unknown dimensions dj = dim(Sj ), 0 < dj < D, to N points DN X = {xi R }i=1 drawn from n=1 Sj , without knowing which points belong to which subspace. j This problem shows up in a variety of applications in computer vision (image compression, motion segmentation, dynamic texture segmentation) and also in control (hybrid system identification). Subspace clustering has been an active topic of research over the past few years. Existing methods randomly choose a basis for each subspace, and then iterate between data segmentation and standard PCA. This can be done using methods such as Ksubspaces [2], an extension of Kmeans to the case of subspaces, or Expectation Maximization for Mixtures of Probabilistic PCAs [3]. An alternative algebraic approach, which does not require any initialization, is Generalized PCA (GPCA) [4]. In GPCA the data points are first projected onto a low-dimensional subspace. Then, a set of polynomials is fitted to the projected data points and a basis for each one of the projected subspaces is obtained from the derivatives of these polynomials at the data points. Unfortunately, all existing subspace clustering methods are batch, i.e. the subspace bases and the segmentation of the data are obtained after all the data points have been collected. In addition, existing methods are designed for clustering data lying in a collection of static subspaces, i.e. the subspace bases do not change as a function of time. Therefore, when these methods are applied to time-series data, e.g., dynamic texture segmentation, one typically applies them to a moving time window, under the assumption that the subspaces are static within that window. 
A major disadvantage of this approach is that it does not incorporate temporal coherence, because the segmentation and the bases at time t + 1 are obtained independently from those at time t. Also, this approach is computationally expensive, since a new subspace clustering problem is solved at each time instant.

In this paper, we propose a computationally simple and temporally coherent online algorithm for clustering point trajectories lying in a variable number of moving hyperplanes. We model a union of n moving hyperplanes in R^D, S_j(t) = {x ∈ R^D : b_j^T(t) x = 0}, j = 1, ..., n, where b_j(t) ∈ R^D, as the zero set of a polynomial with time varying coefficients. Starting from an initial polynomial at time t, we compute an update of the polynomial coefficients using normalized gradient descent. The hyperplane normals are then estimated from the derivatives of the new polynomial at each trajectory. The segmentation of the trajectories is obtained by clustering their associated normal vectors. As time proceeds, new data are added and the estimates of the polynomial coefficients become more accurate, because they are based on more observations. This not only makes the segmentation of the data more accurate, but also allows us to handle a variable number of hyperplanes. We test our approach on the challenging problem of segmenting dynamic textures from rigid motions in video.

2 Recursive estimation of a single hyperplane

In this section, we review the normalized gradient algorithm for estimating a single hyperplane. We consider both static and moving hyperplanes, and analyze the stability of the algorithm in each case.

Recursive linear regression. For the sake of simplicity, let us first revisit a simple linear regression problem in which we are given measurements {x(t), y(t)} related by the equation y(t) = b^T x(t). At time t, we seek an estimate b̂(t) of b that minimizes f(b) = Σ_{τ=1}^{t} (y(τ) - b^T x(τ))^2.
A simple strategy is to recursively update b̂(t) by following the negative of the gradient direction at time t,

v(t) = -(b̂^T(t) x(t) - y(t)) x(t).   (1)

However, it is better to normalize this gradient in order to achieve better convergence properties. As shown in Theorem 2.8, page 77 of [5], the following normalized gradient recursive identifier

b̂(t+1) = b̂(t) - μ (b̂^T(t) x(t) - y(t)) x(t) / (1 + ||x(t)||^2),   (2)

where μ > 0 is a fixed parameter, is such that b̂(t) → b exponentially if the regressors {x(t)} are persistently exciting, i.e. if there is an S ∈ N and α_1, α_2 > 0 such that for all m

α_1 I_D ≤ Σ_{t=m}^{m+S} x(t) x^T(t) ≤ α_2 I_D,   (3)

where A ≤ B means that (B - A) is positive definite and I_D is the identity matrix in R^{D×D}. Intuitively, the condition on the left hand side of (3) means that the data has to be persistently \"rich enough\" in time in order to uniquely estimate the vector b, while the condition on the right hand side is needed for stability purposes, as it imposes a uniform upper bound on the covariance of the data.

Consider now a modification of the linear regression problem in which the parameter vector varies with time, i.e. y(t) = b^T(t) x(t). As shown in [6], if the regressors {x(t)} are persistently exciting and the sequence {b(t+1) - b(t)} is L2-stable, i.e. sup_{t≥1} ||b(t+1) - b(t)||^2 < ∞, then the normalized gradient recursive identifier (2) produces an estimate b̂(t) of b(t) such that {b̂(t) - b(t)} is L2-stable.

Recursive hyperplane estimation. Let {x(t)} be a set of measurements lying in the moving hyperplane S(t) = {x ∈ R^D : b^T(t) x = 0}. At time t, we seek an estimate b̂(t) of b(t) that minimizes the error f(b(t)) = Σ_{τ=1}^{t} (b^T(τ) x(τ))^2 subject to the constraint ||b(t)|| = 1. Notice that the main difference between linear regression and hyperplane estimation is that in the latter case the parameter vector b(t) is constrained to lie in the unit sphere S^{D-1}.
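As an illustration, the identifier (2) is a one-line update. The following sketch (with illustrative data and μ = 1, not taken from the paper) tracks a static parameter vector from noiseless measurements y(t) = b^T x(t):

```python
import numpy as np

def ng_step(b_hat, x, y, mu=1.0):
    """One step of the normalized gradient identifier (eq. 2):
    b(t+1) = b(t) - mu * (b(t)'x(t) - y(t)) * x(t) / (1 + ||x(t)||^2)."""
    return b_hat - mu * (b_hat @ x - y) * x / (1.0 + x @ x)

# Track a static parameter vector b from noiseless measurements y = b'x.
rng = np.random.default_rng(0)
b_true = np.array([1.0, -2.0, 0.5])
b_hat = np.zeros(3)
for _ in range(500):
    x = rng.standard_normal(3)  # i.i.d. regressors are persistently exciting
    b_hat = ng_step(b_hat, x, b_true @ x)
err = np.linalg.norm(b_hat - b_true)  # decays exponentially with t
```

With i.i.d. Gaussian regressors the persistence of excitation condition (3) holds with high probability over any sufficiently long window, and the estimation error decays exponentially as the theorem predicts.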
Therefore, instead of applying standard gradient descent as in (2), we must follow the negative gradient direction along the geodesic curve in S^{D-1} passing through b̂(t). As shown in [7], the geodesic curve passing through b ∈ S^{D-1} along the tangent vector v ∈ T_b S^{D-1} is b cos(||v||) + (v/||v||) sin(||v||). Therefore, the update equation for the normalized gradient recursive identifier on the sphere is

b̂(t+1) = b̂(t) cos(||v(t)||) + (v(t)/||v(t)||) sin(||v(t)||),   (4)

where the negative normalized gradient is computed as

v(t) = -μ (I_D - b̂(t) b̂^T(t)) (b̂^T(t) x(t)) x(t) / (1 + ||x(t)||^2).   (5)

Notice that the gradient on the sphere is essentially the same as the Euclidean gradient, except that it needs to be projected onto the subspace orthogonal to b̂(t) by the rank-(D-1) matrix I_D - b̂(t) b̂^T(t) ∈ R^{D×D}. Another difference between recursive linear regression and recursive hyperplane estimation is that the persistence of excitation condition (3) needs to be modified to

α_1 I_{D-1} ≤ Σ_{t=m}^{m+S} P_{b(t)} x(t) x^T(t) P_{b(t)}^T ≤ α_2 I_{D-1},   (6)

where the projection matrix P_{b(t)} ∈ R^{(D-1)×D} onto the orthogonal complement of b(t) accounts for the fact that ||b(t)|| = 1. Under the persistence of excitation condition (6), if b(t) = b the identifier (4) is such that b̂(t) → b exponentially, while if {b(t+1) - b(t)} is L2-stable, so is {b̂(t) - b(t)}.

3 Recursive segmentation of a known number of moving hyperplanes

In this section, we generalize the recursive identifier (4) and its stability properties to the case of N trajectories {x_i(t)}_{i=1}^N lying in n hyperplanes {S_j(t)}_{j=1}^n. In principle, we could apply the identifier (4) to each one of the hyperplanes. However, as we do not know the segmentation of the data, we do not know which data to use to update each one of the n identifiers. In our approach, the n hyperplanes are represented with a single polynomial whose coefficients do not depend on the segmentation of the data.
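One step of the single-hyperplane identifier (4)-(5), which the rest of this section generalizes, can be sketched as follows (a sketch with assumed toy data; the plane normal b_true, the step count, and μ = 1 are illustrative):

```python
import numpy as np

def sphere_step(b_hat, x, mu=1.0):
    """Geodesic normalized-gradient step on the unit sphere (eqs. 4-5)."""
    D = b_hat.size
    # Euclidean normalized gradient of (b'x)^2 / 2, projected onto the
    # tangent space at b_hat by I - b_hat b_hat' (eq. 5).
    v = -mu * (np.eye(D) - np.outer(b_hat, b_hat)) @ ((b_hat @ x) * x) / (1.0 + x @ x)
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return b_hat
    # Move along the geodesic through b_hat in direction v (eq. 4).
    return b_hat * np.cos(nv) + (v / nv) * np.sin(nv)

# Track the normal of a static hyperplane b'x = 0 from points on the plane.
rng = np.random.default_rng(1)
b_true = np.array([0.6, 0.0, 0.8])
b_hat = np.array([1.0, 0.0, 0.0])
for _ in range(1000):
    z = rng.standard_normal(3)
    x = z - (b_true @ z) * b_true  # a sample lying in the plane b_true'x = 0
    b_hat = sphere_step(b_hat, x)
alignment = abs(b_hat @ b_true)    # the normal is defined up to sign
```

Because the step v is tangent to the sphere at b̂(t), the geodesic update preserves ||b̂(t)|| = 1 exactly, which is the point of moving along geodesics rather than taking a Euclidean step and renormalizing.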
By updating the coefficients of this polynomial, we can simultaneously estimate all the hyperplanes, without first clustering the point trajectories.

Representing moving hyperplanes with a time varying polynomial. Let x(t) be an arbitrary point in one of the n hyperplanes. Then there is a vector b_j(t) normal to S_j(t) such that b_j^T(t) x(t) = 0. Thus, the following homogeneous polynomial of degree n in D variables must vanish at x(t):

p_n(x(t), t) = (b_1^T(t) x(t)) (b_2^T(t) x(t)) ⋯ (b_n^T(t) x(t)) = 0.   (7)

This homogeneous polynomial can be written as a linear combination of all the monomials of degree n in x, x^I = x_1^{n_1} x_2^{n_2} ⋯ x_D^{n_D} with 0 ≤ n_k ≤ n for k = 1, ..., D, and n_1 + n_2 + ⋯ + n_D = n, as

p_n(x, t) = Σ_I c_{n_1,...,n_D}(t) x_1^{n_1} ⋯ x_D^{n_D} = c^T(t) ν_n(x) = 0,   (8)

where c_I(t) ∈ R represents the coefficient of the monomial x^I. The map ν_n : R^D → R^{M_n(D)} is known as the Veronese map of degree n, which is defined as [8]:

ν_n : [x_1, ..., x_D]^T ↦ [..., x^I, ...]^T,   (9)

where I is chosen in the degree-lexicographic order and M_n(D) = C(n+D-1, n) is the total number of independent monomials. Notice that since the normal vectors {b_j(t)} are time dependent, the vector of coefficients c(t) is also time dependent. Since both the normal vectors and the coefficient vector are defined up to scale, we will assume that ||b_j(t)|| = ||c(t)|| = 1, without loss of generality.

Recursive identification of the polynomial coefficients. Thanks to the polynomial equation (8), we now propose a new online hyperplane clustering algorithm that operates on the polynomial coefficients c(t), rather than on the normal vectors {b_j(t)}_{j=1}^n. The advantage of doing so is that c(t) does not depend on which hyperplane the measurement x(t) belongs to. Our method operates as follows. At each time t, we seek to find an estimate ĉ(t) of c(t) that minimizes

f(c(t)) = (1/N) Σ_{τ=1}^{t} Σ_{i=1}^{N} (c^T(τ) ν_n(x_i(τ)))^2.   (10)

By using normalized gradient descent on S^{M_n(D)-1}, we obtain the following recursive identifier

ĉ(t+1) = ĉ(t) cos(||v(t)||) + (v(t)/||v(t)||) sin(||v(t)||),   (11)

where the negative normalized gradient is computed as

v(t) = -μ (I_{M_n(D)} - ĉ(t) ĉ^T(t)) [ (1/N) Σ_{i=1}^{N} (ĉ^T(t) ν_n(x_i(t))) ν_n(x_i(t)) ] / (1 + (1/N) Σ_{i=1}^{N} ||ν_n(x_i(t))||^2).   (12)

Notice that (11) reduces to (4) and (12) reduces to (5) if n = 1 and N = 1.

Recursive identification of the hyperplane normals. Given an estimate of c(t), we may obtain an estimate of the vector normal to the hyperplane containing a trajectory x(t) from the derivative of the polynomial p̂_n(x, t) = ĉ^T(t) ν_n(x) at x(t) as

b̂(x(t)) = Dν_n^T(x(t)) ĉ(t) / ||Dν_n^T(x(t)) ĉ(t)||,   (13)

where Dν_n(x) is the Jacobian of ν_n at x. We choose the derivative of p̂_n to estimate the normal vector b_j(t), because if x(t) is a trajectory in the j-th hyperplane, then b_j^T(t) x(t) = 0, hence the derivative of the true polynomial p_n at the trajectory gives

Dp_n(x(t), t) = ∂p_n(x(t), t)/∂x = Σ_{k=1}^{n} [ Π_{ℓ≠k} (b_ℓ^T(t) x(t)) ] b_k(t) ∼ b_j(t),   (14)

because only the k = j term in the sum is nonzero at such a trajectory.

Stability of the recursive identifier. Since in practice we do not know the true polynomial coefficients c(t), and we estimate b̂(t) from ĉ(t), we need to show that both ĉ(t) and b̂(x(t)) track their true values in a stable fashion. Theorem 1 shows that this is the case. Notice that the persistence of excitation condition for multiple hyperplanes (15) is essentially the same as the one for a single hyperplane (6), but properly modified to take into account that the regressors are a set of trajectories in the embedded space {ν_n(x_i(t))}_{i=1}^N, rather than a single trajectory in the original space {x(t)}.

Theorem 1 Let P_{c(t)} ∈ R^{(M_n(D)-1)×M_n(D)} be a projection matrix onto the orthogonal complement of c(t). Consider the recursive identifier (11)-(13) and assume that the embedded regressors {ν_n(x_i(t))}_{i=1}^N are persistently exciting, i.e.
there exist α_1, α_2 > 0 and S ∈ N such that for all m

α_1 I_{M_n(D)-1} ≤ Σ_{t=m}^{m+S} Σ_{i=1}^{N} P_{c(t)} ν_n(x_i(t)) ν_n^T(x_i(t)) P_{c(t)}^T ≤ α_2 I_{M_n(D)-1}.   (15)

Then the sequence {ĉ(t) - c(t)} is L2-stable. Furthermore, if a trajectory x(t) belongs to the j-th hyperplane, then the corresponding b̂(x(t)) in (13) is such that {b_j(t) - b̂(x(t))} is L2-stable. If in addition the hyperplanes are static, then ĉ(t) - c(t) → 0 and b_j(t) - b̂(x(t)) → 0 exponentially.

Proof. [Sketch only] When the hyperplanes are static, the exponential convergence of ĉ(t) to c follows with minor modifications from Theorem 2.8, page 77 of [5]. This implies that there exist γ, λ > 0 such that ||ĉ(t) - c|| < γ λ^{-t}. Also, since the vectors b_1, ..., b_n are different, the polynomial c^T ν_n(x) has no repeated factor. Therefore, there is a δ > 0 and a T > 0 such that for all t > T we have ||Dν_n^T(x(t)) c|| ≥ δ and ||Dν_n^T(x(t)) ĉ(t)|| ≥ δ (see proof of Theorem 3 in [9] for the latter claim). Combining this with ||ĉ|| ≤ ||c|| + ||ĉ - c|| and ||c|| = 1, we obtain that when x(t) ∈ S_j,

||b_j - b̂(x(t))|| = || Dν_n^T(x(t)) c / ||Dν_n^T(x(t)) c|| - Dν_n^T(x(t)) ĉ(t) / ||Dν_n^T(x(t)) ĉ(t)|| ||
 = || Dν_n^T(x(t)) c ||Dν_n^T(x(t)) ĉ(t)|| - Dν_n^T(x(t)) ĉ(t) ||Dν_n^T(x(t)) c|| || / ( ||Dν_n^T(x(t)) ĉ(t)|| ||Dν_n^T(x(t)) c|| )
 ≤ 2 ||Dν_n(x(t))|| ||ĉ(t) - c|| / δ ≤ (2 E_n ρ_n / δ) γ λ^{-t},

showing that b̂(x(t)) → b_j exponentially. In the last step we used the fact that for all x ∈ R^D and each k there is a constant matrix of exponents E_{kn} ∈ R^{M_n(D)×M_{n-1}(D)} such that ∂ν_n(x)/∂x_k = E_{kn} ν_{n-1}(x). Therefore ||Dν_n(x)|| ≤ E_n ||ν_{n-1}(x)|| with E_n = max_k ||E_{kn}||, and ||ν_{n-1}(x(t))|| ≤ ρ_n for some ρ_n > 0, because the embedded regressors are uniformly bounded by the right hand side of (15). Consider now the case in which the hyperplanes are moving. Since S^{D-1} is compact, the sequences {b_j(t+1) - b_j(t)}_{j=1}^n are trivially L2-stable, hence so is the sequence {c(t+1) - c(t)}. The L2-stability of {ĉ(t) - c(t)} and {b_j(t) - b̂(t)} follows.

Segmentation of the point trajectories.
Theorem 1 provides us with a method for computing an estimate b̂(x_i(t)) of the normal to the hyperplane passing through each one of the N trajectories {x_i(t) ∈ R^D}_{i=1}^N at each time instant. The next step is to cluster these normals into n groups, thereby segmenting the N trajectories. We do so by using a recursive version of the K-means algorithm, adapted to vectors on the unit sphere. Essentially, at each t, we seek the normal vectors b̂_j(t) ∈ S^{D-1} and the memberships w_ij(t) ∈ {0, 1} of trajectory i to hyperplane j that maximize

f({w_ij(t)}, {b̂_j(t)}) = Σ_{i=1}^{N} Σ_{j=1}^{n} w_ij(t) (b̂_j^T(t) b̂(x_i(t)))^2.   (16)

The main difference with K-means is that we maximize the dot product of each data point with the cluster center, rather than minimizing the distance. Therefore, the cluster center is given by the principal component of each group, rather than the mean. In order to obtain temporally coherent estimates of the normal vectors, we use the estimates at time t to initialize the iterations at time t + 1.

Algorithm 1 (Recursive hyperplane segmentation)

Initialization step
1: Randomly choose {b̂_j(1)}_{j=1}^n and ĉ(1), or else apply the GPCA algorithm to {x_i(1)}_{i=1}^N.

For each t ≥ 1
1: Update the coefficients of the polynomial p̂_n(x(t), t) = ĉ^T(t) ν_n(x(t)) using the recursive procedure
   ĉ(t+1) = ĉ(t) cos(||v(t)||) + (v(t)/||v(t)||) sin(||v(t)||),
   v(t) = -μ (I_{M_n(D)} - ĉ(t) ĉ^T(t)) [ (1/N) Σ_{i=1}^{N} (ĉ^T(t) ν_n(x_i(t))) ν_n(x_i(t)) ] / (1 + (1/N) Σ_{i=1}^{N} ||ν_n(x_i(t))||^2).
2: Solve for the normal vectors from the derivatives of p̂_n at the given trajectories
   b̂(x_i(t)) = Dν_n^T(x_i(t)) ĉ(t) / ||Dν_n^T(x_i(t)) ĉ(t)||,   i = 1, ..., N.
3: Segment the normal vectors using the K-means algorithm on the sphere:
   (a) Set w_ij(t) = 1 if j = arg max_{k=1,...,n} (b̂_k^T(t) b̂(x_i(t)))^2, and w_ij(t) = 0 otherwise, for i = 1, ..., N and j = 1, ..., n.
   (b) Set b̂_j(t) = PCA([ w_1j(t) b̂(x_1(t)), w_2j(t) b̂(x_2(t)), ..., w_Nj(t) b̂(x_N(t)) ]), j = 1, ..., n.
   (c) Iterate (a) and (b) until convergence of w_ij(t), and then set b̂_j(t+1) = b̂_j(t).

4 Recursive segmentation of a variable number of moving hyperplanes

In the previous section, we proposed a recursive algorithm for segmenting n moving hyperplanes under the assumption that n is known and constant in time. However, in many practical situations the number of hyperplanes may be unknown and time varying. For example, the number of moving objects in a video sequence may change due to objects entering or leaving the camera field of view. In this section, we consider the problem of segmenting a variable number of moving hyperplanes. We denote by n(t) ∈ N the number of hyperplanes at time t and assume we are given an upper bound n ≥ n(t). We show that if we apply Algorithm 1 with the number of hyperplanes set to n, then we can still recover the correct segmentation of the scene, even if n(t) < n.

To see this, let us have a close look at the persistence of excitation condition in equation (15) of Theorem 1. Since the condition on the right hand side of (15) holds trivially when the regressors x_i(t) are bounded, the only important condition is the one on the left hand side. Notice that the condition on the left hand side implies that the spatial-temporal covariance matrix of the embedded regressors must be of rank M_n(D) - 1 in any time window of size S for some integer S. Loosely speaking, the embedded regressors must be \"rich enough\" either in space or in time. The case in which n(t) = n and there is an α_1 > 0 such that for all t

Σ_{i=1}^{N} P_{c(t)} ν_n(x_i(t)) ν_n^T(x_i(t)) P_{c(t)}^T ≥ α_1 I_{M_n(D)-1}   (17)

corresponds to the case of data that is rich in space. In this case, at each time instant we draw data from all n hyperplanes and the data is rich enough to estimate all n hyperplanes at each time instant. In fact, condition (17) is the one required by GPCA [4], which in this case can be applied at each time t independently.
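The two core computations of Algorithm 1, the Veronese embedding ν_n (eq. 9) and the normal estimate from the polynomial derivative (eq. 13), can be sketched as follows (a minimal sketch; the monomial ordering produced by itertools is used in place of the degree-lexicographic order, which is equally valid as long as it is applied consistently):

```python
import numpy as np
from itertools import combinations_with_replacement

def monomial_exponents(n, D):
    """Exponent vectors of all M_n(D) = C(n+D-1, n) monomials of degree n."""
    exps = []
    for combo in combinations_with_replacement(range(D), n):
        e = [0] * D
        for k in combo:
            e[k] += 1
        exps.append(e)
    return np.array(exps)

def veronese(x, exps):
    """Veronese embedding nu_n(x) = [..., x^I, ...] (eq. 9)."""
    return np.prod(np.power(x, exps), axis=1)

def veronese_jacobian(x, exps):
    """Jacobian D nu_n(x), an M_n(D) x D matrix: d(x^I)/dx_k = n_k x^{I - e_k}."""
    M, D = exps.shape
    J = np.zeros((M, D))
    for k in range(D):
        ek = np.clip(exps - np.eye(D, dtype=int)[k], 0, None)
        rows = exps[:, k] > 0
        J[rows, k] = exps[rows, k] * np.prod(np.power(x, ek), axis=1)[rows]
    return J

def normal_estimate(x, c_hat, exps):
    """Normal to the hyperplane containing x, from eq. (13)."""
    g = veronese_jacobian(x, exps).T @ c_hat
    return g / np.linalg.norm(g)

# Sanity check with p(x) = (b1'x)(b2'x) = x1 * x2, i.e. b1 = e1 and b2 = e2:
exps = monomial_exponents(2, 3)   # the 6 monomials of degree 2 in 3 variables
c = np.array([1.0 if tuple(e) == (1, 1, 0) else 0.0 for e in exps])
x = np.array([0.0, 2.0, 3.0])     # lies in the hyperplane b1'x = 0
b1_est = normal_estimate(x, c, exps)  # recovers b1 = [1, 0, 0] up to sign
```

Step 1 of the algorithm is then the geodesic update (11)-(12) applied to ĉ(t) with these embedded regressors, and step 3 clusters the resulting normals with K-means on the sphere.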
Notice also that (17) is equivalent to (15) with S = 1. The case in which n(t) = 1 and there are α_1 > 0, S ∈ N and i ∈ {1, ..., N} such that for all m

Σ_{t=m}^{m+S} P_{c(t)} ν_n(x_i(t)) ν_n^T(x_i(t)) P_{c(t)}^T ≥ (α_1/N) I_{M_n(D)-1}   (18)

corresponds to the case of data that is rich in time. In this case, at each time instant we draw data from a single hyperplane. As time proceeds, however, the data must be persistently drawn from at least n hyperplanes in order for (18) to hold. This can be achieved either by having n different static hyperplanes and persistently drawing data from all of them, or by having fewer than n moving hyperplanes whose motion is rich enough so that (18) holds.

In summary, as long as the embedded regressors satisfy condition (15) for some upper bound n on the number of hyperplanes, the recursive identifier (11)-(13) will still provide L2-stable estimates of the parameters, even if the number of hyperplanes is unknown and variable, and n(t) < n for all t.

5 Experiments

Experiments on synthetic data. We randomly draw N = 200 3D points lying in n = 2 planes and apply a time varying rotation to these points for t = 1, ..., 1000 to generate N trajectories {x_i(t)}_{i=1}^N. Since the true segmentation is known, we compute the vectors {b_j(t)} normal to each plane, and use them to generate the vector of coefficients c(t). We run our algorithm on the so-generated data with n = 2, μ = 1 and a random initial estimate for the parameters. We compare these estimates with the ground truth using the percentage of misclassified points. We also consider the error of the polynomial coefficients and the normal vectors by computing the angles between the estimated and true values. Figure 1 shows the true and estimated parameters, as well as the estimation errors.
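The synthetic setup just described can be sketched as follows (the paper does not specify the time varying rotation, so a constant-rate rotation about the z-axis is assumed for illustration):

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z-axis by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

rng = np.random.default_rng(2)
N, T, omega = 200, 1000, 0.01                    # omega: assumed rotation rate
b0 = np.array([[1.0, 0.0, 0.0],                  # initial normals of the 2 planes
               [0.0, 0.0, 1.0]])
labels = np.arange(N) % 2                        # ground-truth segmentation
Z = rng.standard_normal((N, 3))
B = b0[labels]                                   # normal of each point's plane
X0 = Z - (Z * B).sum(axis=1, keepdims=True) * B  # project points onto the planes

# Trajectories x_i(t) = R(t) x_i(0); the true normals rotate with the data,
# b_j(t) = R(t) b_j(0), giving the ground truth to compare the estimates against.
X = [X0 @ rot_z(omega * t).T for t in range(T)]
Bt = [b0 @ rot_z(omega * t).T for t in range(T)]
```

Since rotations preserve inner products, every trajectory stays orthogonal to its (rotated) plane normal at all times, which is what makes the ground-truth coefficient vector c(t) computable in closed form.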
Observe that the algorithm takes about 100 seconds for the errors to stabilize within 1.62 degrees for the coefficients, 1.62 degrees for the normals, and 4% for the segmentation error.

[Figure 1 here: plots of the true and estimated polynomial coefficients, the estimation error of the polynomial (degrees), the true and estimated normal vector b_1, the estimation errors of b_1 and b_2 (degrees), and the segmentation error (%), each as a function of time (seconds).]

Figure 1: Segmenting 200 points lying on two moving planes in R^3 using our recursive algorithm.

Segmentation of dynamic textures. We now apply our algorithm to the problem of segmenting video sequences of dynamic textures, i.e. sequences of nonrigid scenes that exhibit some temporal stationarity, e.g., water, smoke, or foliage. As proposed in [10], one can model the temporal evolution of the image intensities as the output of a linear dynamical system. Since the trajectories of the output of a linear dynamical system live in the so-called observability subspace, the intensity trajectories of pixels associated with a single dynamic texture lie in a subspace. Therefore, the set of all intensity trajectories lies in multiple subspaces, one per dynamic texture.

Given τ consecutive frames of a video sequence {I(f)}_{f=t-τ+1}^{t}, we interpret the data as a matrix W(t) ∈ R^{N×3τ}, where N is the number of pixels and 3τ corresponds to the τ frames in the three RGB color channels. We obtain a data point x_i(t) ∈ R^D from image I(t) by projecting the i-th row of W(t), w_i^T(t), onto a subspace of dimension D, i.e.
x_i(t) = Π w_i(t), with Π ∈ R^{D×3τ}. The projection matrix Π can be obtained in a variety of ways. We use the D principal components of the first τ frames to define Π. More specifically, if W(τ) = U Σ V^T, with U ∈ R^{N×D}, Σ ∈ R^{D×D} and V ∈ R^{3τ×D}, is a rank-D approximation of W(τ) computed using SVD, then we choose Π = Σ^{-1} V^T.

We applied our method to a sequence (110 × 192 pixels, 130 frames) containing a bird floating on water while rotating around a fixed point. The task is to segment the bird's rigid motion from the water's dynamic texture, while at the same time tracking the motion of the bird. We chose D = 5 principal components of the τ = 5 first frames of the RGB video sequence to project each frame onto a lower dimensional space. Figure 2 shows the segmentation. Although convergence is not guaranteed with only 130 frames, it is clear that the polynomial coefficients already capture the periodicity of the motion. As shown in the last row of Figure 2, some coefficients of the polynomial oscillate in time. One can notice that the orientation of the bird is related to the value of the coefficient c_8. If the bird is facing to the right, showing its right side, the value of c_8 achieves a local maximum. On the contrary, if the bird is oriented to the left, the value of c_8 achieves a local minimum. Some irregularities appear at the local minima of this coefficient: they correspond to a rapid motion of the bird. One can distinguish three behaviors for the polynomial coefficients: oscillations, pseudo-oscillations, and quasi-linearity. For both the oscillations and the pseudo-oscillations, the period is identical to the bird's motion period (40 frames).
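Returning to the preprocessing step above, the fixed projection Π = Σ^{-1} V^T from a rank-D SVD of the first τ frames can be sketched as follows (the matrix sizes are illustrative, with random data standing in for pixel intensities):

```python
import numpy as np

def projection_matrix(W_tau, D):
    """Pi = Sigma^{-1} V^T from a rank-D SVD of the N x 3*tau matrix built
    from the first tau frames; x_i = Pi w_i is then the i-th row of U."""
    U, s, Vt = np.linalg.svd(W_tau, full_matrices=False)
    return np.diag(1.0 / s[:D]) @ Vt[:D]

# Assumed toy sizes: N = 1000 pixels, tau = 5 RGB frames (3 * tau = 15), D = 5.
rng = np.random.default_rng(3)
W = rng.standard_normal((1000, 15))
Pi = projection_matrix(W, D=5)    # fixed 5 x 15 projection
Xp = W @ Pi.T                     # N x 5 projected data points, one per pixel
```

Note that with this choice of Π, each projected point x_i equals the i-th row of U, so the projected data satisfy Xp^T Xp = I_D, a convenient normalization for the recursive identifier.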
This example shows that the coefficients of the estimated polynomial give useful information about the scene motion.

[Figure 2 here: the bottom row plots the temporal evolution of the coefficient c_8 (values between roughly -0.02 and -0.04) over time, one plot below each displayed frame.]

Figure 2: Segmenting a bird floating on water. Top: frames 17, 36, 60, 81, and 98 of the sequence. Middle: segmentation obtained using our method. Bottom: temporal evolution of c_8 during the video sequence, with the red dot indicating the location of the corresponding frame in this evolution.

To test the performance of our method on a video sequence with a variable number of motions, we extracted a sub-clip of the bird sequence (55 × 192 pixels, 130 frames) in which the camera moves up at 1 pixel/frame until the bird disappears at t = 51. The camera stays stationary from t = 56 to t = 66, and then moves down at 1 pixel/frame; the bird reappears at t = 76. We applied both GPCA and our method initialized with GPCA to this video sequence. For GPCA we used a moving window of τ = 5 frames. For our method we chose D = 5 principal components of the τ = 5 first frames of the RGB video sequence to project each frame onto a fixed lower dimensional space. We set the parameter of the recursive algorithm to μ = 1. Figure 3 shows the segmentation results. Notice that both methods give excellent results during the first few frames, when both the bird and the water are present. This is expected, as our method is initialized with GPCA. Nevertheless, notice that the performance of GPCA deteriorates dramatically when the bird disappears, because GPCA overestimates the number of hyperplanes, whereas our method is robust to this change and keeps segmenting the scene correctly, i.e. assigning all the pixels to the background.
When the bird reappears, our method detects the bird correctly from the first frame, whereas GPCA produces a wrong segmentation for the first frames after the bird reappears. Towards the end of the sequence, both algorithms give a good segmentation. This demonstrates that our method has the ability to deal with a variable number of motions, while GPCA does not. In addition, the fixed projection and the recursive estimation of the polynomial coefficients make our method much faster than GPCA.

Figure 3: Segmenting a video sequence with a variable number of dynamic textures. Top: frames 1, 24, 65, 77, and 101 of the sequence. Middle: segmentation with GPCA. Bottom: segmentation with our method.

6 Conclusions

We have proposed a simple recursive algorithm for segmenting trajectories lying in a variable number of moving hyperplanes. The algorithm updates the coefficients of a polynomial whose derivatives give the normals to the moving hyperplanes as well as the segmentation of the trajectories. We applied our method successfully to the segmentation of videos containing multiple dynamic textures.

Acknowledgments

The author acknowledges the support of grants NSF CAREER IIS-04-47739, NSF EHS-05-09101 and ONR N00014-05-10836.

References

[1] I. Jolliffe. Principal Component Analysis. Springer-Verlag, New York, 1986.
[2] J. Ho, M.-H. Yang, J. Lim, K.-C. Lee, and D. Kriegman. Clustering appearances of objects under varying illumination conditions. In IEEE Conference on Computer Vision and Pattern Recognition, volume 1, pages 11-18, 2003.
[3] M. Tipping and C. Bishop. Mixtures of probabilistic principal component analyzers. Neural Computation, 11(2):443-482, 1999.
[4] R. Vidal, Y. Ma, and S. Sastry. Generalized Principal Component Analysis (GPCA). IEEE Trans. on Pattern Analysis and Machine Intelligence, 27(12):1-15, 2005.
[5] B.D.O. Anderson, R.R. Bitmead, C.R. Johnson Jr., P.V. Kokotovic, R.L. Kosut, I.M.Y. Mareels, L. Praly, and B.D. Riedle.
Stability of Adaptive Systems. MIT Press, 1986.
[6] L. Guo. Stability of recursive stochastic tracking algorithms. In IEEE Conf. on Decision & Control, pages 2062-2067, 1993.
[7] A. Edelman, T. Arias, and S.T. Smith. The geometry of algorithms with orthogonality constraints. SIAM Journal on Matrix Analysis and Applications, 20(2):303-353, 1998.
[8] J. Harris. Algebraic Geometry: A First Course. Springer-Verlag, 1992.
[9] R. Vidal and B.D.O. Anderson. Recursive identification of switched ARX hybrid models: Exponential convergence and persistence of excitation. In IEEE Conf. on Decision & Control, pages 32-37, 2004.
[10] G. Doretto, A. Chiuso, Y. Wu, and S. Soatto. Dynamic textures. International Journal of Computer Vision, 51(2):91-109, 2003.
", "award": [], "sourceid": 2977, "authors": [{"given_name": "Ren\u00e9", "family_name": "Vidal", "institution": null}]}