{"title": "Location Estimation with a Differential Update Network", "book": "Advances in Neural Information Processing Systems", "page_first": 1073, "page_last": 1080, "abstract": null, "full_text": "Location Estimation with a Differential Update Network\n\nAli Rahimi and Trevor Darrell\nArtificial Intelligence Laboratory\nMassachusetts Institute of Technology\nCambridge, MA 02139\n{ali,trevor}@mit.edu\n\nAbstract\n\nGiven a set of hidden variables with an a-priori Markov structure, we derive an online algorithm which approximately updates the posterior as pairwise measurements between the hidden variables become available. The update is performed using Assumed Density Filtering: to incorporate each pairwise measurement, we compute the optimal Markov structure which represents the true posterior and use it as a prior for incorporating the next measurement. We demonstrate the resulting algorithm by calculating globally consistent trajectories of a robot as it navigates along a 2D trajectory. To update a trajectory of length t, the update takes O(t). When all conditional distributions are linear-Gaussian, the algorithm can be thought of as a Kalman Filter which simplifies the state covariance matrix after incorporating each measurement.\n\n1 Introduction\n\nConsider a hidden Markov chain. Given a sequence of pairwise measurements between the elements of the chain (for example, their differences, corrupted by noise), we are asked to refine our estimate of their values online, as these pairwise measurements become available. We propose the Differential Update Network as a mechanism for solving this problem. We use this mechanism to recover the trajectory of a robot given noisy measurements of its movement between points in its trajectory. These pairwise displacements are thought of as noise-corrupted measurements between the true but unknown poses to be recovered. 
The recovered trajectories are consistent in the sense that when the camera returns to an already visited position, its estimated pose is consistent with the pose recovered on the earlier visit.\n\nPose change measurements between two points on the trajectory are obtained by bringing images of the environment acquired at each pose into registration with each other. The transformation required to effect the registration is the pose change measurement. There is a rich literature on computing pose changes from a pair of scans from an optical sensor: 2D [5, 6] and 3D transformations [7, 8, 9] from monocular cameras, or 3D transformations from range imagery [10, 11, 12] are a few examples. These have been used by [1, 2] in 3D model acquisition and by [3, 4] in robot navigation. The trajectory of the robot is defined as the unknown poses from which each frame was acquired, and is maintained in a state vector which is updated as pose changes are measured.\n\nFigure 1: Independence structure of a differential update network.\n\nAn alternative method estimates the pose of the robot with respect to fixed features in the world. These methods represent the world as a set of features, such as corners, lines, and other geometric shapes in 3D [13, 14, 15], and match features between a scan at the current pose and the acquired world representation. However, measurements are still pairwise, since they depend on a feature and the poses of the camera. Because both the feature list and the poses are maintained in the state vector, the Differential Update Framework can be applied to both scan-based methods and feature-based methods.\n\nOur algorithm incorporates each pose change measurement by updating the pose associated with every frame encountered. To ensure that each update can happen in time linear in the length of the trajectory, the correlation structure of the state vector is approximated with a simpler Markov chain after each measurement. 
This scheme can be thought of as an instance of Assumed Density Filtering (ADF) [16, 17].\n\nThe Differential Update Network presented here assumes a linear Gaussian system, but our derivation is general and can accommodate any distribution. For example, we are currently experimenting with discrete distributions. In addition, we focus on frame-based trajectory estimation due to the ready availability of pose change estimators, and to avoid the complexity of maintaining an explicit feature map.\n\nThe following section describes the model in a Bayesian framework. Sections 3 and 4 sketch existing batch and online methods for obtaining globally consistent trajectories. Section 5 derives the update rules for our algorithm, which is then applied to 2D trajectory estimation in section 6.\n\n2 Dynamics and Measurement Models\n\nFigure 1 depicts the network. We assume the hidden variables x_t have a Markov structure with known transition densities:\n\np(X) = \\prod_{t=1}^{T} p(x_t | x_{t-1}).\n\nPairwise measurements appear on the chain one by one. Conditioned on the hidden variables, these measurements are assumed to be independent:\n\np(Y | X) = \\prod_{(s,t) \\in M} p(y_s^t | x_s, x_t),\n\nwhere M is the set of pairs of hidden variables which have been measured. To apply this network to robot localization, let X = {x_t}_{t=1..T} be the trajectory of the robot up to time T, with each x_t denoting its pose at time t. These poses can be represented using any parametrization of pose, for example as 3D rotations and translations, 2D translations (which is what we use in section 6), or even non-rigid deformations such as affine. The conditional distribution between adjacent x's is assumed to follow:\n\np(x_{t+1} | x_t) = N(x_{t+1}; x_t, \\Lambda_{x|x}).   (1)\n\nAs the robot moves, the pose change estimator computes the motion y_s^t of the robot from two scans of the environment. 
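As a concrete illustration of the Markov prior of equation (1), the sketch below (our own, not the paper's code; scalar poses, a hypothetical `chain_prior` helper, and an assumed broad Gaussian on the first pose) builds the prior covariance of a random-walk chain and confirms that its inverse covariance is tri-diagonal, the structural fact the rest of the paper exploits:

```python
import numpy as np

def chain_prior(T, lam_xx=1.0, lam_0=1e2):
    """Mean and covariance of X = (x_1..x_T) under the random-walk
    dynamics p(x_{t+1}|x_t) = N(x_{t+1}; x_t, lam_xx), with a broad
    zero-mean Gaussian of variance lam_0 on x_1 (scalar poses)."""
    # x_t = x_1 + (t-1 independent increments), so
    # Cov(x_s, x_t) = lam_0 + lam_xx * (min(s, t) - 1).
    idx = np.arange(1, T + 1)
    Sigma = lam_0 + lam_xx * (np.minimum.outer(idx, idx) - 1)
    return np.zeros(T), Sigma

m_X, Lam_X = chain_prior(6)
# Markov structure: the information matrix is tri-diagonal --
# every entry more than one step off the diagonal vanishes.
J = np.linalg.inv(Lam_X)
off_band = J - np.tril(np.triu(J, -1), 1)  # zero out the tri-diagonal band
print(np.abs(off_band).max() < 1e-8)       # -> True
```

The same structure holds for vector-valued poses, with scalar entries replaced by blocks.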
Given the true poses, we assume that these measurements are independent of each other even when they share a common scan. We model each y_s^t as being drawn from a Gaussian centered around x_t - x_s:\n\np(y_s^t | x_s, x_t) = N(y_s^t; x_t - x_s, \\Lambda_{y|xx}).   (2)\n\nThe online global estimation problem requires us to update p(X|Y) as each y_s^t in Y becomes available. The following section reviews a batch solution for computing p(X|Y) using this model. Section 4 discusses a recursive approach with a similar running time as the batch version. Section 5 presents our approach, which performs these updates much faster by simplifying the output of the recursive solution after incorporating each measurement.\n\n3 Batch Linear Gaussian Solution\n\nEquation (1) dictates a Gaussian prior p(X) with mean m_X and covariance \\Lambda_X. Because the pose dynamics are Markovian, the inverse covariance \\Lambda_X^{-1} is tri-diagonal. According to equation (2), the observations are drawn from y_s^t = A_{s,t} X + \\omega_{s,t} = x_t - x_s + \\omega_{s,t}, with \\omega_{s,t} white and Gaussian with covariance \\lambda_{s,t}. Stacking up the A_{s,t} and \\lambda_{s,t} into A and \\Lambda_{Y|X} respectively, we know that the posterior mean and covariance of X|Y are [21]:\n\nm_{X|Y} = m_X + \\Lambda_X A^T (A \\Lambda_X A^T + \\Lambda_{Y|X})^{-1} (Y - A m_X),   (3)\n\\Lambda_{X|Y} = \\Lambda_X - \\Lambda_X A^T (A \\Lambda_X A^T + \\Lambda_{Y|X})^{-1} A \\Lambda_X,   (4)\n\nor alternatively,\n\n\\Lambda_{X|Y}^{-1} = \\Lambda_X^{-1} + A^T \\Lambda_{Y|X}^{-1} A,   (5)\nm_{X|Y} = \\Lambda_{X|Y} (\\Lambda_X^{-1} m_X + A^T \\Lambda_{Y|X}^{-1} Y).   (6)\n\nIf there are M measurements and T hidden variables, this computation will take O(T^2 M) if performed naively. 
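The equivalence of the covariance form (3)-(4) and the information form (5)-(6) can be sanity-checked numerically. The sketch below is our own illustration (scalar poses, an arbitrary random prior, and a made-up list of measured pairs), not the paper's code:

```python
import numpy as np
rng = np.random.default_rng(0)

T = 8
# An arbitrary Gaussian prior on X (any positive-definite Lam_X works
# for checking the identity between the two posterior forms).
m_X = np.zeros(T)
B = rng.standard_normal((T, T))
Lam_X = B @ B.T + T * np.eye(T)

# Pairwise measurements y^t_s = x_t - x_s + noise, stacked into A.
pairs = [(0, 3), (2, 7), (0, 7)]        # illustrative choice of (s, t)
A = np.zeros((len(pairs), T))
for k, (s, t) in enumerate(pairs):
    A[k, t], A[k, s] = 1.0, -1.0
Lam_YX = 0.5 * np.eye(len(pairs))       # assumed measurement noise
Y = rng.standard_normal(len(pairs))

# Covariance form, equations (3)-(4).
G = Lam_X @ A.T @ np.linalg.inv(A @ Lam_X @ A.T + Lam_YX)
m_cov = m_X + G @ (Y - A @ m_X)
S_cov = Lam_X - G @ A @ Lam_X

# Information form, equations (5)-(6).
J = np.linalg.inv(Lam_X) + A.T @ np.linalg.inv(Lam_YX) @ A
S_inf = np.linalg.inv(J)
m_inf = S_inf @ (np.linalg.inv(Lam_X) @ m_X + A.T @ np.linalg.inv(Lam_YX) @ Y)

print(np.allclose(m_cov, m_inf), np.allclose(S_cov, S_inf))  # True True
```

The agreement is the matrix inversion lemma at work; which form is cheaper depends on whether M or T dominates, as the text notes.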
Note that if M > T, as is the case in the robot mapping problem, the alternate equations (5) and (6) can be used to obtain a running time of O(T^3).\n\n4 Online Linear Gaussian Solution\n\nLu and Milios [3] proposed a recursive update for updating the trajectory X|Y_old after obtaining a new measurement y_s^t. Because each measurement is independent of past measurements given the X's, the update is:\n\np(X | Y_old, y_s^t) \\propto p(y_s^t | X) p(X | Y_old).   (7)\n\nUsing equations (3) and (4) to perform this update for one y_s^t takes O(T^2). After integrating M measurements, this yields the same final cost as the batch update. One way to lower this cost is to reduce the number of hidden variables x_t by fixing some of them, thus reducing T [23]. It is also possible to take advantage of the sparseness of the covariance structure of X|Y_old by using the updates (5) and (6):\n\n\\Lambda_{X|new}^{-1} m_{X|new} = \\Lambda_{X|old}^{-1} m_{X|old} + A_{s,t}^T \\lambda_{s,t}^{-1} y_s^t,   (8)\n\\Lambda_{X|new}^{-1} = \\Lambda_{X|old}^{-1} + A_{s,t}^T \\lambda_{s,t}^{-1} A_{s,t}.   (9)\n\nFigure 2: The measurement (left) correlates the hidden variables (middle), whose correlation is then simplified (right), and is ready to accept a new measurement.\n\nBecause \\Lambda_{X|new}^{-1} has a sparse structure (see equation (9)), m_{X|new} can be found using a sparse linear system solver [23]. Unfortunately, as measurements are incorporated, \\Lambda_{X|new}^{-1} becomes denser due to the accumulation of the rank-1 terms in equation (9), rendering this approach less effective.\n\nIn the linear Gaussian case, the Differential Update Network addresses this problem by projecting \\Lambda_{X|new} onto the closest covariance matrix which has a tri-diagonal inverse. 
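For zero-mean scalar states, this projection has a simple closed form: the best Markov-chain approximation keeps each adjacent pairwise marginal of the posterior and discards longer-range correlations. The sketch below is our own illustration of that projection (the helper name and the random test covariance are ours, not the paper's):

```python
import numpy as np

def project_to_chain(Sigma):
    """Markov-chain (tri-diagonal-inverse) approximation of a zero-mean
    Gaussian with covariance Sigma, scalar states: each conditional
    x_t | x_{t-1} ~ N(a_t x_{t-1}, r_t) is read off the corresponding
    2x2 block of Sigma, so adjacent pairwise marginals are preserved."""
    T = Sigma.shape[0]
    S = np.zeros_like(Sigma)
    S[0, 0] = Sigma[0, 0]
    a = np.zeros(T)
    r = np.zeros(T)
    for t in range(1, T):
        a[t] = Sigma[t, t - 1] / Sigma[t - 1, t - 1]   # regression gain
        r[t] = Sigma[t, t] - a[t] * Sigma[t - 1, t]    # residual variance
    # Propagate the chain to rebuild the full approximate covariance.
    for t in range(1, T):
        S[t, :t] = a[t] * S[t - 1, :t]
        S[:t, t] = S[t, :t]
        S[t, t] = a[t] ** 2 * S[t - 1, t - 1] + r[t]
    return S

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
Sigma = B @ B.T + 5 * np.eye(5)
S = project_to_chain(Sigma)
# Adjacent pairwise marginals are preserved...
ok_pairs = all(np.allclose(S[t-1:t+1, t-1:t+1], Sigma[t-1:t+1, t-1:t+1])
               for t in range(1, 5))
# ...and the inverse of the approximation is tri-diagonal.
J = np.linalg.inv(S)
ok_band = np.abs(J - np.tril(np.triu(J, -1), 1)).max() < 1e-8
print(ok_pairs, ok_band)  # True True
```

Vector-valued poses work the same way with blocks in place of scalars; section 5 derives why keeping these conditionals is KL-optimal.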
Hence, in solving (8), \\Lambda_{X|new}^{-1} is always tri-diagonal, so m_{X|new} is easy to compute.\n\n5 Approximate Online Solution\n\nTo implement this idea in the general case, we resort to Assumed Density Filtering (ADF) [16]: we approximate p(X|Y_old) with a simpler distribution q(X|Y_old). To incorporate a new measurement y_s^t, we apply the update\n\np(X | Y_new) \\propto p(y_s^t | x_s, x_t) q(X | Y_old).   (10)\n\nThis new p(X|Y_new) has a more complicated independence structure than q(X|Y_old), so incorporating subsequent measurements would require more work and the resulting posterior would be even hairier. So we approximate it again with a q(X|Y_new) that has a simpler independence structure. Subsequent measurements can again be incorporated easily using this new q. Specifically, we force q to always obey Markovian independence. Figure 2 summarizes this process.\n\nThe following section discusses how to find a Markovian q so as to minimize the KL divergence between p and q. Section 5.2 shows how to incorporate a pairwise measurement on the resulting Markov chain using equation (10).\n\n5.1 Simplifying the independence structure\n\nWe would like to approximate an arbitrary distribution which factors according to p(X) = \\prod_t p_t(x_t | Pa[x_t]) using one which factors into q(X) = \\prod_t q_t(x_t | Qa[x_t]). Here, Pa[x_t] are the parents of node x_t in the graph prescribed by p(X), and Qa[x_t] = x_{t-1} are the parents of node x_t as prescribed by q(X). The objective is to minimize:\n\nq^* = \\arg\\min_q KL( \\prod_t p_t \\| \\prod_t q_t ) = \\int_X p(X) \\ln \\frac{p(X)}{\\prod_t q_t(x_t | Qa[x_t])}.   (11)\n\nAfter some manipulation, it can be shown that:\n\nq_t^* = p(x_t | Qa[x_t]).   (12)\n\nThis says that the best conditional q_t is built up from the corresponding p_t by marginalizing out the conditions that were removed in the graph. 
This is not an easy operation to perform in general, but the following section shows how to do it in our case.\n\n5.2 Computing posterior transitions on a graph with a single loop\n\nThis result suggests a simplification to the update of equation (10). Because the ultimate goal is to compute q(X|Y_new), not p(X|Y_new), we only need to compute the posterior transitions p(x_t | x_{t-1}, Y_new). Thus, we circumvent having to first find p and then project it onto q. We propose computing these transitions in three steps: one for the transitions to the left of x_s, another for the loop, and a third for the transitions to the right of x_t.\n\n5.2.1 Finding p(x_\\tau | x_{\\tau-1}, y) for \\tau = s..t\n\nFor every s < \\tau < t, notice that\n\np(y, x_{\\tau-1}, x_t) p(x_\\tau | x_{\\tau-1}, x_t) = p(y, x_{\\tau-1}, x_\\tau, x_t),   (13)\n\nbecause, according to the independence structure of figure 1, p(x_\\tau | x_{\\tau-1}, x_t) = p(x_\\tau | x_{\\tau-1}, x_t, y). If we could find this joint distribution for all \\tau, we could find p(x_\\tau | x_{\\tau-1}, y) by marginalizing out x_t and normalizing. We could also find p(x_\\tau | y) by marginalizing out both x_t and x_{\\tau-1}, then normalizing. Finally, we could compute p(y, x_\\tau, x_t) for the next \\tau in the iteration.\n\nSo there are two missing pieces. The first is p(y, x_s, x_t) for starting the recursion. Computing this term is easy, because p(y | x_s, x_t) is the given measurement model, and p(x_s, x_t) can be obtained easily from the prior by successively applying the total probability theorem. The second missing piece is p(x_\\tau | x_{\\tau-1}, x_t). Note that this quantity does not depend on the measurements and could be computed offline if we wanted to. 
The recursion for calculating it is:\n\np(x_\\tau | x_{\\tau-1}, x_t) \\propto p(x_t | x_\\tau) p(x_\\tau | x_{\\tau-1}),   (14)\np(x_t | x_\\tau) = \\int dx_{\\tau+1}\\, p(x_t | x_{\\tau+1}) p(x_{\\tau+1} | x_\\tau).   (15)\n\nThe second equation describes a recursion which starts from t and goes down to s. It computes the influence of node \\tau on node t. Equation (14) is coupled to this equation and uses its output. It involves applying Bayes' rule to compute a function of 3 variables. Because of the backward nature of (15), p(x_\\tau | x_{\\tau-1}, x_t) has to be computed using a pass which runs in the opposite direction of the process of (13).\n\n5.2.2 Finding p(x_\\tau | x_{\\tau-1}, y) for \\tau = 1..s\n\nStarting from \\tau = s-1, compute\n\np(y | x_\\tau) = \\int dx_{\\tau+1}\\, p(y | x_{\\tau+1}) p(x_{\\tau+1} | x_\\tau),\np(x_\\tau | y) \\propto p(y | x_\\tau) p(x_\\tau),\np(x_\\tau | x_{\\tau-1}, y) \\propto p(y | x_\\tau) p(x_\\tau | x_{\\tau-1}).\n\nThe recursion first computes the influence of x_\\tau on the observation, then computes the marginal and the transition probability.\n\n5.2.3 Finding p(x_\\tau | x_{\\tau-1}, y) for \\tau = t..T\n\nStarting from \\tau = t, compute\n\np(x_\\tau | y) = \\int dx_{\\tau-1}\\, p(x_\\tau | x_{\\tau-1}, y) p(x_{\\tau-1} | y),\np(x_\\tau | x_{\\tau-1}, y) = p(x_\\tau | x_{\\tau-1}).\n\nThe second identity follows from the independence structure on the right side of observed nodes.\n\n6 Results\n\nWe manually navigated a camera rig along two trajectories. The camera faced upward and recorded the ceiling. The robot took about 3 minutes to trace each path, producing about 6000 frames of data for each experiment. The trajectory was pre-marked on the floor so we could revisit specific locations (see the rightmost diagrams of figures 3(a,b)). 
This was done to make the evaluation of the results simpler. The trajectory estimation worked at frame rate, although it was processed offline to simplify data acquisition.\n\nIn these experiments, the pose parameters were (x, y) locations on the floor. All experiments assume the same Brownian motion dynamics. For each new frame, pose changes were computed with respect to at most three base frames. The selection of base frames was based on a measure of appearance similarity between the current frame and all past frames. The pose change estimator was a Lucas-Kanade optical flow tracker [24]. To compute pose displacements, we computed a robust average of the flow vectors using an iterative outlier rejection scheme. We used the number of inlier flow vectors as a crude estimate of the precision of p(y_s^t | x_s, x_t).\n\nFigures 3(a,b) compare the algorithm presented in this paper against two others. The middle plots compare our algorithm (blue) against the batch algorithm which uses equations (5) and (6) (black). Although our recovered trajectories don't coincide exactly with the batch solutions, like the batch solutions, ours are smooth and consistent.\n\nIn contrast, more naive methods of reconstructing trajectories do not exhibit these two desiderata. Estimating the motion of each frame with respect to only the previous base frame yields an unsmooth trajectory (green). Furthermore, loops can't be closed correctly (for example, the robot is not found to return to the origin).\n\nThe simplest method of taking into account multiple base frames also fails to meet our requirements. The red trajectory shows what happens when we assume individual poses are independent. 
This corresponds to using a diagonal matrix to represent the correlation between the poses (instead of the tri-diagonal inverse covariance matrix our algorithm uses). Notice that the resulting trajectory is not smooth, and loops are not well closed.\n\nBy taking into account a minimal amount of correlation between frame poses, loops are closed correctly and the trajectory is correctly found to be smooth.\n\n7 Conclusion\n\nWe have presented a method for approximately computing the posterior distribution of a set of variables for which only pairwise measurements are available. We call the resulting structure a Differential Update Network and showed how to use Assumed Density Filtering to update the posterior as pairwise measurements become available. The two key insights were 1) how to approximate the posterior at each step to minimize KL divergence, and 2) how to compute transition densities on a graph with a single loop in closed form.\n\nWe showed how to estimate globally consistent trajectories for a camera using this framework. In this linear-Gaussian context, our algorithm can be thought of as a Kalman Filter which projects the state information matrix down to a tri-diagonal representation while minimizing the KL divergence between the truth and the obtained estimate. Although the example used pose change measurements between scans of the environment, our framework can be applied to feature-based mapping and localization as well.\n\nReferences\n\n[1] A. Stoddart and A. Hilton. Registration of multiple point sets. In IJCV, pages B40–44, 1996.\n\n(a)\n\n(b)\n\nFigure 3: Left, naive accumulation (green) and projecting trajectory to diagonal covariance (red). Loops are not closed well, and trajectory is not smooth. The zoomed areas show that in both naive approaches, there are large jumps in the trajectory, and the pose estimate is incorrect at revisited locations. Right, Differential Update Network (blue) and exact solution (black). 
Like the batch solution, our solution generates smooth and consistent trajectories.\n\n[2] Y. Chen and G. Medioni. Object modelling by registration of multiple range images. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 2724–2728, 1991.\n[3] F. Lu and E. Milios. Globally consistent range scan alignment for environment mapping. Autonomous Robots, 4:333–349, 1997.\n[4] J. Gutmann and K. Konolige. Incremental mapping of large cyclic environments. In IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA), 2000.\n[5] Harpreet S. Sawhney, Steve Hsu, and Rakesh Kumar. Robust video mosaicing through topology inference and local to global alignment. In Proc ECCV 2, pages 103–119, 1998.\n[6] H.-Y. Shum and R. Szeliski. Construction of panoramic mosaics with global and local alignment. In IJCV, pages 101–130, February 2000.\n[7] A. Shashua. Trilinearity in visual recognition by alignment. In ECCV, pages 479–484, 1994.\n[8] C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization approach. International Journal of Computer Vision, 9(2):137–154, 1992.\n[9] Olivier Faugeras. Three-Dimensional Computer Vision: A Geometric Viewpoint. MIT Press, Cambridge, Massachusetts, 1993.\n[10] M. Harville, A. Rahimi, T. Darrell, G.G. Gordon, and J. Woodfill. 3D pose tracking with linear depth and brightness constraints. In ICCV99, pages 206–213, 1999.\n[11] Feng Lu and E. Milios. Robot pose estimation in unknown environments by matching 2D range scans. Robotics and Autonomous Systems, 22(2):159–178, 1997.\n[12] P. J. Besl and N. D. McKay. A method for registration of 3-D shapes. IEEE Trans. Patt. Anal. Machine Intell., 14(2):239–256, February 1992.\n[13] N. Ayache and O. Faugeras. Maintaining representations of the environment of a mobile robot. IEEE Trans. Robot. 
Automat., 5(6):804–819, 1989.\n[14] Y. Liu, R. Emery, D. Chakrabarti, W. Burgard, and S. Thrun. Using EM to learn 3D models of indoor environments with mobile robots. In IEEE International Conference on Machine Learning (ICML), 2001.\n[15] R. Smith, M. Self, and P. Cheeseman. Estimating uncertain spatial relationships in robotics. In Uncertainty in Artificial Intelligence, 1988.\n[16] T.P. Minka. Expectation propagation for approximate Bayesian inference. In UAI, 2001.\n[17] X. Boyen and D. Koller. Tractable inference for complex stochastic processes. In Uncertainty in Artificial Intelligence, 1998.\n[18] T.P. Minka. Independence diagrams. Technical report, Media Lab, http://www.stat.cmu.edu/~minka/papers/diagrams.html, 1998.\n[19] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1997.\n[20] A. Rahimi, L-P. Morency, and T. Darrell. Reducing drift in parametric motion tracking. In ICCV, volume 1, pages 315–322, June 2001.\n[21] T. Kailath, A. H. Sayed, and B. Hassibi. Linear Estimation. Prentice Hall, 2000.\n[22] E. Sudderth. Embedded trees: Estimation of Gaussian processes on graphs with cycles. Master's thesis, MIT, 2002.\n[23] Philip F. McLauchlan. A batch/recursive algorithm for 3D scene reconstruction. Conf. Computer Vision and Pattern Recognition, 2:738–743, 2000.\n[24] B. D. Lucas and Takeo Kanade. An iterative image registration technique with an application to stereo vision. In International Joint Conference on Artificial Intelligence, pages 674–679, 1981.\n[25] Andrew W. Fitzgibbon and Andrew Zisserman. Automatic camera recovery for closed or open image sequences. 
In ECCV, pages 311–326, 1998.\n", "award": [], "sourceid": 2341, "authors": [{"given_name": "Ali", "family_name": "Rahimi", "institution": null}, {"given_name": "Trevor", "family_name": "Darrell", "institution": null}]}