Joint Tracking of Pose, Expression, and Texture using Conditionally Gaussian Filters

Part of Advances in Neural Information Processing Systems 17 (NIPS 2004)

Bibtex Metadata Paper


Tim Marks, J. Roddey, Javier Movellan, John Hershey


We present a generative model and stochastic filtering algorithm for si- multaneous tracking of 3D position and orientation, non-rigid motion, object texture, and background texture using a single camera. We show that the solution to this problem is formally equivalent to stochastic fil- tering of conditionally Gaussian processes, a problem for which well known approaches exist [3, 8]. We propose an approach based on Monte Carlo sampling of the nonlinear component of the process (object mo- tion) and exact filtering of the object and background textures given the sampled motion. The smoothness of image sequences in time and space is exploited by using Laplace's method to generate proposal distributions for importance sampling [7]. The resulting inference algorithm encom- passes both optic flow and template-based tracking as special cases, and elucidates the conditions under which these methods are optimal. We demonstrate an application of the system to 3D non-rigid face tracking.