Combined discriminative and generative articulated pose and non-rigid shape estimation

Part of Advances in Neural Information Processing Systems 20 (NIPS 2007)

Bibtex Metadata Paper


Leonid Sigal, Alexandru Balan, Michael Black


Estimation of three-dimensional articulated human pose and motion from images is a central problem in computer vision. Much of the previous work has been limited by the use of crude generative models of humans represented as articu- lated collections of simple parts such as cylinders. Automatic initialization of such models has proved difficult and most approaches assume that the size and shape of the body parts are known a priori. In this paper we propose a method for automatically recovering a detailed parametric model of non-rigid body shape and pose from monocular imagery. Specifically, we represent the body using a param- eterized triangulated mesh model that is learned from a database of human range scans. We demonstrate a discriminative method to directly recover the model pa- rameters from monocular images using a conditional mixture of kernel regressors. This predicted pose and shape are used to initialize a generative model for more detailed pose and shape estimation. The resulting approach allows fully automatic pose and shape recovery from monocular and multi-camera imagery. Experimen- tal results show that our method is capable of robustly recovering articulated pose, shape and biometric measurements (e.g. height, weight, etc.) in both calibrated and uncalibrated camera environments.