Nuria Oliver, Barbara Rosario, Alex Pentland
We describe a real-time computer vision and machine learning system for modeling and recognizing human actions and interactions. Two different domains are explored: recognition of two-handed motions in the martial art 'Tai Chi', and multiple-person interactions in a visual surveillance task. Our system combines top-down with bottom-up information using a feedback loop, and is formulated with a Bayesian framework. Two different graphical models (HMMs and Coupled HMMs) are used for modeling both individual actions and multiple-agent interactions, and CHMMs are shown to work more efficiently and accurately for a given amount of training. Finally, to overcome the limited amounts of training data, we demonstrate that 'synthetic agents' (Alife-style agents) can be used to develop flexible prior models of the person-to-person interactions.