Zoubin Ghahramani, Michael Jordan
Real-world learning tasks may involve high-dimensional data sets with arbitrary patterns of missing data. In this paper we present a framework based on maximum likelihood density estimation for learning from such data set.s. VVe use mixture models for the den(cid:173) sity estimates and make two distinct appeals to the Expectation(cid:173) Maximization (EM) principle (Dempster et al., 1977) in deriving a learning algorithm-EM is used both for the estimation of mix(cid:173) ture components and for coping wit.h missing dat.a. The result(cid:173) ing algorithm is applicable t.o a wide range of supervised as well as unsupervised learning problems. Result.s from a classification benchmark-t.he iris data set-are presented.