Part of Advances in Neural Information Processing Systems 15 (NIPS 2002)
Martin Szummer, Tommi Jaakkola
Classification with partially labeled data requires using a large number of unlabeled examples (or an estimated marginal P(x)), to further constrain the conditional P(y|x) beyond a few available labeled examples. We formulate a regularization approach to linking the marginal and the conditional in a general way. The regularization penalty measures the information that is implied about the labels over covering regions. No parametric assumptions are required and the approach remains tractable even for continuous marginal densities P(x). We develop algorithms for solving the regularization problem for finite covers, establish a limiting differential equation, and exemplify the behavior of the new regularization approach in simple cases.
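
As an illustration only, one plausible reading of "the information that is implied about the labels" over a region is the mutual information between x and y restricted to that region. The sketch below makes several assumptions the abstract does not state: binary labels, a finite set of points per region, and the hypothetical names `region_information`, `weights`, and `cond_probs`. How the per-region values are combined across the cover is not specified here and is left out.

```python
import math


def region_information(weights, cond_probs):
    """Mutual information I(x; y), in nats, within a single region.

    Assumptions (not from the abstract): binary labels and a finite region.
    weights:    P(x | region) for each point in the region; must sum to 1.
    cond_probs: P(y = 1 | x) for each point in the region.
    """
    # Label probability averaged over the region:
    # P(y = 1 | region) = sum_x P(x | region) * P(y = 1 | x)
    p_region = sum(w * p for w, p in zip(weights, cond_probs))

    info = 0.0
    for w, p in zip(weights, cond_probs):
        # Sum over both label values y = 1 and y = 0.
        for p_y, q_y in ((p, p_region), (1.0 - p, 1.0 - p_region)):
            # Terms with zero weight or zero probability contribute nothing.
            if w > 0.0 and p_y > 0.0:
                info += w * p_y * math.log(p_y / q_y)
    return info


# A conditional that is constant over the region carries no information.
flat = region_information([0.5, 0.5], [0.3, 0.3])

# A conditional that flips inside the region carries log(2) nats.
split = region_information([0.5, 0.5], [1.0, 0.0])
```

Under this reading, the penalty is small when P(y|x) varies little within each region, which is how the unlabeled data (through the region weights) would constrain the conditional.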