Part of Advances in Neural Information Processing Systems 19 (NIPS 2006)
Koby Crammer, Michael Kearns, Jennifer Wortman
We consider the problem of learning accurate models from multiple sources of "nearby" data. Given distinct samples from multiple data sources and estimates of the dissimilarities between these sources, we provide a general theory of which samples should be used to learn models for each source. This theory is applicable in a broad decision-theoretic learning framework, and yields results for classification and regression generally, and for density estimation within the exponential family. A key component of our approach is the development of approximate triangle inequalities for expected loss, which may be of independent interest.