Part of Advances in Neural Information Processing Systems 18 (NIPS 2005)
Koby Crammer, Michael Kearns, Jennifer Wortman
We initiate the study of learning from multiple sources of limited data, each of which may be corrupted at a different rate. We develop a complete theory of which data sources should be used for two fundamental problems: estimating the bias of a coin, and learning a classifier in the presence of label noise. In both cases, efficient algorithms are provided for computing the optimal subset of data.
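To make the coin-bias setting concrete, the sketch below illustrates the flavor of such a subset-selection rule under simplified assumptions of my own (not the paper's exact bound): sources are sorted by corruption rate, and we scan prefixes, scoring each by an illustrative worst-case-bias term plus a 1/sqrt(n) sampling-deviation term. The function name `best_prefix` and the specific objective are hypothetical.

```python
import math

def best_prefix(sources):
    """Choose a prefix of corruption-sorted coin-flip sources to pool.

    Each source is a pair (n_i, eta_i): n_i observed flips, corruption
    rate eta_i in [0, 1/2).  Sorting by eta_i and scanning prefixes
    reflects the intuition that a cleaner source should always be
    included before a noisier one.  The objective is an illustrative
    bias-plus-deviation score: the pooled estimate's bias can be as
    large as the weighted mean corruption rate, while the sampling
    deviation shrinks like 1/sqrt(total flips).
    """
    sources = sorted(sources, key=lambda s: s[1])  # cleanest first
    best_k, best_score = 0, float("inf")
    n_tot, corrupt_tot = 0, 0.0
    for k, (n, eta) in enumerate(sources, start=1):
        n_tot += n
        corrupt_tot += n * eta          # worst-case corrupted flips
        score = corrupt_tot / n_tot + math.sqrt(1.0 / n_tot)
        if score < best_score:
            best_k, best_score = k, score
    return sources[:best_k], best_score
```

For example, a large mildly corrupted source can be worth pooling with a small clean one (`best_prefix([(100, 0.0), (10000, 0.01)])` keeps both), while a heavily corrupted source is dropped (`best_prefix([(100, 0.0), (1000, 0.4)])` keeps only the clean one). The key qualitative point from the paper survives the simplification: the optimal subset is a prefix of the sources ordered by quality, so it can be found by a single linear scan rather than a search over all subsets.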