Statistical Analysis of Semi-Supervised Learning: The Limit of Infinite Unlabelled Data

Nadler, Boaz; Srebro, Nathan; Zhou, Xueyuan

Advances in Neural Information Processing Systems 22 (NIPS 2009)

Abstract

We study the behavior of the popular Laplacian Regularization method for Semi-Supervised Learning in the regime of a fixed number of labeled points but a large number of unlabeled points. We show that in $\mathbb{R}^d$, $d \geq 2$, the method is actually not well-posed, and that as the number of unlabeled points increases the solution degenerates to a noninformative function. We also contrast the method with the Laplacian Eigenvector method, and discuss the ``smoothness'' assumptions associated with this alternate method.
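
For readers unfamiliar with the estimator under study, the following is a minimal sketch of graph-based Laplacian regularization in its hard-constraint (harmonic) form: minimize $\sum_{i,j} W_{ij} (f_i - f_j)^2$ subject to $f$ matching the given labels. The Gaussian kernel, the bandwidth `sigma`, the function name `laplacian_regularization`, and the synthetic data are all assumptions made for illustration and are not taken from the paper. The sketch only computes the finite-sample solution; it does not by itself reproduce the limiting analysis described in the abstract.

```python
import numpy as np

def laplacian_regularization(X, labeled_idx, y_labeled, sigma=0.5):
    """Harmonic solution on a dense similarity graph (illustrative sketch).

    Minimises sum_ij W_ij (f_i - f_j)^2 subject to f = y on the labeled points.
    The Gaussian kernel and sigma are assumed choices, not the paper's.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all points.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Unnormalised graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W

    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    y = np.asarray(y_labeled, dtype=float)

    f = np.zeros(n)
    f[labeled_idx] = y
    # Stationarity on the unlabeled block: L_uu f_u = -L_ul y_l.
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    L_ul = L[np.ix_(unlabeled_idx, labeled_idx)]
    f[unlabeled_idx] = np.linalg.solve(L_uu, -L_ul @ y)
    return f

# Two labeled points and many unlabeled points in the unit square.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
X[0] = [0.1, 0.5]
X[1] = [0.9, 0.5]
f = laplacian_regularization(X, labeled_idx=[0, 1], y_labeled=[0.0, 1.0])
```

The returned vector `f` agrees with the labels at the two labeled points and, by the discrete maximum principle, stays between the smallest and largest label everywhere else.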
