Generalization Properties of Learning with Random Features

Part of Advances in Neural Information Processing Systems 30 (NIPS 2017)


Authors

Alessandro Rudi, Lorenzo Rosasco

Abstract

We study the generalization properties of ridge regression with random features in the statistical learning framework. We show for the first time that $O(1/\sqrt{n})$ learning bounds can be achieved with only $O(\sqrt{n}\log n)$ random features rather than $O(n)$ as suggested by previous results. Further, we prove faster learning rates and show that they might require more random features, unless they are sampled according to a possibly problem-dependent distribution. Our results shed light on the statistical-computational trade-offs in large-scale kernelized learning, showing the potential effectiveness of random features in reducing the computational complexity while keeping optimal generalization properties.
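
To make the setting concrete, the sketch below is a minimal illustration (not the authors' code) of ridge regression with random Fourier features for a Gaussian kernel, using $m \approx \sqrt{n}\log n$ features as in the bound stated above; the toy data, the kernel width `gamma`, and the regularization `lam` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (assumption: any supervised dataset would do).
n, d = 2000, 5
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

# Number of random features: m ~ sqrt(n) * log(n), as in the abstract's bound.
m = int(np.sqrt(n) * np.log(n))

# Random Fourier features approximating the Gaussian kernel exp(-gamma ||x - x'||^2):
# phi(x) = sqrt(2/m) * cos(W^T x + b), with W ~ N(0, 2*gamma I) and b ~ U[0, 2*pi].
gamma = 0.5
W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, m))
b = rng.uniform(0, 2 * np.pi, size=m)
Phi = np.sqrt(2.0 / m) * np.cos(X @ W + b)

# Ridge regression in the random-feature space:
# solve (Phi^T Phi + n * lam * I) w = Phi^T y.
lam = 1e-3
w = np.linalg.solve(Phi.T @ Phi + n * lam * np.eye(m), Phi.T @ y)

# Predictions; new points reuse the same (W, b) feature map.
y_hat = Phi @ w
print("training MSE:", np.mean((y - y_hat) ** 2))
```

Solving the $m \times m$ linear system costs $O(nm^2 + m^3)$, so taking $m = O(\sqrt{n}\log n)$ instead of $m = O(n)$ is what yields the computational savings discussed in the abstract.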