Smoothness, Low Noise and Fast Rates

Part of Advances in Neural Information Processing Systems 23 (NIPS 2010)

Bibtex Metadata Paper Supplemental


Nathan Srebro, Karthik Sridharan, Ambuj Tewari


We establish an excess risk bound of O(H Rn^2 + sqrt{H L*} Rn) for ERM with an H-smooth loss function and a hypothesis class with Rademacher complexity Rn, where L* is the best risk achievable by the hypothesis class. For typical hypothesis classes where Rn = sqrt{R/n}, this translates to a learning rate of ̃ O(RH/n) in the separable (L* = 0) case and O(RH/n + sqrt{L* RH/n}) more generally. We also provide similar guarantees for online and stochastic convex optimization of a smooth non-negative objective.