Never Go Full Batch (in Stochastic Convex Optimization)

Amir, Idan; Carmon, Yair; Koren, Tomer; Livni, Roi

Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

Abstract

We study the generalization performance of full-batch optimization algorithms for stochastic convex optimization. These are first-order methods that access only the exact gradient of the empirical risk (rather than gradients with respect to individual data points); the class includes a wide range of algorithms such as gradient descent, mirror descent, and their regularized and/or accelerated variants. We provide a new separation result showing that, while algorithms such as stochastic gradient descent can generalize and optimize the population risk to within $\epsilon$ after $O(1/\epsilon^2)$ iterations, full-batch methods either need at least $\Omega(1/\epsilon^4)$ iterations or exhibit a dimension-dependent sample complexity.
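To make the distinction in oracle access concrete, the following is a minimal sketch, not the paper's construction. It assumes a toy scalar loss f(w; z) = 0.5 (w - z)^2, and the function names (`grad_loss`, `full_batch_gd`, `one_pass_sgd`), step sizes, and data distribution are all hypothetical choices made for illustration. On this benign instance both methods do well; the separation stated in the abstract concerns worst-case instances and is not reproduced here.

```python
import random


def grad_loss(w, z):
    # Gradient of the illustrative convex loss f(w; z) = 0.5 * (w - z)**2.
    return w - z


def full_batch_gd(samples, steps, lr):
    # Full-batch access: each step sees only the exact gradient of the
    # empirical risk, i.e. the average of all per-sample gradients.
    w = 0.0
    n = len(samples)
    for _ in range(steps):
        g = sum(grad_loss(w, z) for z in samples) / n
        w -= lr * g
    return w


def one_pass_sgd(samples, lr):
    # Stochastic access: each step uses the gradient at a single fresh
    # sample; the averaged iterate is returned.
    w = 0.0
    running_sum = 0.0
    for z in samples:
        w -= lr * grad_loss(w, z)
        running_sum += w
    return running_sum / len(samples)


random.seed(0)
# Assumed data distribution: Gaussian with mean 1, so the population
# risk is minimized at w = 1.
samples = [random.gauss(1.0, 1.0) for _ in range(1000)]

w_gd = full_batch_gd(samples, steps=200, lr=0.5)
w_sgd = one_pass_sgd(samples, lr=0.05)
print("full-batch GD:", w_gd)
print("one-pass SGD (averaged):", w_sgd)
```

In this sketch, `full_batch_gd` converges to the empirical risk minimizer (the sample mean), while `one_pass_sgd` touches each data point once. The only structural difference between the two is which gradient each step is allowed to query, which is the distinction the abstract draws.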
