Asynchronous stochastic convex optimization: the noise is in the noise and SGD don't care

Part of Advances in Neural Information Processing Systems 28 (NIPS 2015)

Authors

Sorathan Chaturapruek, John C. Duchi, Christopher Ré

Abstract

We show that asymptotically, completely asynchronous stochastic gradient procedures achieve optimal (even to constant factors) convergence rates for the solution of convex optimization problems under nearly the same conditions required for asymptotic optimality of standard stochastic gradient procedures. Roughly, the noise inherent to the stochastic approximation scheme dominates any noise from asynchrony. We also give empirical evidence of the strong performance of asynchronous, parallel stochastic optimization schemes, showing that the robustness inherent to stochastic approximation problems allows substantially faster parallel and asynchronous solution methods. In short, we show that for many stochastic approximation problems, as Freddie Mercury sings in Queen's "Bohemian Rhapsody," "Nothing really matters."
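
For intuition, here is a minimal, hypothetical sketch of the kind of lock-free asynchronous SGD the abstract refers to: several workers repeatedly sample stochastic gradients and write updates to a shared iterate with no synchronization. The toy least-squares objective, step-size schedule, and all names are illustrative assumptions, not taken from the paper; note that under CPython the GIL serializes bytecode, so the threads here interleave asynchronously rather than run truly in parallel.

```python
import threading

import numpy as np

# Hypothetical toy problem: least-squares regression on synthetic data.
d, n = 10, 5000
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(n, d))
w_true = data_rng.normal(size=d)
y = X @ w_true + 0.1 * data_rng.normal(size=n)

w = np.zeros(d)  # shared iterate, read and written with no locking
step0 = 0.5      # illustrative base step size for the 1/sqrt(t) schedule


def worker(seed, num_steps):
    rng = np.random.default_rng(seed)  # per-thread sampler
    for t in range(1, num_steps + 1):
        i = rng.integers(n)                  # sample one data point uniformly
        grad = (X[i] @ w - y[i]) * X[i]      # stochastic gradient at current w
        w[:] -= (step0 / np.sqrt(t)) * grad  # unsynchronized in-place update


threads = [threading.Thread(target=worker, args=(s, 20000)) for s in range(4)]
for th in threads:
    th.start()
for th in threads:
    th.join()

print("distance to w*:", np.linalg.norm(w - w_true))
```

Each worker reads a possibly stale copy of the shared iterate and overwrites coordinates that other workers may be writing concurrently; the paper's claim is that, asymptotically, this asynchrony noise is swamped by the sampling noise already present in SGD.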