Part of Advances in Neural Information Processing Systems 35 (NeurIPS 2022) Main Conference Track
Kumar Kshitij Patel, Lingxiao Wang, Blake E. Woodworth, Brian Bullins, Nati Srebro
We study the problem of distributed stochastic non-convex optimization with intermittent communication. We consider two settings: full participation, where $M$ machines work in parallel over $R$ communication rounds, and partial participation, where $M$ machines are sampled independently in every round from some meta-distribution over machines. We propose and analyze a new algorithm that improves upon existing methods by requiring fewer and lighter variance-reduction operations. We also present lower bounds, showing that our algorithm is either $\textit{optimal}$ or $\textit{almost optimal}$ in most settings. Numerical experiments demonstrate the superior performance of our algorithm.
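To make the intermittent-communication setting concrete, below is a minimal, self-contained sketch: $M$ machines each take $K$ local steps using a momentum-style variance-reduced gradient estimator (an MVR/STORM-type recursion), then synchronize by averaging once per round. The quadratic toy objective, all hyperparameters, and all function names are illustrative assumptions, not the paper's algorithm or experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (assumption): each machine m holds a quadratic
# f_m(x) = 0.5 * x^T A_m x - b_m^T x; gradients are observed with noise.
d, M, R, K = 10, 4, 20, 5            # dimension, machines, rounds, local steps
A = [np.diag(rng.uniform(0.5, 2.0, size=d)) for _ in range(M)]
b = [rng.normal(size=d) for _ in range(M)]
sigma, eta, beta = 0.1, 0.05, 0.8    # noise level, step size, momentum weight

def stoch_grad(m, x, xi):
    """Stochastic gradient of f_m at x; xi is the shared noise sample."""
    return A[m] @ x - b[m] + sigma * xi

x = np.zeros(d)                      # server iterate, synchronized every round
for r in range(R):
    local_iterates = []
    for m in range(M):               # machines run in parallel between rounds
        x_m = x.copy()
        v = stoch_grad(m, x_m, rng.normal(size=d))  # initial estimator
        for _ in range(K):
            x_prev, x_m = x_m, x_m - eta * v
            # Momentum-style variance reduction: evaluate the *same* sample xi
            # at the new and old iterates so the noise cancels in the
            # correction term (MVR/STORM-type recursion).
            xi = rng.normal(size=d)
            v = stoch_grad(m, x_m, xi) + (1 - beta) * (v - stoch_grad(m, x_prev, xi))
        local_iterates.append(x_m)
    x = np.mean(local_iterates, axis=0)  # one communication round: average

full_grad = np.mean([A[m] @ x - b[m] for m in range(M)], axis=0)
print("stationarity (full gradient norm):", np.linalg.norm(full_grad))
```

In the partial-participation setting, the inner loop would instead run over a fresh sample of $M$ machines drawn from the meta-distribution at the start of each round.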