Improved Algorithms for Convex-Concave Minimax Optimization

Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)

AuthorFeedback Bibtex MetaReview Paper Review Supplemental

Authors

Yuanhao Wang, Jian Li

Abstract

This paper studies minimax optimization problems $\min_\x \max_\y f(\x,\y)$, where $f(\x,\y)$ is $m_\x$-strongly convex with respect to $\x$, $m_\y$-strongly concave with respect to $\y$ and $(L_\x,L_{\x\y},L_\y)$-smooth. Zhang et al. \cite{zhang2019lower} provided the following lower bound of the gradient complexity for any first-order method: $\Omega\Bigl(\sqrt{\frac{L_\x}{m_\x}+\frac{L_{\x\y}^2}{m_\x m_\y}+\frac{L_\y}{m_\y}}\ln(1/\epsilon)\Bigr).$ This paper proposes a new algorithm and proved a gradient complexity bound of $\Tilde{O}\Bigl(\sqrt{\frac{L_\x}{m_\x}+\frac{L\cdot L_{\x\y}}{m_\x m_\y}+\frac{L_\y}{m_\y}}\ln\left(1/\epsilon\right)\Bigr),$ where $L=\max\{L_\x,L_{\x\y},L_\y\}$. This improves over the best known upper bound $\Tilde{O}\left(\sqrt{\nicefrac{L^2}{m_\x m_\y}} \ln^3\left(1/\epsilon\right)\right)$ by Lin et al. \cite{lin2020near}. Our bound achieves linear convergence rate and tighter dependency on condition numbers, especially when $L_{\x\y}\ll L$ (i.e., the weak interaction regime). Via simple reduction, our new bound also implies improved bounds for strongly convex-concave problems and convex-concave problems. When $f$ is quadratic, we can further improve the bound to $O\Bigl(\sqrt{\frac{L_\x}{m_\x}+\frac{L_{\x\y}^2}{m_\x m_\y}+\frac{L_\y}{m_\y}}\left(\frac{L^2}{m_\x m_\y}\right)^{o(1)}\ln(1/\epsilon)\Bigr)$, which matches the lower bound up to a sub-polynomial factor.