NeurIPS 2020

A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems

Meta Review

Originally, the paper received three positive scores (7, 7, 6), all with very high confidence. Essentially, no critical issues were raised by the reviewers, apart from suggestions to enhance the experiments and to clarify some points. During the discussion, all the reviewers agreed that the paper should be accepted, and Reviewer #3 raised his/her score to 7, so the reviewers reached a consensus. The AC therefore decided to accept the paper. However, after a quick reading, the AC also has the following comments, which we hope the authors will take into account when revising the paper.

1. The main concern is the convergence results (Theorems 3.3 and 3.6). The proposed algorithms, including Algorithms 2 and 3, are designed to solve Problem (3.2), which differs from Problem (1.1) and Problem (1.2). What is the gap between them? In particular, the results in Theorems 3.3 and 3.6 hold due to strong convexity, yet this important assumption is missing.

2. The complexity comparison in Table 1 is questionable. As stated in the paper, both the proposed algorithm and the algorithm in [24] have the same iteration complexity, O(1/\epsilon^4), for solving general nonconvex-concave problems. The authors should therefore report the results for a fair comparison.

3. For the same class of nonconvex-concave problems, both the proposed algorithm and the algorithm in [24] have identical convergence rates. What is the advantage of the proposed algorithm?

4. There is an important parameter p in the model (3.2). How should it be chosen for the theoretical results and the experiments?

5. The authors introduce an auxiliary variable z and a momentum acceleration step for its update. It is not clear what the contribution of the momentum acceleration is.

6. In Line 142, what is \beta? And what is the difference between \beta and \beta_t in Algorithms 2 and 3, as well as in Theorems 3.3 and 3.6?

7. The experimental analysis is not convincing. The authors should report experimental results for the algorithms in the related work [22, 23, 24], which are needed to verify the convergence results and the advantages of the proposed algorithm.

8. The authors should define all symbols, e.g., \beta_t in Algorithms 2 and 3.

9. There are several errors and typos. Line 22: "inof" should be "of". Line 87: "Section 5" should be "Section 2". Line 144: "see 3" should be "see Algorithm 3".