NeurIPS 2020

Towards Theoretically Understanding Why SGD Generalizes Better Than Adam in Deep Learning

Meta Review

Dear authors,

Thank you for your time and effort. The paper was well received by the reviewers. Discussion during the rebuttal phase raised the following issue: the theoretical analysis of SGD-M is still insufficient, and the reviewers would like to see clearer justification for SGD-M in the revised version.

Best,
AC