Sun, Dec 8th through Sat, Dec 14th, 2019, at the Vancouver Convention Center
The paper derives generalization bounds for overparametrized deep residual networks trained by gradient descent from random initialization. All reviewers appreciate the importance of the paper's topic. However, R1 and R3 feel that the contribution is too close to prior art, including a cited prior work and another NeurIPS submission. On the other hand, R2 thinks that the contributions relative to prior art are meaningful and vouches for acceptance. The rebuttal successfully articulates the differences: the cited prior work focuses on optimization with squared loss, while this submission focuses on generalization with cross-entropy loss. The Wide and Deep paper focuses on optimization and generalization for fully connected networks trained by SGD, while this paper focuses on residual networks trained with gradient descent. This AC sides with R2's assessment that there are enough differences relative to prior art to justify acceptance.