NeurIPS 2019
Sun Dec 8th through Sat Dec 14th, 2019, at the Vancouver Convention Center
Paper ID: 7140
Title: Convergence of Adversarial Training in Overparametrized Neural Networks

Reviewer 1


(1) This paper has a very clear presentation; the overall writing is quite good. (2) The convergence proof in this paper is general --- no particular attacker is specified. Overall, the reviewer feels this is an important and nice theoretical work on adversarial training that may shed light on future work on designing better adversarial training methods or understanding adversarial training. However, as these theoretical proofs are beyond the reviewer's domain knowledge, the reviewer is unable to verify the correctness of the mathematical proofs.

Reviewer 2


EDIT: I have read the author feedback, and the authors have agreed to revise the writing. This is clearly a good paper that should be accepted. Two more comments regarding the rebuttal: (1) My original comments apply to natural training as well, and I understand this is a very challenging topic. (2) I understand one can reduce the dependence on H, but my concern was that more parameters are probably needed to more precisely characterize the landscape (for both natural and adversarial training). For example, one such parameter is width.

--------------------

This is an original paper that develops convergence results (of the training error) for adversarial training of extremely wide networks. As far as I know, these are also the first such results for adversarial training. The high-level strategy of the proof follows the recent paradigm of analyzing extremely wide networks for natural training (more precisely, through the NTK), where the behavior of the network becomes almost linear and is thus amenable to precise analysis (a small numerical illustration of this near-linearity appears at the end of this review). This is a good paper, I like it, and I vote to accept.

That being said, the current convergence theory is quite unsatisfying (I would say the same of the convergence theory we currently have for natural training). Among many things, I want to highlight two that are particularly unsatisfying to me:

1. Depth *only* hurts convergence in this theory, and that is unavoidable. Let's start by observing that the current dependence on H is 2^{O(H)} (this is despite the fact that the required width depends linearly on the number of training points, which is a separate matter), and moreover, as H increases, we strictly need (exponentially) wider networks. This is very counter-intuitive, because empirically, theoretically (e.g., from research on circuit complexity), and intuitively, H should at least sometimes help, rather than only hurt, convergence. Moreover, such a "monotonic dependence on H" is unavoidable, because H is the *only* architectural parameter the theory uses to characterize convergence, and it is impossible to get rid of this monotonic dependence without injecting more parameters. For example, one could imagine the following theory, which measures one more parameter as a complexity measure: suppose we try to interpolate and recover a certain hidden function f from data, and, given depth H, the optimal neural network representing f has width w_H(f). One could then hope that for some good f, the width required for overparameterized networks to converge is a function of w_H(f) instead of H. Note that w_H(f) could potentially be much smaller than H. For example, if we take f = XOR, then once H is large enough, w_H(f) = O(1), independent of H (a width-2 construction is sketched just below); so even in a theory where the overparameterized network must have width 2^{w_H(XOR)}, this is a constant for large H (2^{w_H(f)} = 2^{O(1)} = O(1)), and thus gives a huge improvement. There are many things that could be explored in such an extended theory, and I believe it would lead to a much more reasonable one.
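To make the XOR claim concrete, here is a minimal numpy sketch (my own illustration and naming, not anything from the paper): a single hidden layer of width 2 computes XOR exactly on binary inputs, and every additional layer can pass the (nonnegative) output through a single ReLU, so the width stays constant however large H gets.

import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def xor_net(x1, x2, H=50):
    # First hidden layer (width 2): for x1, x2 in {0, 1},
    # XOR(x1, x2) = relu(x1 + x2) - 2 * relu(x1 + x2 - 1).
    h = relu(np.array([x1 + x2, x1 + x2 - 1.0]))
    y = h[0] - 2.0 * h[1]
    # Remaining H - 1 layers (width 1): y >= 0, so each ReLU acts as the
    # identity; depth grows arbitrarily with no increase in width.
    for _ in range(H - 1):
        y = relu(y)
    return y

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), "->", xor_net(x1, x2))  # prints 0.0, 1.0, 1.0, 0.0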
2. From this theory, what are the "testable hypotheses" for guiding empirical work on neural networks? I think a downside of the current "foundations of deep learning" research is that it is more like "explanation for explanation's sake" and lacks "predictions". I believe a good theory should be able to predict something useful and guide practice (which is critical and common in, say, a physics theory). Yet I am not sure what the current theory can say here (maybe there is something, but it is not written in the paper).
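As promised above, a hedged numerical illustration of the near-linearity underlying the NTK paradigm (again my own sketch and notation, not the paper's): for a two-layer ReLU network f(W) = a^T relu(W x) / sqrt(m) with the outer layer a frozen, take a unit-Frobenius-norm weight step along the gradient. The first-order change in the output stays O(1), while the linearization error shrinks as the width m grows.

import numpy as np

def f(W, a, x):
    # Two-layer ReLU network with NTK-style 1/sqrt(m) output scaling.
    return a @ np.maximum(W @ x, 0.0) / np.sqrt(len(a))

rng = np.random.default_rng(0)
x = rng.normal(size=5); x /= np.linalg.norm(x)
for m in [100, 1_000, 10_000, 100_000]:
    W0 = rng.normal(size=(m, 5))
    a = rng.choice([-1.0, 1.0], size=m)        # outer layer frozen
    # Gradient of f w.r.t. W at W0: row i is a_i * 1{w_i . x > 0} * x / sqrt(m).
    G = (a * (W0 @ x > 0.0))[:, None] * x[None, :] / np.sqrt(m)
    D = G / np.linalg.norm(G)                  # unit-norm step along the gradient
    lin = np.sum(G * D)                        # first-order predicted change, O(1)
    err = f(W0 + D, a, x) - f(W0, a, x) - lin  # linearization error, roughly 1/sqrt(m)
    print(f"m={m:>6}: predicted change {lin:.3f}, linearization error {abs(err):.1e}")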

Reviewer 3


EDIT: I have read the author feedback, and the authors have explicitly discussed the (minor) issues I raised. They have agreed to clarify the wording of heuristic vs. exact inner maximization, and to be more forthright about the success of adversarial training depending on the expressivity of the neural tangent kernel. They have even gone beyond what I expected and discussed removing the projection step for two-layer networks. As such, I have increased my score from a 7 to an 8. I believe this is a strong paper.

--------------

The paper is, to the best of my knowledge, wholly original. It draws on many important theoretical tools developed over the last year (overparameterized neural networks, neural tangent kernels, adversarial robustness, etc.) and ties them together in an important and interesting way. Moreover, this is the first theoretical result (to my knowledge) that supports the idea of using adversarial training to find robust neural networks, which amplifies its originality.

The paper is very well-written. The authors do a good job of detailing their contributions, connecting them to related work, and giving proof sketches. The theorems seem technically sound, and the authors are generally honest about the drawbacks of their theory. One issue I have here is that the authors purport, multiple times, to show that neural networks can find robust classifiers. However, this holds only under Assumption 5.1, and due to the complexity of the neural tangent kernel, it is not clear whether this assumption is likely to hold. I would argue that the authors should be forthright about this in the introduction, in the interest of transparency.

The clarity of the work is generally very high. The paper is well-organized, as are the proofs. The paper should also be commended for taking relatively modern theory on the convergence of gradient descent for neural networks and managing to simplify and expand upon it in a way that is eminently comprehensible. My one issue with clarity is that some of the contributions could be expressed more clearly. For example, the second contribution (end of page 1, beginning of page 2) refers to a "data-dependent RF norm", which is not explained until much later. Another somewhat confusing part is the abstract, which differentiates "exact" and "heuristic" solving of the inner maximization of adversarial training (a sketch of the standard heuristic follows just below). This distinction could be omitted for clarity, as is done in the first bullet-point contribution, which I personally think would benefit the paper. I do not think this is necessary by any means; it is simply a suggestion I wanted to make.
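To make the exact-vs-heuristic distinction concrete, here is a minimal sketch of adversarial training with a heuristic inner maximization (a few projected-gradient-ascent steps, in the spirit of PGD). The toy linear model, function names, and hyperparameters below are my own illustration, not the paper's setup.

import numpy as np

def loss_and_grads(w, x, y):
    # Logistic loss for f(x) = w . x with label y in {-1, +1};
    # returns (loss, gradient w.r.t. w, gradient w.r.t. x).
    z = y * (w @ x)
    loss = np.logaddexp(0.0, -z)     # log(1 + e^{-z}), numerically stable
    g = -(1.0 - np.exp(-loss)) * y   # d loss / d (w . x)
    return loss, g * x, g * w

def pgd_attack(w, x, y, eps=0.1, alpha=0.05, steps=5):
    # Heuristic inner maximization: gradient ascent on the loss in x,
    # projected onto the l_inf ball of radius eps (the "perturbation set").
    delta = np.zeros_like(x)
    for _ in range(steps):
        _, _, gx = loss_and_grads(w, x + delta, y)
        delta = np.clip(delta + alpha * np.sign(gx), -eps, eps)
    return x + delta

# Outer minimization: plain SGD on the adversarially perturbed examples.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Y = np.sign(X @ rng.normal(size=5))
w = np.zeros(5)
for _ in range(500):
    i = rng.integers(len(X))
    _, gw, _ = loss_and_grads(w, pgd_attack(w, X[i], Y[i]), Y[i])
    w -= 0.1 * gw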
The paper's significance could be subject to some debate. It is not directly useful at a practical level, except in the sense that it gives some evidence for the efficacy of adversarial training. Moreover, even the formal theorems require assumptions that are not met in practice, such as exponentially wide networks. However, I would argue that the paper is still significant. First, quantifying even the training error, much less the robust training error, of a neural network is difficult, and this paper makes progress on that. Similarly, the paper implicitly develops techniques that can aid other theoretical analyses of neural networks. The work could also help other adversarial training papers, as it expands the traditional scope of adversarial training (to so-called perturbation sets and functions) and shows that these more general adversarial training methods can still be analyzed. The paper further expands upon recent and exciting work on neural tangent kernels. Finally, it makes simple but important connections between VC dimension and adversarial robustness, which again can help inspire more theory. The above reads like a laundry list of theoretical tools developed for understanding machine learning, and I believe the paper is significant for tying them all together.