Sun, Dec 8th – Sat, Dec 14th, 2019, Vancouver Convention Center
This paper investigates how regularization helps in training neural networks, in contrast to the unregularized neural tangent kernel (NTK) method. It shows that the regularized model captures the "informative signal" while the NTK model does not, which highlights the effectiveness of regularization. Moreover, the paper establishes polynomial-time convergence of the gradient flow corresponding to the infinite-width neural network. The contribution is novel, and its implications are quite instructive for neural tangent kernel learning. In particular, the lower bound for kernel learning is a novel contribution.