Convex Deep Learning via Normalized Kernels

Part of Advances in Neural Information Processing Systems 27 (NIPS 2014)


Authors

Özlem Aslan, Xinhua Zhang, Dale Schuurmans

Abstract

Deep learning has been a long-standing pursuit in machine learning that, until the recent discovery of improved heuristics for training embedded layers, was hampered by unreliable training methods. A complementary research strategy is to develop alternative modeling architectures that admit efficient training methods while expanding the range of representable structures toward deep models. In this paper, we develop a new architecture for nested nonlinearities that allows arbitrarily deep compositions to be trained to global optimality. The approach admits both parametric and nonparametric forms through the use of normalized kernels to represent each latent layer. The outcome is a fully convex formulation that is able to capture compositions of trainable nonlinear layers to arbitrary depth.
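The normalized kernels mentioned in the abstract follow the standard kernel-methods notion of a Gram matrix rescaled to have a unit diagonal. The sketch below is a minimal numpy illustration of that generic operation only, not the paper's convex layer formulation; the function name `normalize_kernel` and the choice of a linear base kernel are assumptions made for illustration.

```python
import numpy as np

def normalize_kernel(K):
    # Rescale a Gram matrix so every diagonal entry equals 1:
    # K_norm[i, j] = K[i, j] / sqrt(K[i, i] * K[j, j])
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

# Illustrative linear kernel on random data; after normalization the
# entries are cosine similarities with ones on the diagonal.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))
K = X @ X.T
K_norm = normalize_kernel(K)

assert np.allclose(np.diag(K_norm), 1.0)
assert np.all(np.abs(K_norm) <= 1.0 + 1e-12)  # normalized entries lie in [-1, 1]
```

Normalization of this kind keeps the kernel positive semidefinite while constraining its scale, which is the property that makes a kernel a convenient convex surrogate for a latent layer representation.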