Review for NeurIPS paper: Relative gradient optimization of the Jacobian term in unsupervised deep learning

NeurIPS 2020

Meta Review

The focus of the work is deep density estimation (also called normalizing flows). In particular, the authors consider the generative model x=f(s) defined in (1), where the observation x is an invertible nonlinear function f of a latent variable s. They take a maximum-likelihood perspective (2), in which g_{\theta}, the approximation of the inverse of f, is the composition of invertible and differentiable component functions g_1=\sigma_1(W_1 \cdot), ..., g_L=\sigma_L(W_L \cdot). They propose to optimize \theta with the relative gradient method in order to speed up computation. Deep density estimation is an important problem in machine learning. While the relative gradient approach is already widely applied, for instance in the independent component analysis literature, the reviewers agreed that its adaptation to the deep density estimation task is interesting and potentially of practical value.
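To make the computational argument concrete, here is a minimal sketch of the relative gradient for a single linear layer z = W x. It is an illustration under stated assumptions, not the authors' implementation: the nonlinearity \sigma is dropped, the base density is assumed to be standard normal, and the function names `ordinary_gradient` and `relative_gradient` are hypothetical. The per-sample objective is log p(Wx) + log|det W|. Its ordinary gradient contains the term W^{-T}, which requires a matrix inversion, whereas right-multiplying the gradient by W^T W turns that term into W itself.

```python
import numpy as np

def ordinary_gradient(W, x):
    """Gradient of log p(Wx) + log|det W| w.r.t. W (assumed standard normal p)."""
    z = W @ x
    delta = -z  # d log p(z) / dz for a standard normal base density
    # The Jacobian term contributes W^{-T}, which needs a matrix inverse.
    return np.outer(delta, x) + np.linalg.inv(W).T

def relative_gradient(W, x):
    """Ordinary gradient right-multiplied by W^T W, computed without an inverse."""
    z = W @ x
    delta = -z
    # (delta x^T + W^{-T}) W^T W = delta (z^T W) + W.
    # z @ W is a vector-matrix product, so no matrix-matrix product is needed.
    return np.outer(delta, z @ W) + W

rng = np.random.default_rng(0)
D = 4
W = np.eye(D) + 0.1 * rng.standard_normal((D, D))  # well-conditioned toy weights
x = rng.standard_normal(D)

# The two computations agree up to floating-point error.
print(np.allclose(ordinary_gradient(W, x) @ W.T @ W, relative_gradient(W, x)))
```

In this toy setting the relative gradient uses only vector-matrix and outer products, which is the kind of saving the meta review refers to; a multi-layer version would replace `delta` with the backpropagated error at each layer.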