Sun Dec 8 through Sat Dec 14, 2019, at the Vancouver Convention Center
The reviewers agree that the paper contains some interesting technical details, but they raise concerns regarding the novelty of the technique and the absence of end-to-end feature learning in some of the experiments. That said, the introduction of the lambda term to regularize the Hessian, though straightforward, is likely to lead to more stable meta-learning and non-trivial performance gains. Accordingly, I recommend accepting the paper as a poster.

For the record, I asked for additional unofficial feedback from another expert in the field, and they provided the following comments: "I would give it a weak accept. The main contribution seems to be a gradient update w.r.t. the hyperparameters. Although it is very similar to related work, I still think it is novel, and the addition of the lambda term to regularize the Hessian can potentially be very useful in practice, since it is common to see wild oscillations in this update (without the lambda term). My only criticism is that the math on Page 4 is a bit hand-wavy. In several places, the authors make an approximation and then treat it as the true value. For example, the gradients in Eq. (9) are not the true gradients of x^*, but of its quadratic approximation (assumed in Eq. (5)), yet they are presented as the true quantity. If I were reviewing the paper, I would ask the authors to introduce new notation to account for this approximation, or to include error bounds on these quantities. I agree with R1 that the paper does not do justice to the literature on hyperparameter optimization. One of the most comprehensive references, in my opinion, is not mentioned: https://arxiv.org/pdf/1806.04910.pdf"
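For readers unfamiliar with the damped implicit hypergradient the comment refers to, the following is a minimal sketch on a toy quadratic inner problem. Everything here (the inner objective, the function name `hypergradient`, and the parameter names `alpha` and `lam`) is an illustrative assumption, not the paper's actual setup; it only shows why adding a lambda term to the Hessian solve stabilizes the update.

```python
import numpy as np

def hypergradient(w_star, theta, target, alpha, lam):
    """Hypergradient of an outer loss w.r.t. theta via implicit
    differentiation, using a lambda-damped Hessian solve.

    Toy inner problem (illustrative only):
        f(w, theta) = 0.5*||w - theta||^2 + 0.5*alpha*||w||^2,
    whose minimizer is w* = theta / (1 + alpha).
    Outer loss: L(w) = 0.5*||w - target||^2.
    """
    n = w_star.size
    H = (1.0 + alpha) * np.eye(n)   # inner Hessian d^2 f / dw^2
    B = -np.eye(n)                  # mixed term d^2 f / (dw dtheta)
    grad_L = w_star - target        # outer gradient dL/dw at w*
    # Damped solve: (H + lam*I) v = dL/dw.  The lam term keeps the
    # linear system well conditioned when H is near-singular, which is
    # what tames the "wild oscillations"; lam = 0 recovers the exact
    # implicit gradient dL/dtheta = -B^T H^{-1} dL/dw.
    v = np.linalg.solve(H + lam * np.eye(n), grad_L)
    return -B.T @ v

theta = np.array([2.0, -1.0])
target = np.array([0.5, 0.5])
alpha, lam = 0.5, 0.1
w_star = theta / (1.0 + alpha)      # exact inner minimizer

g_exact = hypergradient(w_star, theta, target, alpha, lam=0.0)
g_damped = hypergradient(w_star, theta, target, alpha, lam=lam)
```

On this quadratic the damped gradient is a shrunken version of the exact one, `(w* - target) / (1 + alpha + lam)` versus `(w* - target) / (1 + alpha)`, so the damping trades a small bias for a smaller, better-conditioned update step.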