NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID:7485
Title:Globally Optimal Learning for Structured Elliptical Losses

Reviewer 1

This paper investigates structured regression tasks with generalized loss functions. It places a number of existing loss functions (e.g., quadratic, Huber) as special cases within the class of elliptical losses. The problem and formulation are well motivated and clearly described. The approach is demonstrated on synthetic data to show that robust estimation becomes more beneficial as the data-generating distribution becomes heavier-tailed. Only one comparison loss (Tyler) is shown in the interest of readability, but it would be nice to show that at least one of the other robust losses (e.g., Huber) also has a benefit. In addition to (or in lieu of) the additional comparison, the connection with line 138 might be emphasized in the experiment section. The other experiments (stock market and river discharge) are more compelling and demonstrate the advantages in diverse application domains. Overall, I think this is a solid paper with no obvious weaknesses.

Reviewer 2

[originality] I believe there is clear novelty in the proof of global optimality of all stationary points for some important elliptical losses and linear structured models, although part of the proof (e.g. Lemma 1) builds incrementally on previous work.

[clarity] While the problem is well motivated and clearly formulated, some parts of the paper are not clear.
1. Based on the related work, it is hard to contextualize this work in the existing literature. Readers may have the following important questions:
* How does the existing literature on unstructured elliptical losses connect to your work?
* What is the main difference between the theoretical characteristics discovered for unstructured elliptical losses and those for structured ones?
2. The paper does not explain how the proved global optimality leads to more efficient optimization, even though this is claimed as one of the main contributions. A clear explanation is needed.

[significance]
1. This work is significant in that it provides an optimality proof that leads to a more efficient optimization method for a wide range of robust elliptical losses, including Gaussian, Generalized Gaussian, Huber, etc. The authors still need to clarify how the optimality result for critical points enables more efficient optimization.
2. However, the authors did not compare their efficient optimization algorithm with a less efficient one under the same robust loss function. Such a comparison would provide empirical justification for the more efficient optimization method and for the practical impact of the optimality proof.

----------------------------------------------------------------------------------------------

Overall, I think this is an interesting paper. The authors did contextualize the significance of this work relative to the related work in their response, and I am changing the score.

Reviewer 3

--- after author response: ---

Thank you to the authors for all the valuable clarifications; it seems like they would strengthen the manuscript. I spotted a minor typo in the appendix: on the line "where the equality to 0 is true because [...]", the first \tilde{v} should be bold.

--- original review: ---

The paper proposes non-Gaussian MRF likelihood learning for robust regression. A global convergence result is derived. Overall this is a convincing paper with a rather nice result that seems to work well in practice.

Originality: 5/5
The paper extends results from the literature on spherically-invariant random vectors [1 in paper] to more general losses / likelihoods. The application to obtaining a global convergence result for robust elliptical MRFs seems original.

Quality: 4/5
The paper provides a theoretical result supporting the practical use of the proposed non-convex optimization, a description of an algorithm to solve it, and convincing evaluations showing good performance of robust learning, especially with few samples. Deducting one point because I think a derivation of the MM algorithm used should be provided, rather than simply giving the form of the algorithm.

Clarity: 4/5
The overall presentation is very well written and easy to follow. I did not manage to understand some of the technical details surrounding Lemma 1 and its proof, and will ask for clarifications below.

Significance: 5/5
Robust estimation seems like a likely improvement over Gaussian MRFs in many cases, and this paper provides everything needed to perform it effectively.