Title: Understanding the Role of Momentum in Stochastic Gradient Methods

The reviewers agree that the topic tackled in the paper is interesting and that the mathematical results are promising. Overall, this submission is a good attempt at deriving a mathematical understanding of QHM, but the results are often only partially investigated and discussed. For instance, in Section 3 the main result (i.e., the convergence rate for quadratics) is hard to parse and comes with too little commentary, so its practical value is unclear. The paper also makes a number of conjectures that are not backed up, and the authors are therefore advised to tone down their claims. This includes the statement "we conjecture that the optimal convergence rate is a monotonically decreasing function of nu" as well as the claims about the quality of the approximation in Section 4. In conclusion, all three reviewers liked the paper but also highlighted some shortcomings, which justifies acceptance as a poster but not as an oral.