Sun Dec 8 through Sat Dec 14, 2019, at the Vancouver Convention Center
Essentially, the paper states an algorithm on page 3, and the rest of the paper explains the "proof strategy": the authors walk through the various results used to build the main convergence-rate theorem. I wonder how many NeurIPS readers will get anything out of pages 3-7, which describe results from geometry. While the result may be of theoretical significance, the algorithm requires knowledge of the exponential map, which is usually a very complicated operator. The main non-trivial manifolds for which the exponential map is actually known in closed form are the Grassmannian and Stiefel manifolds, so I wonder whether the authors should have focused on just these two manifolds; this would considerably simplify the paper, and by specialising to them the results could probably be improved. I personally see little point in describing the more general results (unless the paper is aimed at a more mathematical audience, or they are moved to the appendices), unless the authors can generalise their analysis to more general retractions, which would improve the applicability of the results.
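The reviewer's point about the exponential map can be made concrete. Even on the Stiefel manifold, one of the few cases where a closed form is known (the Edelman-Arias-Smith geodesic formula for the canonical metric), evaluating it already requires a matrix exponential. A minimal numpy/scipy sketch, purely illustrative and not taken from the paper under review:

```python
import numpy as np
from scipy.linalg import expm

def stiefel_exp(X, H):
    """Exponential map on the Stiefel manifold St(n, p) (canonical metric).

    X: n x p matrix with orthonormal columns (a point on St(n, p)).
    H: n x p tangent vector at X, i.e. X.T @ H is skew-symmetric.
    Returns the geodesic endpoint exp_X(H), again with orthonormal columns.
    """
    n, p = X.shape
    A = X.T @ H                       # p x p skew-symmetric part
    K = H - X @ A                     # component normal to X
    Q, R = np.linalg.qr(K)            # thin QR: Q is n x p, R is p x p
    # Skew-symmetric block matrix whose matrix exponential drives the geodesic.
    block = np.block([[A, -R.T],
                      [R, np.zeros((p, p))]])
    M = expm(block)                   # orthogonal, since block is skew-symmetric
    return np.hstack([X, Q]) @ M[:, :p]
```

Even this "known" case costs an expm of a 2p x 2p matrix per step, which supports the reviewer's suggestion that extending the analysis to cheaper retractions would broaden the paper's applicability.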
In this paper the authors propose an algorithm for optimization over manifolds that converges to second-order stationary points at a rate of 1/eps^2.

The presentation of the paper in general, and of the results in particular, is quite clear. I particularly liked the introduction (despite some typos) and the review of existing work and results. The proofs seem correct, although I did not follow the supplementary material carefully.

I appreciate the presence of examples, but the analysis there is somewhat poor. What about the complexity of each manifold? How does this affect the constants in your results? What about the rate of convergence?

In the Reproducibility Checklist answers, the authors say that the code is available for downloading. However, I couldn't find it in the paper.

Minor comments: there are some typos and odd sentences, especially in the introduction. For instance:
- "Riemnnian manifolds, but not clear about the complexity..."
- escapes (an approximate) saddle pointS
Also, footnote 2 does not specify that \lambda_min is the smallest eigenvalue of the Hessian. I think the enumeration on page 5 (after Lemma 1) is not necessary.
The authors analyse gradient methods on Riemannian manifolds and prove that their suggested algorithm escapes saddle points at a specific rate. The work is technical but clearly written. The quality is good and the work is significant. Only limited experiments were conducted.