NeurIPS 2020

Random Reshuffling is Not Always Better

Meta Review

This paper received four reviews. Three of them consider the contributions of the paper to be of significant interest to the ML community, while the fourth reviewer thinks the paper does not reach a meaningful conclusion concerning the comparison between with- and without-replacement sampling in SGD. I agree with the majority of the reviewers about the significance of the concrete counterexamples to the two conjectured matrix arithmetic-geometric mean inequalities, and thus recommend acceptance of the paper.

On the other hand, I agree with Reviewer #3 that the paper does not draw a concrete conclusion comparing with- and without-replacement sampling in SGD, and that the paper's claim that "it is generally believed that ... random reshuffling causes learning algorithms to converge faster" is not precise. Faster convergence under reshuffling is a phenomenon that people often observe, but I doubt that many people believe random reshuffling is "always better". The authors' effort in constructing counterexamples to disprove this "general belief" is appreciated, but many readers like Reviewer #3 (including myself) have found that neither scheme is always better in practical SGD methods, and may consider the counterexamples on SGD in Sections 3-5 artificial and perhaps unnecessary.

I view these two opinions as not necessarily contradicting each other. The conjectured matrix arithmetic-geometric mean inequalities are abstractions of practical SGD variants on special loss functions. Constructing concrete counterexamples to these mathematical conjectures is of major theoretical interest, but extending such constructions to demonstrate consequences for SGD in practice may involve other factors and be less convincing. I highly recommend that the authors clarify this distinction in the revision.
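For readers outside this subfield, the two sampling schemes under discussion can be sketched as follows. This is a minimal toy illustration on a one-dimensional least-squares problem, written for this review; it is not code from, and does not reproduce any experiment in, the paper under review.

```python
import numpy as np

# Toy illustration (not from the paper): SGD with replacement vs. random
# reshuffling (without replacement) on f(x) = mean_i (a_i * x - b_i)^2.
rng = np.random.default_rng(0)
n, lr, epochs = 8, 0.05, 50
a = rng.standard_normal(n)
b = rng.standard_normal(n)

def sgd(sampler):
    """Run constant-step SGD; `sampler` yields the index order for one epoch."""
    x = 0.0
    for _ in range(epochs):
        for i in sampler():
            x -= lr * 2 * a[i] * (a[i] * x - b[i])  # gradient of (a_i x - b_i)^2
    return x

# With replacement: n independent uniform draws per epoch (indices may repeat).
with_replacement = sgd(lambda: rng.integers(0, n, size=n))
# Random reshuffling: a fresh permutation per epoch (each index used once).
reshuffled = sgd(lambda: rng.permutation(n))

x_star = np.dot(a, b) / np.dot(a, a)  # exact least-squares minimizer
print(abs(with_replacement - x_star), abs(reshuffled - x_star))
```

On any single toy instance like this, either scheme may land closer to the minimizer; which one wins is exactly the question the paper's counterexamples address, and it is not settled by one run.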