Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
The paper proposes to automate the tuning of learning rate schedules in stochastic gradient methods, which is an important problem. To this end, the authors propose a statistical test to determine when to decay the learning rate. The test builds upon prior work with simple albeit useful extensions. The resulting statistical test is simple and can be deployed easily. There are some concerns regarding the mismatch between the theoretical assumptions made and the setup in practice. Nevertheless, the empirical learning rate schedule — decaying whenever the test fires — seems to be nearly competitive with hand-tuned schedules. Thus, I am recommending acceptance to NeurIPS.

For the camera-ready version, we would like the authors to re-evaluate their experimental results. Please make sure the standard train-test splits are used. Kindly repeat the experiments several times to report variability and error bars. Also ensure the comparisons are in the correct ballpark of the known performance of the models (e.g., the WikiText-2 perplexity seems too high for the model used, and the CIFAR-10 accuracy of ResNet-18-v1 from the cited paper seems too good).