NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID: 277
Title: Batched Multi-armed Bandits Problem

The paper contributes significant results that quantify the impact, and the optimal use, of partial information due to sub-sampling in stochastic multi-armed bandits, an important class of online learning problems. In a sense, it extends partial information from the "space" domain (bandit arm feedback) to the "time" domain, where a feedback sample cannot be collected in every round. The submission was appreciated by all reviewers, which was also reflected in the post-response discussion among them. [This meta-review was reviewed and revised by the Program Chairs]