Reviews: Efficient Pure Exploration in Adaptive Round model

------------------------------------------ After author feedback: Thanks to the authors for their response. This is a paper with strong theoretical results which are well supported by the experiments. ------------------------------------------- Original review: Summary: The object of study is the round complexity of (epsilon, delta)-PAC best-k-arms identification. In that particular instance of the best-k-arms bandit problem, an algorithm successively samples arms, with the goal of finally returning a set of k arms, which with probability bigger than 1-delta verifies that the mean or the j^th arm of the set is within epsilon of the j^th highest mean. The focus is on algorithms which have a small number of rounds at which they make decisions: at each such round, the algorithm queries a number of samples, and no adaptive decision is made between rounds. An algorithm is good if the total number of samples it requires is small (sample complexity) and if the number of rounds is small (round complexity). New algorithms and complexity bounds are provided for two settings: either the algorithm is free to choose the number of rounds, or that number is set in advance. An adaptation to exact top-k arm identification (epsilon=0) is also presented. Strengths: - Strong theoretical guarantees: the algorithms have both close to optimal sample complexities and round complexities. These algorithms are the first ones with such round complexities which do not require additional information on the problem. - Convincing experimental evaluation. The new algorithms improve over existing ones by order of magnitude with respect to the round complexity, while having sample complexity of comparable order to the best ones. The results on the exact top-k problem are particularly striking. - Clear presentation of the algorithms, in particular on page 4. Weaknesses: - the description and comparison of the different existing top-k settings could be clearer. For example, some papers require the means of all the k arms to be above the k^th highest mean, while this paper compares each of the k arms to different reference arms. This is mentioned in the related work section at the end of the paper, but the clarity would be enhanced if it was explained in the introduction. - The references to existing results could be more precise. For example the lower bound mentioned line 168 could be cited more precisely. Instead of referring to papers [16] and [14] without more precisions, the authors could write for example [16, Theorem X]. Lines 100-102, a notion of "near-optimality" coming from a previous paper is also referred to in a vague way. Remarks: - I think that the notation log^*(a) is defined, but not log^*_b(a). It took me a moment to realize that this was the logarithm in base b. - Instead of calling the problems 1, 2 and 3, it would be easier for the reader to have them be called by names. Writing "exact-top-k" is not much longer than "problem 3" and is much more descriptive.

Reviewer 2

- The existing algorithms need \log^\star(n) rounds. Given the slow growth of $\log^\star(n)$, trying to further reduce this number does not seem that interesting in practice. This number is smaller than 5 for $n \le 2^{65536}$. - While there might exist differences in the setting, the authors of [4] claim that minimal round complexity scales at least as $\Omega( \log^\star(n))$, so that the authors should clarify how their setting differs from [4], as otherwise their results would simply prove that [4] contains a mistake. - As a minor remark, it could be good for clarity to explicity define the $\log^\star(n)$ function for an arbitrary base (other than 2). - Also, the authors mention that the algorithm proposed in [4] requires the knowledge of a number $\Delta$, such that $\Delta_k > \Delta$. Therefore, a straightforward approch to extend it to the present setting (where $\Delta$ is unknown) would be to apply the algorithm of [4] for \Delta = 1, then check that $\Delta_k > \Delta$ by sampling. It is noted that one can always estimate $\Delta_k$ by sampling all arms enough time (and this takes one "round") If the test fails, do the same for $\Delta = \gamma$, $\Delta = \gamma^2$ (where $\gamma < 1$) and so on and so forth until the test succeeds. Would such an approach fail and if not why ?

Reviewer 3

This paper focused mainly on the PAC top-k-subset selection problem in an adaptive round model, which means the authors are not only interested in the sample complexity of the proposed algorithm, but also try to achieve minimal number of rounds used. The motivation of this setting has been stated and the main text is globally well written. The algorithms are elimination-based that try to kill as much arms as they can in each round to reduce the round complexity. It appears that the query complexity in the fixed-confidence setting stays optimal and shows a good empirical performance as well. I have a major technique concern, however, regarding the proof of Lemma 1. In equation 7, you are using a union bound over all iterations r. But the Hoeffding bound only holds for iteration r if i^* has been updated at this iteration. What if it is not updated at this iteration? \hat{\theta_i^*} >= \theta_i^* - \epsilon/8 holds for iteration r only when it holds for iteration r-1, but that cannot be true from the very beginning. Maybe I am missing something, but I really need more clarification on this part, which seems to be a crucial lemma in this paper. Otherwise, regarding the experiments, have you ever thought of comparing \deltaE to some other fixed confidence BAI algorithms and \deltaER to some fixed budget BAI algorithms. Of course, it is not comparable straightforwardly, but have you thought of proposing some naive adaptation of those algorithms to your setting? Minor: - I would suggest the authors to make the motivation more clear as you only stated the use of muliple testing for several applications but didn't really explain why we need to do so in the real world - The Exponential-Gap-Elimination alogorithm could have been detailed in the paper (or appendix), otherwise the paper is not self-contained Update after author feedback: I am convinced by the author's feedback regarding the proof, and thus increased the score. In summary, this paper has some strong theoretical results that worth being published in NeurIPS.

Paper ID:	3579
Title:	Efficient Pure Exploration in Adaptive Round model

Reviewer 1

Reviewer 2

Reviewer 3