NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID: 6304 Bayesian Optimization with Unknown Search Space

### Reviewer 1

Applying Bayesian optimization to expensive black-box problems needs to specify the bound of search space. However, when tackling a completely new problem, there is no prior knowledge to guarantee that the specified search space contains the global optimum. The paper proposes an approach to deal with this situation. In the approach, the user first specifies an initial search space; then the bound of search space automatically expands as the iteration proceeds; finally the algorithm will return a solution achieving \epsilon-accuracy. The key is how to expand the search space. The approach uses upper confidence bound (UCB) as acquisition function, and selects the expanded search space containing at least one point whose acquisition function value is within \epsilon from the acquisition function maximum. Also because the gap between the best selected point and the global optimum of the current search space can be bounded, the approach can return a solution achieving \epsilon-accuracy. I have checked all the proofs, and believe most of them are correct. However, I also found some places which are unclear and may be incorrect. Questions: 1. In the proof of Theorem 4.1, \epsilon is set less than \| \sqrt(\beta_t)\theta -\min_{x\in R^d}a_{UCB}(x;D_{t-1})\|. What if \| \sqrt(\beta_t)\theta -\min_{x\in R^d}a_{UCB}(x;D_{t-1})\|=0? 2. \gamma is the minimum value among a set of numbers above eq(14), but is a specific value in eq(14). How do you determine its value in eq(14)? 3. Why do the second and the third inequalities hold in eq(15)? 4. How can eq(15) hold with probability at least 1-\delta? max_{x\in D_t}a_{LCB} is used to replace max_{x\in D_t}f(x) in the second inequality. Did you consider the probability for max_{x\in D_t}a_{LCB}<= max_{x\in D_t}f(x) to hold? Question about experiments: 1. a_k and b_k in \beta_t are unknown. What are their values in your experimental settings? Minor comments: 1. The posterior mean in eq(1) is not correct. “y” -> “y-m”? 2. Proposition 5.1 is proposed in the paper, but it is presented as a lemma in the supplementary.