Ian Char, Youngseog Chung, Willie Neiswanger, Kirthevasan Kandasamy, Andrew Oakleigh Nelson, Mark Boyer, Egemen Kolemen, Jeff Schneider
In black-box optimization, an agent repeatedly chooses a configuration to test, so as to find an optimal configuration.
In many practical problems of interest, one would like to optimize several systems, or
tasks'', simultaneously; however, in most of these scenarios the current task is determined by nature. In this work, we explore theoffline'' case in which one is able to bypass nature and choose the next task to evaluate (e.g. via a simulator). Because some tasks may be easier to optimize and others may be more critical, it is crucial to leverage algorithms that not only consider which configurations to try next, but also which tasks to make evaluations for. In this work, we describe a theoretically grounded Bayesian optimization method to tackle this problem. We also demonstrate that if the model of the reward structure does a poor job of capturing variation in difficulty between tasks, then algorithms that actively pick tasks for evaluation may end up doing more harm than good. Following this, we show how our approach can be used for real world applications in science and engineering, including optimizing tokamak controls for nuclear fusion.