Krzysztof M. Choromanski, Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Vikas Sindhwani
We present a new algorithm (ASEBO) for optimizing high-dimensional blackbox functions. ASEBO adapts to the geometry of the function and learns optimal sets of sensing directions, which are used to probe it, on-the-fly. It addresses the exploration-exploitation trade-off of blackbox optimization with expensive blackbox queries by continuously learning the bias of the lower-dimensional model used to approximate gradients of smoothings of the function via compressed sensing and contextual bandits methods. To obtain this model, it leverages techniques from the emerging theory of active subspaces in a novel ES blackbox optimization context. As a result, ASEBO learns the dynamically changing intrinsic dimensionality of the gradient space and adapts to the hardness of different stages of the optimization without external supervision. Consequently, it leads to more sample-efficient blackbox optimization than state-of-the-art algorithms. We provide theoretical results and test ASEBO advantages over other methods empirically by evaluating it on the set of reinforcement learning policy optimization tasks as well as functions from the recently open-sourced Nevergrad library.