NeurIPS 2020

Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards

Meta Review

A very nice paper on heavy-tailed bandits, where only the p-th moment of the reward exists for some p in (1,2]. It reduces the amount of prior information needed, closes the gaps between the lower and upper bounds, and provides some experiments. A nice, well-rounded contribution.