Part of Advances in Neural Information Processing Systems 26 (NIPS 2013)
Francesco Orabona
We present a new online learning algorithm that extends the exponentiated gradient framework to infinite-dimensional spaces. Our analysis shows that the algorithm is implicitly able to estimate the L2 norm of the unknown competitor, U, achieving a regret bound of the order of O(U log(UT+1)√T), instead of the standard O((U²+1)√T) achievable without knowing U. For this analysis, we introduce novel tools for algorithms with time-varying regularizers, through the use of local smoothness. Through a lower bound, we also show that the algorithm is optimal up to a √(log T) term for linear and Lipschitz losses.
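For readability, the two bounds compared in the abstract can be restated in display form; this is a sketch assuming the standard online-learning setup, with u an arbitrary competitor, U = ‖u‖₂ its L2 norm, and T the number of rounds:

```latex
% Regret after T rounds against a competitor u with U = \|u\|_2.
% Bound achieved by the proposed algorithm (U unknown to the learner):
\mathrm{Regret}_T(u) \le O\!\left( U \log(UT + 1)\, \sqrt{T} \right)

% Standard bound achievable without knowing U:
\mathrm{Regret}_T(u) \le O\!\left( (U^2 + 1)\, \sqrt{T} \right)
```

The improvement is in the dependence on U: the proposed bound scales as U log(UT+1) rather than U²+1, while neither bound requires U as an input to the algorithm.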