Barycentric Interpolators for Continuous Space and Time Reinforcement Learning

Rémi Munos, Andrew W. Moore

Advances in Neural Information Processing Systems 11 (NIPS 1998)

To find the optimal control of reinforcement learning (RL) problems that are continuous in both state space and time, we approximate the value function (VF) with a particular class of functions called barycentric interpolators. We establish sufficient conditions under which an RL algorithm converges to the optimal VF, even when we use approximate models of the state dynamics and the reinforcement functions.
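
As an illustration only, the following minimal sketch shows one common form of barycentric interpolation: the value at a state x inside a triangle is the convex combination of the values stored at the triangle's vertices, weighted by the barycentric coordinates of x. The abstract does not define the interpolator, so this definition, the 2-D triangle setting, and the names `barycentric_weights` and `interpolate_value` are assumptions made for the example and are not taken from the paper.

```python
# Hypothetical sketch: barycentric interpolation of a value function on one
# 2-D triangle. The helper names and the triangle setting are assumptions.

def barycentric_weights(x, a, b, c):
    """Return weights (l1, l2, l3) such that x = l1*a + l2*b + l3*c
    and l1 + l2 + l3 = 1. All weights are non-negative when x lies
    inside the triangle with vertices a, b, c."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    l1 = ((b[1] - c[1]) * (x[0] - c[0]) + (c[0] - b[0]) * (x[1] - c[1])) / det
    l2 = ((c[1] - a[1]) * (x[0] - c[0]) + (a[0] - c[0]) * (x[1] - c[1])) / det
    l3 = 1.0 - l1 - l2
    return l1, l2, l3


def interpolate_value(x, vertices, values):
    """Approximate V(x) as the weighted average of the vertex values,
    using the barycentric coordinates of x as the weights."""
    weights = barycentric_weights(x, *vertices)
    return sum(w * v for w, v in zip(weights, values))


# Usage: values stored at the three vertices of a unit right triangle.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
values = [0.0, 1.0, 2.0]
print(interpolate_value((0.25, 0.25), vertices, values))
```

Because the weights are non-negative and sum to one for any point inside the triangle, the interpolated value always lies between the smallest and largest vertex values.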