Adaptive Choice of Grid and Time in Reinforcement Learning

Pareigis, Stephan

Stephan Pareigis

Advances in Neural Information Processing Systems 10 (NIPS 1997)

Abstract

We propose local error estimates together with algorithms for adaptive a-posteriori grid and time refinement in reinforcement learning. We consider a deterministic system with continuous state and time and an infinite-horizon discounted cost functional. For grid refinement we follow the procedure of numerical methods for the Bellman equation. For time refinement we propose a new criterion, based on consistency estimates of discrete solutions of the Bellman equation. We demonstrate that an optimal ratio of time to space discretization is crucial for optimal learning rates and accuracy of the approximate optimal value function.
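To make the two discretization parameters concrete, the following is a minimal sketch of value iteration for a discretized Bellman equation on a fixed grid. It is not the paper's algorithm and performs no adaptive refinement. The one-dimensional system (dx/dt = u with u in {-1, 0, 1}), the running cost x², the discount rate `rho`, the linear interpolation between grid nodes, and all names are assumptions chosen purely for illustration.

```python
import numpy as np


def value_iteration(n_grid=41, tau=0.05, rho=1.0, tol=1e-8, max_iter=10_000):
    """Fixed-point iteration for a discretized discounted Bellman equation.

    Assumed toy problem (not from the paper): dx/dt = u on [-1, 1],
    u in {-1, 0, 1}, running cost x**2, discount rate rho.
    n_grid sets the space discretization, tau the time discretization.
    """
    xs = np.linspace(-1.0, 1.0, n_grid)      # grid nodes, spacing h = 2 / (n_grid - 1)
    controls = np.array([-1.0, 0.0, 1.0])    # assumed finite control set
    cost = xs ** 2                           # assumed running cost
    gamma = np.exp(-rho * tau)               # discount factor over one time step

    V = np.zeros(n_grid)
    for _ in range(max_iter):
        # Successor states after one explicit Euler step, kept inside the domain.
        nxt = np.clip(xs[None, :] + tau * controls[:, None], -1.0, 1.0)
        # Linear interpolation of the current value estimate at the successors.
        V_next = np.interp(nxt.ravel(), xs, V).reshape(nxt.shape)
        # One-step cost plus discounted continuation value, minimized over controls.
        V_new = (tau * cost[None, :] + gamma * V_next).min(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return xs, V_new
        V = V_new
    return xs, V


if __name__ == "__main__":
    xs, V = value_iteration()
    print("V at the left boundary:", V[0])
    print("V at the origin:", V[len(V) // 2])
```

Rerunning the sketch with different values of `tau` for a fixed `n_grid` (or the reverse) gives a rough feel for how the ratio of time step to grid spacing changes the computed values.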
