Part of Advances in Neural Information Processing Systems 7 (NIPS 1994)
Satinder Singh, Tommi Jaakkola, Michael Jordan
It is widely accepted that the use of more compact representations than lookup tables is crucial to scaling reinforcement learning (RL) algorithms to real-world problems. Unfortunately almost all of the theory of reinforcement learning assumes lookup table representa(cid:173) tions. In this paper we address the pressing issue of combining function approximation and RL, and present 1) a function approx(cid:173) imator based on a simple extension to state aggregation (a com(cid:173) monly used form of compact representation), namely soft state aggregation, 2) a theory of convergence for RL with arbitrary, but fixed, soft state aggregation, 3) a novel intuitive understanding of the effect of state aggregation on online RL, and 4) a new heuristic adaptive state aggregation algorithm that finds improved compact representations by exploiting the non-discrete nature of soft state aggregation. Preliminary empirical results are also presented.