Satinder Singh, Tommi Jaakkola, Michael Jordan
It is widely accepted that the use of more compact representations than lookup tables is crucial to scaling reinforcement learning (RL) algorithms to real-world problems. Unfortunately almost all of the theory of reinforcement learning assumes lookup table representa(cid:173) tions. In this paper we address the pressing issue of combining function approximation and RL, and present 1) a function approx(cid:173) imator based on a simple extension to state aggregation (a com(cid:173) monly used form of compact representation), namely soft state aggregation, 2) a theory of convergence for RL with arbitrary, but fixed, soft state aggregation, 3) a novel intuitive understanding of the effect of state aggregation on online RL, and 4) a new heuristic adaptive state aggregation algorithm that finds improved compact representations by exploiting the non-discrete nature of soft state aggregation. Preliminary empirical results are also presented.