Ronald Parr, Stuart Russell
We present a new approach to reinforcement learning in which the policies considered by the learning process are constrained by hierarchies of partially specified machines. This allows for the use of prior knowledge to reduce the search space and provides a framework in which knowledge can be transferred across problems and in which component solutions can be recombined to solve larger and more complicated problems. Our approach can be seen as providing a link between reinforcement learning and "behavior-based" or "teleo-reactive" approaches to control. We present provably convergent algorithms for problem-solving and learning with hierarchical machines and demonstrate their effectiveness on a problem with several thousand states.