Part of Advances in Neural Information Processing Systems 13 (NIPS 2000)
David Andre, Stuart Russell
We present an expressive agent design language for reinforcement learning that allows the user to constrain the policies considered by the learning process. The language includes standard features such as parameterized subroutines, temporary interrupts, aborts, and memory variables, but also allows for unspecified choices in the agent program. To learn the choices left unspecified, we present provably convergent learning algorithms. We demonstrate by example that agent programs written in the language are concise as well as modular. This facilitates state abstraction and the transferability of learned skills.