Part of Advances in Neural Information Processing Systems 4 (NIPS 1991)
I present a modular network architecture and a learning algorithm based on incremental dynamic programming that allows a single learning agent to learn to solve multiple Markovian decision tasks (MDTs) with significant transfer of learning across the tasks. I consider a class of MDTs, called composite tasks, formed by temporally concatenating a number of simpler, elemental MDTs. The architecture is trained on a set of composite and elemental MDTs. The temporal structure of a composite task is assumed to be unknown and the architecture learns to produce a temporal decomposition. It is shown that under certain conditions the solution of a composite MDT can be constructed by computationally inexpensive modifications of the solutions of its constituent elemental MDTs.
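As an illustration of the last claim, the following is a minimal sketch and not the architecture or learning algorithm of the paper. It assumes a toy setting chosen only for this example: a deterministic one-dimensional corridor, a cost of -1 per step, no discounting, and a composite task whose sequence of subgoals is given in advance (the paper instead assumes the temporal structure is unknown). All function names are hypothetical. Under these assumptions, the value function of the composite task at each stage equals the value function of the current elemental task plus a constant offset, so the composite solution can be assembled from the elemental ones without re-solving the larger problem.

```python
NEG_INF = float("-inf")

def step(state, action, n_states):
    # Deterministic corridor: action is -1 (left) or +1 (right), clipped at the walls.
    return min(max(state + action, 0), n_states - 1)

def solve_to_goal(n_states, goal, terminal_value=0.0):
    # Plain synchronous value iteration (an assumption; the paper uses a
    # learning algorithm based on incremental dynamic programming).
    # Each step costs -1; reaching the goal yields terminal_value.
    values = [NEG_INF] * n_states
    values[goal] = terminal_value
    for _ in range(n_states):
        values = [
            terminal_value if s == goal
            else max(-1.0 + values[step(s, a, n_states)] for a in (-1, 1))
            for s in range(n_states)
        ]
    return values

def composite_direct(n_states, goals):
    # Solve the composite task directly, stage by stage from the last subgoal
    # backwards. Reaching subgoal j hands over to stage j + 1 at no cost.
    stage_values = [None] * len(goals)
    terminal = 0.0
    for j in reversed(range(len(goals))):
        stage_values[j] = solve_to_goal(n_states, goals[j], terminal)
        if j > 0:
            terminal = stage_values[j][goals[j - 1]]
    return stage_values

def composite_by_offset(n_states, goals):
    # Reuse the elemental solutions: each stage is the elemental value
    # function shifted by a constant that accounts for the remaining subgoals.
    elemental = {g: solve_to_goal(n_states, g) for g in set(goals)}
    offsets = [0.0] * len(goals)
    for j in reversed(range(len(goals) - 1)):
        offsets[j] = offsets[j + 1] + elemental[goals[j + 1]][goals[j]]
    return [
        [elemental[g][s] + offsets[j] for s in range(n_states)]
        for j, g in enumerate(goals)
    ]

if __name__ == "__main__":
    goals = [5, 1, 3]  # hypothetical composite task: visit 5, then 1, then 3
    direct = composite_direct(7, goals)
    composed = composite_by_offset(7, goals)
    print(direct == composed)
    print(composed[0][0])
```

In this toy case the only extra work needed for the composite task is one constant per stage, which is the sense in which the modification is inexpensive here. Whether the same additive form holds in other settings depends on conditions this sketch does not examine.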