Part of Advances in Neural Information Processing Systems 8 (NIPS 1995)
Jonathan Baxter
In this paper the problem of learning appropriate domain-specific bias is addressed. It is shown that this can be achieved by learning many related tasks from the same domain, and a theorem is given bounding the number of tasks that must be learnt. A corollary of the theorem is that if the tasks are known to possess a common internal representation or preprocessing then the number of examples required per task for good generalisation when learning n tasks simultaneously scales like O(a + b/n), where O(a) is a bound on the minimum number of examples required to learn a single task, and O(a + b) is a bound on the number of examples required to learn each task independently. An experiment providing strong qualitative support for the theoretical results is reported.
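As a rough illustration of the scaling claim (a sketch based only on the bound quoted above, not on the paper's proof), if the O(b) portion of the single-task cost is read as the price of the shared representation, amortising it over n jointly learned tasks gives the per-task and total sample counts below.

% Assumption (not stated verbatim in the abstract): the O(b) term is the cost
% of the shared preprocessing, spread across the n tasks learned together.
\[
  \underbrace{O(a)}_{\text{task-specific part}}
  \;+\;
  \underbrace{O\!\left(\tfrac{b}{n}\right)}_{\text{shared part, amortised}}
  \;=\; O\!\left(a + \tfrac{b}{n}\right)
  \quad\Longrightarrow\quad
  n \cdot O\!\left(a + \tfrac{b}{n}\right) \;=\; O(na + b)\ \text{examples in total.}
\]

On this reading, the per-task requirement approaches O(a) as n grows, whereas learning the n tasks independently would cost O(n(a + b)) examples overall.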