Yoshua Bengio, Paolo Frasconi
Learning to recognize or predict sequences using long-term context has many applications. However, practical and theoretical problems are found in training recurrent neural networks to perform tasks in which input/output dependencies span long intervals. Starting from a mathematical analysis of the problem, we consider and compare alternative algorithms and architectures on tasks for which the span of the input/output dependencies can be controlled. Results on the new algorithms show performance qualitatively superior to that obtained with backpropagation.