Part of Advances in Neural Information Processing Systems 6 (NIPS 1993)
O. L. Mangasarian, M. V. Solodov
The fundamental backpropagation (BP) algorithm for training artificial neural networks is cast as a deterministic nonmonotone perturbed gradient method. Under certain natural assumptions, such as the series of learning rates diverging while the series of their squares converges, it is established that every accumulation point of the online BP iterates is a stationary point of the BP error function. The results presented cover serial and parallel online BP, modified BP with a momentum term, and BP with weight decay.
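The learning-rate assumption mentioned in the abstract (the sum of the rates diverges while the sum of their squares converges) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the schedule eta_t = c / (t + 1), the toy data, the function names, and the one-weight least-squares error that stands in for a neural network error function. The loop performs serial online updates, one sample at a time, and then checks that the gradient of the total error is close to zero at the final iterate.

```python
def learning_rate(t, c=0.5):
    """Hypothetical schedule eta_t = c / (t + 1).

    The series sum(eta_t) diverges (harmonic series), while
    sum(eta_t ** 2) converges, matching the assumption in the abstract.
    """
    return c / (t + 1)


def online_gradient_descent(samples, w0=0.0, epochs=200):
    """Serial online updates on f(w) = sum_j 0.5 * (w * x_j - y_j) ** 2.

    Each update uses the gradient of a single sample's error, so it is a
    perturbed version of the gradient of the total error f.
    """
    w = w0
    t = 0
    for _ in range(epochs):
        for x, y in samples:            # one sample per update
            grad = (w * x - y) * x      # gradient of this sample's error only
            w -= learning_rate(t) * grad
            t += 1
    return w


def full_gradient(w, samples):
    """Gradient of the total error f at w (zero at a stationary point)."""
    return sum((w * x - y) * x for x, y in samples)


# Toy data, assumed for illustration only.
samples = [(1.0, 2.0), (2.0, 3.9), (3.0, 6.1)]
w = online_gradient_descent(samples)
print(w, full_gradient(w, samples))
```

Individual updates need not decrease the total error, which is the sense in which the method is nonmonotone; the diminishing step sizes are what damp the per-sample perturbations in this sketch.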