On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems

Part of Advances in Neural Information Processing Systems 13 (NIPS 2000)


Authors

Eiji Mizutani, James Demmel

Abstract

This paper describes a method of dogleg trust-region steps, or restricted Levenberg-Marquardt steps, based on a projection process onto the Krylov subspaces for neural networks nonlinear least squares problems. In particular, the linear conjugate gradient (CG) method works as the inner iterative algorithm for solving the linearized Gauss-Newton normal equation, whereas the outer nonlinear algorithm repeatedly takes so-called "Krylov-dogleg" steps, relying only on matrix-vector multiplication without explicitly forming the Jacobian matrix or the Gauss-Newton model Hessian. That is, our iterative dogleg algorithm can reduce both operational counts and memory space by a factor of O(n) (the number of parameters) in comparison with a direct linear-equation solver. This memory-less property is useful for large-scale problems.
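To make the abstract's mechanism concrete, the following is a minimal sketch, not the authors' exact algorithm: a Steihaug-style truncated CG applied to the Gauss-Newton normal equation (J^T J) s = -J^T r, with the matrix-vector products computed matrix-free via jvp/vjp so the Jacobian is never formed. The toy model, the function names (residual, gnhess_vec, krylov_dogleg_step), and the trust-region handling are all illustrative assumptions; the sketch is written in JAX.

    import jax
    import jax.numpy as jnp

    def residual(p, x, y):
        # Hypothetical toy model: y_hat = tanh(x @ w + b); p packs (w, b)
        # as a single flat parameter vector of length n.
        w, b = p[:-1], p[-1]
        return jnp.tanh(x @ w + b) - y

    def gnhess_vec(p, x, y, v):
        # Gauss-Newton Hessian-vector product (J^T J) v via one jvp and
        # one vjp; the Jacobian itself is never materialized, giving O(n)
        # memory instead of O(n^2).
        r = lambda q: residual(q, x, y)
        _, jv = jax.jvp(r, (p,), (v,))   # J v
        _, vjp_fn = jax.vjp(r, p)
        return vjp_fn(jv)[0]             # J^T (J v)

    def krylov_dogleg_step(p, x, y, radius, max_cg=50, tol=1e-6):
        # Inner iteration: truncated CG on (J^T J) s = -g with g = J^T r.
        # If an iterate would leave the trust region, backtrack along the
        # current CG direction to the boundary (a dogleg-like step built
        # inside the Krylov subspace). Since J^T J is positive semidefinite,
        # no negative-curvature branch is needed in this sketch.
        r_vec = residual(p, x, y)
        _, vjp_fn = jax.vjp(lambda q: residual(q, x, y), p)
        g = vjp_fn(r_vec)[0]             # gradient J^T r
        s = jnp.zeros_like(p)
        res = -g
        d = res
        res_sq = res @ res
        for _ in range(max_cg):
            Hd = gnhess_vec(p, x, y, d)
            alpha = res_sq / (d @ Hd)
            s_next = s + alpha * d
            if jnp.linalg.norm(s_next) >= radius:
                # Solve ||s + tau d|| = radius for the positive root tau.
                a = d @ d
                b = 2.0 * (s @ d)
                c = s @ s - radius**2
                tau = (-b + jnp.sqrt(b**2 - 4.0 * a * c)) / (2.0 * a)
                return s + tau * d
            s = s_next
            res = res - alpha * Hd
            new_res_sq = res @ res
            if jnp.sqrt(new_res_sq) < tol:
                break
            d = res + (new_res_sq / res_sq) * d
            res_sq = new_res_sq
        return s

A hypothetical outer loop would call krylov_dogleg_step repeatedly, accept or reject the step by comparing actual versus predicted reduction, and adjust the trust-region radius accordingly; for example:

    key = jax.random.PRNGKey(0)
    x = jax.random.normal(key, (100, 5))
    y = jnp.tanh(x @ jnp.ones(5) + 0.5)
    p = jnp.zeros(6)
    step = krylov_dogleg_step(p, x, y, radius=1.0)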