Relative Loss Bounds for Multidimensional Regression Problems

Part of Advances in Neural Information Processing Systems 10 (NIPS 1997)


Authors

Jyrki Kivinen, Manfred K. Warmuth

Abstract

We study on-line generalized linear regression with multidimensional outputs, i.e., neural networks with multiple output nodes but no hidden nodes. At the final layer we allow transfer functions, such as the softmax function, that must consider the linear activations of all the output neurons. We use distance functions of a certain kind in two completely independent roles in deriving and analyzing on-line learning algorithms for such tasks. We use one distance function to define a matching loss function for the (possibly multidimensional) transfer function, which allows us to generalize earlier results from one-dimensional to multidimensional outputs. We use another distance function as a tool for measuring the progress made by the on-line updates. This shows how previously studied algorithms such as gradient descent and exponentiated gradient fit into a common framework. We evaluate the performance of the algorithms using relative loss bounds that compare the loss of the on-line algorithm to the loss of the best off-line predictor from the relevant model class, thus completely eliminating probabilistic assumptions about the data.
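As a concrete illustration of the setting described above (a sketch, not code from the paper), the following assumes a softmax transfer function, for which the matching loss is the relative entropy between desired and predicted output vectors, and contrasts the gradient descent and exponentiated gradient updates on a k-by-n weight matrix. The function names, weight shapes, and learning rate are illustrative assumptions.

```python
import numpy as np

def softmax(a):
    # Numerically stable softmax over the vector of linear activations.
    e = np.exp(a - a.max())
    return e / e.sum()

def matching_loss(y, y_hat, eps=1e-12):
    # For the softmax transfer function, the matching loss is the
    # relative entropy (KL divergence) between desired and predicted outputs.
    return np.sum(y * (np.log(y + eps) - np.log(y_hat + eps)))

def gd_update(W, x, y, eta=0.1):
    # Gradient descent: additive update; progress is measured by the
    # squared Euclidean distance to the comparison weight matrix.
    y_hat = softmax(W @ x)
    return W - eta * np.outer(y_hat - y, x)  # grad of matching loss in W

def eg_update(W, x, y, eta=0.1):
    # Exponentiated gradient: multiplicative update; progress is measured
    # by relative entropy. Each weight row is kept nonnegative and
    # renormalized to sum to one.
    y_hat = softmax(W @ x)
    W_new = W * np.exp(-eta * np.outer(y_hat - y, x))
    return W_new / W_new.sum(axis=1, keepdims=True)
```

Here the exponentiated gradient update maintains each weight row as a probability vector, which is the setting in which its relative loss bounds are typically stated; gradient descent places no such constraint on the weights.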