Part of Advances in Neural Information Processing Systems 10 (NIPS 1997)
Hans-Georg Zimmermann, Ralph Neuneier
We explain how the training data can be separated into clean information and unexplainable noise. Analogous to the data, the neural network is separated into a time invariant structure used for forecasting, and a noisy part. We propose a unified theory connecting the optimization algorithms for cleaning and learning together with algorithms that control the data noise and the parameter noise. The combined algorithm allows a data-driven local control of the liability of the network parameters and therefore an improvement in generalization. The approach is proven to be very useful at the task of forecasting the German bond market.
1 Introduction: The Observer-Observation Dilemma
Human beings believe that they are able to solve a psychological version of the Observer-Observation Dilemma. On the one hand, they use their observations to form an understanding of the laws of the world; on the other hand, they use this understanding to evaluate the correctness of incoming pieces of information. Of course, as everybody knows, human beings are not free from making mistakes in this psychological dilemma. We encounter a similar situation when we try to build a mathematical model using data. Learning relationships from the data is only one part of the model building process. Overrating this part often leads to overfitting in many applications (especially in economic forecasting). In practice, the data are often evaluated using external knowledge, i.e. by optimizing the model under constraints of smoothness and regularization. If we assume that our model summarizes the best knowledge of the system to be identified, why should we not use the model itself to evaluate the correctness of the data? One approach to doing this is called Clearning. In this paper, we present a unified approach to the interaction between the data and a neural network (see also ). It includes a new symmetric view of the optimization algorithms, here learning and cleaning, and their control by parameter and data noise.
The Observer-Observation Dilemma in Neuro-Forecasting