MLP Can Provably Generalize Much Better than VC-bounds Indicate

Part of Advances in Neural Information Processing Systems 9 (NIPS 1996)



Adam Kowalczyk, Herman Ferrá


Results of a study of the worst-case learning curves for a particular class of probability distributions on the input space of an MLP with hard threshold hidden units are presented. It is shown, in particular, that in the thermodynamic limit for scaling by the number of connections to the first hidden layer, although the true learning curve behaves as ~ α^(-1) for α ~ 1, its VC-dimension based bound is trivial (= 1) and its VC-entropy bound is trivial for α ≤ 6.2. It is also shown that bounds following the true learning curve can be derived from a formalism based on the density of error patterns.