Part of Advances in Neural Information Processing Systems 8 (NIPS 1995)
Pascal Koiran, Eduardo Sontag
This paper shows that neural networks which use continuous activation functions have VC dimension at least as large as the square of the number of weights w. This result settles a long-standing open question, namely whether the well-known O(w log w) bound, known for hard-threshold nets, also held for more general sigmoidal nets. Implications for the number of samples needed for valid generalization are discussed.
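To give a rough sense of the gap between the two growth rates named in the abstract, the following minimal sketch tabulates w log w against w^2 for a few values of w. It is only an illustration: the constant factors hidden by the asymptotic notation are omitted, the base-2 logarithm is an assumption (the abstract does not fix a base), and the function names are hypothetical, not taken from the paper.

```python
import math


def threshold_growth(w: int) -> float:
    """Growth rate w * log2(w) of the O(w log w) bound for hard-threshold nets.

    Assumption: constant factor omitted, logarithm taken base 2.
    """
    return w * math.log2(w)


def quadratic_growth(w: int) -> int:
    """Growth rate w**2 of the quadratic lower bound (constant factor omitted)."""
    return w * w


if __name__ == "__main__":
    # Compare the two growth rates as the number of weights w increases.
    for w in (10, 100, 1000, 10000):
        low = threshold_growth(w)
        high = quadratic_growth(w)
        print(f"w={w:>6}  w*log2(w)={low:>12.0f}  w^2={high:>10}  ratio={high / low:.1f}")
```

The ratio between the two columns is w / log2(w), so it grows without bound as w increases. Since the number of samples needed for valid generalization grows with the VC dimension, a quadratic lower bound implies that such networks can require far more samples than the O(w log w) figure would suggest.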