Part of Advances in Neural Information Processing Systems 6 (NIPS 1993)
We show that a randomly selected N-tuple x of points of R^n is, with probability > 0, such that any multi-layer perceptron whose first hidden layer is composed of $h_1$ threshold logic units can implement exactly $2\sum_{i=0}^{h_1 n}\binom{N-1}{i}$ different dichotomies of x. If $N > h_1 n$, then such a perceptron must have all units of the first hidden layer fully connected to the inputs. This implies maximal capacities (in the sense of Cover) of 2n input patterns per hidden unit and 2 input patterns per synaptic weight for such networks (both capacities are achieved by networks with a single hidden layer and are the same as for a single neuron). Comparing these results with recent estimates of the VC-dimension, we find that, in contrast to the single-neuron case, for sufficiently large $n$ and $h_1$ the VC-dimension exceeds Cover's capacity.
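The counting formula and the resulting capacity of 2n patterns per hidden unit can be checked numerically. A minimal sketch (the helper name `count_dichotomies` and the parameter choices are illustrative, not from the paper):

```python
from math import comb

def count_dichotomies(N, d):
    """Counting function 2 * sum_{i=0}^{d} C(N-1, i), where d = h1 * n is
    the number of first-hidden-layer weights (h1 units, n inputs each).
    Helper name is illustrative, not from the paper."""
    return 2 * sum(comb(N - 1, i) for i in range(d + 1))

# Example: h1 = 3 hidden units on n = 4 inputs, so d = 12 first-layer weights.
h1, n = 3, 4
d = h1 * n

# For N <= d + 1 points the count equals 2^N: every dichotomy is realisable.
assert count_dichotomies(d + 1, d) == 2 ** (d + 1)

# At N = 2(d + 1) exactly half of the 2^N dichotomies are realisable, so
# Cover's capacity is about 2d = 2*h1*n points, i.e. 2n per hidden unit.
N = 2 * (d + 1)
assert count_dichotomies(N, d) == 2 ** N // 2
```

The half-coverage point at N = 2(d + 1) mirrors Cover's classical argument for a single neuron, with the n weights of one unit replaced by the $h_1 n$ first-layer weights.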