Part of Advances in Neural Information Processing Systems 5 (NIPS 1992)
Holm Schwarze, John Hertz
We use statistical mechanics to study generalization in large committee machines. For an architecture with nonoverlapping receptive fields, a replica calculation yields the generalization error in the limit of a large number of hidden units. For continuous weights the generalization error falls off asymptotically as 1/α, where α is the number of training examples per weight. For binary weights we find a discontinuous transition from poor to perfect generalization, followed by a wide region of metastability; broken replica symmetry is found within this region at low temperatures. For a fully connected architecture the generalization error is calculated within the annealed approximation. For both binary and continuous weights we find transitions from a symmetric state to one with specialized hidden units, accompanied by discontinuous drops in the generalization error.
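To make the architecture concrete, here is a minimal sketch (not code from the paper) of the committee machine with nonoverlapping receptive fields that the abstract describes: each of K hidden units sees its own disjoint block of the N inputs, and the machine's output is the majority vote (sign of the sum) of the hidden-unit outputs. The field size, number of units, and teacher setup below are illustrative choices, not values from the paper.

```python
import numpy as np

def committee_output(weights, x):
    """Output of a tree committee machine.

    weights: (K, N_per_field) array, one row of weights per hidden unit.
    x: length K*N_per_field input, split into K nonoverlapping receptive fields.
    """
    K, n = weights.shape
    fields = x.reshape(K, n)                      # each unit sees only its own field
    hidden = np.sign(np.sum(weights * fields, axis=1))  # hidden-unit outputs, each ±1
    return np.sign(hidden.sum())                  # majority vote of the hidden units

# Illustrative example: K = 3 hidden units, 5 inputs per receptive field (N = 15),
# with a binary (±1) teacher as in the binary-weight scenario of the abstract.
rng = np.random.default_rng(0)
K, n = 3, 5
teacher = np.sign(rng.standard_normal((K, n)))    # hypothetical binary teacher weights
x = rng.standard_normal(K * n)                    # one random input example
y = committee_output(teacher, x)                  # y is +1 or -1
```

With K odd and continuous inputs, the hidden sum is never zero, so the output is always ±1; a training set for the student would consist of such (x, y) pairs labeled by the teacher.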