Adaptive Soft Weight Tying using Gaussian Mixtures

Steven J. Nowlan, Geoffrey E. Hinton

Advances in Neural Information Processing Systems 4 (NIPS 1991)

One way of simplifying neural networks so they generalize better is to add an extra term to the error function that will penalize complexity. We propose a new penalty term in which the distribution of weight values is modelled as a mixture of multiple Gaussians. Under this model, a set of weights is simple if the weights can be clustered into subsets so that the weights in each cluster have similar values. We allow the parameters of the mixture model to adapt at the same time as the network learns. Simulations demonstrate that this complexity term is more effective than previous complexity terms.
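
As an illustration of the idea in the abstract, the following minimal sketch scores a set of weights by their negative log-likelihood under a one-dimensional Gaussian mixture. This is one natural reading of "modelling the distribution of weight values as a mixture", not a form stated in the abstract itself. The function name `mixture_penalty`, the parameter names, and all numeric values are assumptions chosen for the example. The mixture parameters are held fixed here, whereas the abstract describes them adapting while the network learns.

```python
import math


def mixture_penalty(weights, pis, mus, sigmas):
    """Negative log-likelihood of `weights` under a 1-D Gaussian mixture.

    pis, mus, sigmas: mixing proportions, means, and standard deviations
    of the mixture components (hypothetical names for this sketch).
    """
    total = 0.0
    for w in weights:
        density = 0.0
        for pi, mu, sigma in zip(pis, mus, sigmas):
            z = (w - mu) / sigma
            density += pi * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
        total -= math.log(density)
    return total


# Hypothetical two-component mixture with narrow clusters near 0 and 1.
pis = [0.5, 0.5]
mus = [0.0, 1.0]
sigmas = [0.1, 0.1]

clustered = [0.01, -0.02, 0.98, 1.03]  # weights close to the cluster centres
scattered = [0.4, -0.5, 0.6, 1.5]      # weights far from both centres

penalty_clustered = mixture_penalty(clustered, pis, mus, sigmas)
penalty_scattered = mixture_penalty(scattered, pis, mus, sigmas)
print(penalty_clustered < penalty_scattered)
```

Under these assumed values the clustered weights receive the lower penalty, which matches the abstract's notion that a set of weights is simple when it can be grouped into clusters of similar values. In a training setting this term would be added to the data error, with the mixing proportions, means, and variances updated alongside the weights.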