Part of Advances in Neural Information Processing Systems 28 (NIPS 2015)
Vikas Sindhwani, Tara Sainath, Sanjiv Kumar
We consider the task of building compact deep learning pipelines suitable for deployment on storage- and power-constrained mobile devices. We propose a unified framework to learn a broad family of structured parameter matrices that are characterized by the notion of low displacement rank. Our structured transforms admit fast function and gradient evaluation, and span a rich range of parameter sharing configurations whose statistical modeling capacity can be explicitly tuned along a continuum from structured to unstructured. Experimental results show that these transforms can significantly accelerate inference and forward/backward passes during training, and offer superior accuracy-compactness-speed tradeoffs in comparison to a number of existing techniques. In keyword spotting applications in mobile speech recognition, our methods are much more effective than standard linear low-rank bottleneck layers and nearly retain the performance of state-of-the-art models, while providing more than 3.5-fold compression.
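As a rough illustration of what "low displacement rank" and "fast function evaluation" can mean in practice, the sketch below uses a Toeplitz matrix, a well-known structured family that needs only 2n - 1 parameters for an n x n matrix. It is not the paper's implementation. The function names (`toeplitz_dense`, `toeplitz_matvec`, `displacement_rank`) are hypothetical, and the Stein-type displacement operator M - Z M Z^T is one common choice assumed here; the abstract does not say which operator the authors use.

```python
import numpy as np

def toeplitz_dense(c, r):
    """Build the dense n x n Toeplitz matrix with first column c and first row r."""
    n = len(c)
    T = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            T[i, j] = c[i - j] if i >= j else r[j - i]
    return T

def toeplitz_matvec(c, r, x):
    """Compute T @ x in O(n log n) by embedding T in a 2n x 2n circulant matrix."""
    n = len(c)
    # First column of the circulant: c, one padding zero, then r reversed without r[0].
    col = np.concatenate([c, [0.0], r[:0:-1]])
    x_pad = np.concatenate([x, np.zeros(n)])
    # A circulant matrix-vector product is a circular convolution, computed with FFTs.
    y = np.fft.ifft(np.fft.fft(col) * np.fft.fft(x_pad))
    return y[:n].real

def displacement_rank(M):
    """Rank of M - Z M Z^T, where Z is the down-shift matrix (assumed operator)."""
    n = M.shape[0]
    Z = np.eye(n, k=-1)  # ones on the first subdiagonal
    return np.linalg.matrix_rank(M - Z @ M @ Z.T)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 8
    c = rng.standard_normal(n)
    r = rng.standard_normal(n)
    r[0] = c[0]  # first row and first column share the diagonal entry
    x = rng.standard_normal(n)
    T = toeplitz_dense(c, r)
    print(np.allclose(T @ x, toeplitz_matvec(c, r, x)))
    print(displacement_rank(T), "vs. full rank", np.linalg.matrix_rank(T))
```

For a Toeplitz matrix, the displacement M - Z M Z^T is nonzero only in its first row and first column, so its rank is at most 2 even when the matrix itself has full rank. Allowing a larger displacement rank is one way to picture moving along the continuum from structured to unstructured matrices that the abstract describes.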