Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
The paper proposes efficient methods for computing the Tucker decomposition of higher-order tensors. This is a hard, fundamental problem in numerical linear algebra with reasonably wide applicability. Tensor decompositions have played an important role in a variety of machine learning applications; see, for example: Anandkumar et al., "Tensor Decompositions for Learning Latent Variable Models", JMLR 2014; Novikov et al., "Tensorizing Neural Networks", NeurIPS 2015, which used tensor decompositions to massively compress the dense layers of VGG; Moitra and Wein, "Spectral Methods from Tensor Networks"; and Malik and Becker, "Low-rank Tucker decompositions of large tensors using TensorSketch", NeurIPS 2018.

Singleshot is a coordinate-descent-based algorithm that cycles over the variables of the Tucker decomposition (the core tensor and the factor matrices), applying gradient updates to each in turn. The paper carefully accounts for the memory usage of Singleshot (and its variants), since tensor computations are often extremely memory-intensive. The paper proposes two related algorithms: Singleshotinexact, which reduces the computational complexity at the expense of some accuracy, and Singleshotonline, which computes tensor decompositions in a single pass over the data. The paper also introduces "positive" variants of Singleshot and Singleshotinexact for computing non-negative tensor decompositions. Although this may sound like a long laundry list of algorithms, they are all simple and natural variants on a basic theme, and the analysis and algorithms hang together well.

The 27-page supplementary material includes: an extensive analysis of Singleshot and its variants; additional experiments, including results showing that Singleshot's runtime scales better with dimension than Tensorsketchonline's; and a detailed discussion of the space and time complexity of the three variants of Singleshot. Section 1 of the supplement has stand-alone value as a useful reference on tensor computations.
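To make the basic theme concrete, here is a minimal NumPy sketch of this kind of cyclic gradient scheme for Tucker decomposition: one sweep applies a gradient step on the squared reconstruction error to each factor matrix in turn, and then to the core. This is my own illustrative sketch of the general idea, not the authors' Singleshot algorithm (in particular, it ignores the memory-efficient, single-pass computations that are the paper's main contribution); the function names and step size are made up.

```python
import numpy as np

def mode_dot(T, M, mode):
    """n-mode product: multiply matrix M along the given mode of tensor T."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

def unfold(T, mode):
    """Matricize T along the given mode (rows indexed by that mode)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def reconstruct(G, Us):
    """Tucker reconstruction: core G multiplied by every factor matrix."""
    T = G
    for n, U in enumerate(Us):
        T = mode_dot(T, U, n)
    return T

def cyclic_sweep(X, G, Us, lr=0.005):
    """One cycle of gradient steps on 0.5 * ||X - reconstruction||^2,
    updating each factor matrix in turn and then the core, in place."""
    for n in range(len(Us)):
        # W: core multiplied by every factor except the one for mode n,
        # so that reconstruction unfolds as Us[n] @ unfold(W, n).
        W = G
        for m, U in enumerate(Us):
            if m != n:
                W = mode_dot(W, U, m)
        R = X - reconstruct(G, Us)                      # current residual
        Us[n] += lr * unfold(R, n) @ unfold(W, n).T     # descent step on factor n
    # Gradient step on the core: residual contracted with each factor transpose.
    R = X - reconstruct(G, Us)
    grad_G = R
    for n, U in enumerate(Us):
        grad_G = mode_dot(grad_G, U.T, n)
    G += lr * grad_G
    return 0.5 * np.sum((X - reconstruct(G, Us)) ** 2)
```

Repeated sweeps drive the reconstruction error down, e.g. for a random 4x5x6 tensor with multilinear ranks (2, 2, 2) the loss decreases monotonically for a small enough step size.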
The paper tackles an important question, proposes a simple and extremely flexible family of algorithms, presents a detailed analysis of them, and reports compelling experimental results. While there could always be more experiments, more motivation, and more of everything, IMO this paper is well above the bar for acceptance at NeurIPS. One of the reviewers suggested the paper is better suited to a numerical methods journal. I agree -- but it is up to the authors to choose their audience, and presumably this work will be submitted to a suitable journal in due course as well.