NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID:8364
Title:Towards Practical Alternating Least-Squares for CCA

Reviewer 1


		
It is interesting to see CCA attracting attention again. Given the noticeable speed-up achievable with the methods in this paper, it should be of interest to people working on CCA and its applications. What I found missing was a more in-depth explanation or motivation of where equation (4) comes from. How was it derived? Also, though this is a minor point, the lack of a convergence analysis weakens the contribution. In Section 6 the authors convincingly justify its absence, but any attempt in this direction would considerably strengthen the paper.

Reviewer 2


		
The paper proposes truly and inexact alternating least squares. In each iteration, instead of approximately solving two independent linear systems, the algorithm solves two coupled linear systems of half the size. The submission is clearly written and well structured: it sets up the premise, reviews related work, and motivates the authors' proposed algorithm. The mathematical derivations appear correct, and the algorithm's runtime complexity is provided (proofs in the supplementary material). The proposed algorithm and results are, in my view, significant, showing the ability to cut run time while retaining quality measures. The experiments section could be expanded to further detail the results, e.g. by providing some views on the differences in performance (where existing methods fail to work or are unable to find a solution). Likewise, the experiments (or discussion) section could give a view as to whether the proposed algorithm would work in all cases or only a subset of them (an indication for possible future work).
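For context, the classic alternating least-squares iteration for the top canonical pair, which the review's "two linear systems per iteration" refers to, can be sketched as below. This is a minimal textbook-style sketch with exact solves, not the authors' TALS or FastTALS; the function name `als_cca_top_pair` and the parameters `iters` and `reg` are assumptions made for illustration.

```python
import numpy as np

def als_cca_top_pair(X, Y, iters=200, reg=1e-8, seed=0):
    """Exact-solve ALS for the leading canonical pair (illustrative baseline only)."""
    rng = np.random.default_rng(seed)
    # Centre both views so that the matrices below are covariances.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # A small ridge term (an assumption) keeps the solves well conditioned.
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n
    v = rng.standard_normal(Y.shape[1])
    v /= np.sqrt(v @ Syy @ v)
    for _ in range(iters):
        # Least-squares update of u with v fixed, then rescale so u' Sxx u = 1.
        u = np.linalg.solve(Sxx, Sxy @ v)
        u /= np.sqrt(u @ Sxx @ u)
        # Least-squares update of v with u fixed, then rescale so v' Syy v = 1.
        v = np.linalg.solve(Syy, Sxy.T @ u)
        v /= np.sqrt(v @ Syy @ v)
    # With both constraints satisfied, u' Sxy v is the estimated canonical correlation.
    return u, v, float(u @ Sxy @ v)
```

Each pass solves one linear system per view. Methods of the kind under review replace these exact solves with cheaper approximate ones, and that is where the accuracy and run-time trade-offs discussed by the reviewers arise.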

Reviewer 3


		
SUMMARY: The paper presents a technical improvement to the alternating least-squares (ALS) solution to CCA. The novel method is named “Truly and Inexact Alternating Least Squares” (TALS). The idea is to solve two coupled linear systems of half size per iteration. Previous work solved two independent linear systems, which degrades the final canonical correlation between the views because the weights assigned to the variables are inaccurate. Additionally, a faster version, FastTALS, is proposed. FastTALS applies momentum acceleration to speed up the optimisation. The momentum hyperparameter requires tuning, and for cases where hand-tuning is not preferred, an adaptive version of FastTALS is proposed, which sets the hyperparameter automatically during optimisation.
ORIGINALITY: It is a novel idea to couple together two linear systems of half size compared to the two previous approaches. The paper clearly demonstrates what the two previous approaches and the proposed method look like. The literature review is adequate and well-referenced.
QUALITY: The theory presented is complete and proved. The presented method and its advanced versions are empirically assessed on three real-world datasets: Mediamill, MNIST, and JW11. The proposed methods are compared with the two earlier approaches. Performance is assessed using a (novel?) measure: the squared sine of the largest principal angle between the learnt weight matrix and the ground truth obtained using MATLAB’s svds function. In other words, performance is evaluated in terms of convergence to the ground truth U and V, shown as a function of time and of the number of passes through the data. More experiments are reported in the supplementary material. Overall, the proposed approach is evaluated both theoretically and empirically.
CLARITY: The paper is clearly written and well-organised. The MATLAB implementations of TALS and FastTALS are accessible through an anonymous Dropbox link, so the presented results can possibly be reproduced.
SIGNIFICANCE: The proposed approach seems both to give a significant speed-up to ALS-based CCA and to improve the accuracy of the results. For practitioners, it is important that the canonical weights are correct so that the relations between the variables can be inferred. Therefore, this approach would be recommended instead of the previous ones.
*** AFTER REBUTTAL *** I have read the authors' response and other reviewers' comments. My score remains the same. *** END OF COMMENT ***
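The evaluation measure described under QUALITY in Reviewer 3's comments, the squared sine of the largest principal angle between a learnt weight matrix and a reference, can be sketched as follows. This is a minimal sketch that assumes the plain Euclidean inner product; the paper may define the angle under a different (e.g. covariance-weighted) inner product, and the function name `sin2_largest_principal_angle` is hypothetical.

```python
import numpy as np

def sin2_largest_principal_angle(U_est, U_ref):
    """Squared sine of the largest principal angle between two column spaces."""
    # Orthonormal bases for the two subspaces (reduced QR).
    Q1, _ = np.linalg.qr(U_est)
    Q2, _ = np.linalg.qr(U_ref)
    # Singular values of Q1' Q2 are the cosines of the principal angles.
    cosines = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    # The smallest cosine corresponds to the largest angle.
    cos_min = np.clip(cosines.min(), 0.0, 1.0)
    return 1.0 - cos_min ** 2
```

The value is 0 when the two matrices span the same subspace and 1 when some direction in one subspace is orthogonal to the other, so a curve of this quantity decreasing towards 0 indicates convergence to the reference solution.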