NeurIPS 2019
Sun Dec 8th through Sat Dec 14th, 2019, at the Vancouver Convention Center
Reviewer 1
The authors work on the distributed low-rank matrix factorization problem, but the performance of the proposed method on distributed systems is not validated by experiments. The novelty of this work is not clear. For example, Proposition 2.1, Definition 2.1, Theorem 2.1, Theorem 2.2, and Theorem 2.3 are copied from existing works. The results in Theorem 2.4 are easy to derive and do not spell out their implications for the main result. In Theorem 2.5, the objective function converges under the assumption that the sequence $\{z(k)\}$ is bounded; however, the authors do not prove when this condition is satisfied. In the matrix factorization (MF) literature, a number of works have shown that, under an incoherence condition, alternating minimization applied to the factors recovers the true matrix. This kind of statistical guarantee is missing from the work.
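For reference, the standard alternating-minimization baseline the reviewer refers to can be sketched as below. This is a minimal, centralized matrix-completion routine, not the authors' distributed method; the function and parameter names are illustrative only.

```python
import numpy as np

def alt_min_completion(M_obs, mask, rank, n_iters=50, reg=1e-6):
    """Alternating minimization for low-rank matrix completion (sketch).

    M_obs: observed matrix with zeros at unobserved positions
    mask:  boolean matrix, True where an entry is observed
    rank:  target rank r; factors U (m x r) and V (n x r) are updated
           alternately by solving row-wise regularized least squares.
    """
    m, n = M_obs.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((m, rank))
    V = rng.standard_normal((n, rank))
    for _ in range(n_iters):
        # Fix V, update each row of U by least squares over observed entries
        for i in range(m):
            obs = mask[i]
            Vi = V[obs]
            A = Vi.T @ Vi + reg * np.eye(rank)
            U[i] = np.linalg.solve(A, Vi.T @ M_obs[i, obs])
        # Fix U, update each row of V symmetrically
        for j in range(n):
            obs = mask[:, j]
            Uj = U[obs]
            A = Uj.T @ Uj + reg * np.eye(rank)
            V[j] = np.linalg.solve(A, Uj.T @ M_obs[obs, j])
    return U, V
```

Under the incoherence conditions cited by the reviewer, analyses in the MF literature show that iterations of this type recover the true low-rank matrix; the missing piece noted above is an analogous statistical guarantee for the paper's setting.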
Reviewer 2
I've read the response and maintain my score.
==============================================================
1. Originality: This paper improves the general distributed gradient descent for matrix factorization under suitable conditions.
2. Quality: The proposed algorithm is simple and easy to implement, and its analysis is comprehensive and elegant.
3. Clarity: The statements of the theorems and their preliminaries are clear. Even though I am not familiar with the technical details of this paper, the main idea can be understood.
4. Significance: This paper offers a good starting point for generalizing the standard DGD algorithm.
Reviewer 3
UPDATE AFTER REBUTTAL: Thanks for your responses. I would appreciate it if you could incorporate the responses from the rebuttal into the paper (in case of acceptance). Especially:
- better highlighting of the contributions (moving some of the cited auxiliary results to the appendix might also be worth considering)
- discussion of stochastic updates
- for the promised discussion of alternative approaches: please check again whether all primal-dual methods really require a 'star topology', and not just a connected graph, and include references to the literature; similarly for the comment on gossip averaging: the literature already discusses many variants (with different 'orders' of averaging/update steps), so please check again there as well (see the sketch after this review).
---
Clarity: Overall, the paper is well written and easy to follow. However, I would encourage the authors to highlight contributions w.r.t. previous works more clearly (e.g., pages 4-5 consist of a host of theorems cited from previous work, the transition to the new contributions is a bit vague, and most results look like trivial extensions of these cited results, so commenting on this would help guide readers to the novel points).
Quality: The theorems all look sound; however, only asymptotic results are provided. Further, comparisons to alternative baselines (instead of DSG) would strengthen the paper.
Originality & Significance: As outlined above, the significance could potentially be estimated more favorably if discussion of and comparisons to strong baselines were added.
Related work: With many recent advances on the matrix completion problem, such as [Matrix Completion has no Spurious Local Minima, https://arxiv.org/pdf/1605.07272.pdf], the results here might not come as a complete surprise (though they may for the decentralized setting). Could the authors comment on connections to this work?
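To make the gossip-averaging remark concrete, the sketch below shows the two common orderings of the averaging and local gradient steps in decentralized gradient descent. It is a generic illustration, not the paper's algorithm; the function name, the `order` argument, and the mixing matrix `W` are assumptions for the example.

```python
import numpy as np

def dgd_step(x, grads, W, step, order="combine_then_adapt"):
    """One round of decentralized gradient descent with gossip averaging.

    x:     (n_agents, d) array, one local iterate per agent
    grads: (n_agents, d) array of local gradients evaluated at x
    W:     (n_agents, n_agents) doubly-stochastic mixing matrix of a
           connected communication graph
    order: which ordering of averaging and gradient steps to use
           (the 'orders' mentioned in the review)
    """
    if order == "combine_then_adapt":
        # gossip-average neighbors' iterates, then take a local gradient step
        return W @ x - step * grads
    else:  # "adapt_then_combine"
        # take a local gradient step first, then gossip-average the results
        return W @ (x - step * grads)
```

Both orderings require only a doubly-stochastic mixing matrix over a connected communication graph, not a star topology, which is the point the review asks the authors to double-check against the literature.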