Reviews: End to end learning and optimization on graphs

* Introduces a GNN layer that sits between OptNet and generic GNNs: k-means is run in the forward pass, and the backward pass uses implicit differentiation. The architecture is well-suited to problems with latent cluster-based structure. Also introduces differentiable relaxations of problem-specific decision-theoretic losses and a nice approximate implicit derivative that speeds up computation through the k-means fixed point.
* Reviewers are generally positive, though AC wasn’t particularly happy with the depth of the reviews and went through the paper in detail. The paper is of good quality, and the new approximation for differentiating through k-means may have more general applicability. However, AC has the following concerns about related work:
  1. Hierarchical Graph Representation Learning with Differentiable Pooling (NeurIPS 2018, https://arxiv.org/abs/1806.08804). Specifically, look at Eq. 3-4 in this paper, where r from the submission corresponds to S, and mu from the submission corresponds to X. This is a simpler alternative way of producing the assignment probabilities and centroid vectors within a GNN architecture. AC would have appreciated a baseline that replaced the k-means step in the submission with this step, followed by the same loss function.
  2. Generalizable Approximate Graph Partitioning (GAP) gives a differentiable proxy of the normalized cut objective: https://arxiv.org/abs/1903.00614
  3. Could also cite https://arxiv.org/abs/1803.06396 and the original Almeida-Pineda works (see abstract therein) in the context of the implicit function theorem. The discussion around L176-189 should mention that it’s possible to run an iterative algorithm in the backward pass to compute the implicit gradient without storing the forward pass (see Almeida-Pineda ’87). This approach doesn’t have the drawbacks of differentiating through the unrolled algorithm that are discussed there.
* We discussed this in the discussion phase. The reviewers felt these additions would be nice to have but didn't change their recommendation of acceptance. AC is ok with that, but the authors should add discussion of these works, and AC would encourage them to experimentally compare to 1 for the final version.
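
To make point 3 concrete, here is a minimal sketch of an Almeida-Pineda style backward iteration. It is not the submission's k-means layer: the fixed-point map x = tanh(W x + b), the loss 0.5 * ||x*||^2, and the function names `forward_fixed_point` and `implicit_grad_b` are all assumptions chosen for illustration. The point is only that the backward pass needs the converged x* and nothing from the forward iterates.

```python
import numpy as np

def forward_fixed_point(W, b, n_iter=200):
    """Iterate x <- tanh(W x + b) to a fixed point (toy map, assumed contractive)."""
    x = np.zeros_like(b)
    for _ in range(n_iter):
        x = np.tanh(W @ x + b)
    return x  # only the final iterate is kept; no forward history is stored

def implicit_grad_b(W, x_star, grad_x, n_iter=200):
    """Gradient of the loss w.r.t. b via an iterative backward pass.

    Solves v = grad_x + J^T v by fixed-point iteration, where J = df/dx at x*,
    so that v = (I - J^T)^{-1} grad_x, then applies (df/db)^T.
    """
    d = 1.0 - x_star ** 2      # tanh'(W x* + b), using x* = tanh(W x* + b)
    J = d[:, None] * W         # J = diag(d) @ W
    v = np.zeros_like(grad_x)
    for _ in range(n_iter):
        v = grad_x + J.T @ v   # converges when the spectral radius of J is < 1
    return d * v               # df/db = diag(d)

rng = np.random.default_rng(0)
n = 5
W = rng.standard_normal((n, n))
W *= 0.5 / np.linalg.norm(W, 2)   # force spectral norm 0.5 so the map contracts
b = rng.standard_normal(n)

x_star = forward_fixed_point(W, b)
# Toy loss L = 0.5 * ||x*||^2, so dL/dx* = x*.
grad_b = implicit_grad_b(W, x_star, grad_x=x_star)
```

In this toy setting the iteration agrees with a direct linear solve of (I - J^T) v = dL/dx*; the iterative form is what lets the backward pass avoid both the stored unrolled trajectory and an explicit matrix inverse.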

Paper ID:	2620
Title:	End to end learning and optimization on graphs