NeurIPS 2019
Sun Dec 8th through Sat Dec 14th, 2019, at the Vancouver Convention Center
Paper ID 4900: A General Theory of Equivariant CNNs on Homogeneous Spaces

### Reviewer 2

Having read the authors' feedback and the other reviews, I am increasing my score from 5 to 6. If the paper is accepted, I would ask the authors to dedicate more space to worked examples and to differentiate their work from the existing literature in more detail.

In terms of novelty over previous work on equivariant CNNs, this paper is a mild step forward: it brings in sections of associated vector bundles as feature spaces and allows for general representations on the fibers. The clarity of this mathematical language is admittedly nice, and I think it will help researchers think about general equivariant CNNs in the future. However, Section 8 did not do a sufficient job of clarifying, through relevant examples, what the new theory can do that previous papers could not. The main theorem is a classification of the equivariant linear maps between such feature spaces. The organization of the paper is probably not optimal for NeurIPS: the body of the paper, Sections 2-6, reviews general mathematical constructions. Some of this material, e.g. the proofs in Section 6, could presumably be relegated to appendices, leaving more space for improved discussion.
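The classification result can be made concrete in the simplest discrete setting. The following is a minimal sketch (my own illustration, not code from the paper, which treats general homogeneous spaces) for scalar feature fields on Z^2 under the group p4 of translations and 90-degree rotations: for trivial input and output representations, the kernel constraint reduces to rotation invariance of the filter, and any filter can be projected onto the equivariant subspace by averaging over the rotation subgroup C4. Function names are my own; numpy is assumed.

```python
import numpy as np

def correlate2d(image, kernel):
    """Plain 'valid'-mode cross-correlation, written out for clarity."""
    H, W = image.shape
    h, w = kernel.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

def project_to_equivariant(kernel):
    """Project an arbitrary filter onto the space of C4-invariant filters
    by averaging its four 90-degree rotations (an orbit average)."""
    return sum(np.rot90(kernel, k) for k in range(4)) / 4.0

# With an invariant kernel, correlation commutes with 90-degree rotation:
rng = np.random.default_rng(0)
image = rng.standard_normal((6, 6))
psi = project_to_equivariant(rng.standard_normal((3, 3)))
lhs = correlate2d(np.rot90(image), psi)   # rotate, then correlate
rhs = np.rot90(correlate2d(image, psi))   # correlate, then rotate
assert np.allclose(lhs, rhs)
```

For non-trivial fiber representations the constraint is richer: it couples the rotation of the kernel's domain with the representation matrices acting on the kernel's values, which is exactly what the paper's theorem classifies in general.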

### Reviewer 3

The paper studies the following problem: if we consider a base space that admits a transitive group action, and if the feature maps in the neural network layers operating on this space are fields, then what is the most general form of an equivariant linear map between the layers? The first contribution of the paper is to state and prove a theorem saying that any such linear map is a cross-correlation/convolution with a specially constrained kernel, called an equivariant kernel. The proof follows in a very straightforward manner from Mackey's theorem, a.k.a. Frobenius reciprocity, which essentially describes how induction and restriction interact with one another. It turns out that this is precisely the language needed to describe the equivariant networks discussed in this paper (and implicitly in many experimental papers). The proof is elegant and natural, and no details are omitted.

Next, in a somewhat abstract manner, the paper also describes what such constrained kernels look like. To me personally this is the most useful part, since it gives practitioners a systematic procedure for deriving the right kernel for convolution. It is also useful in other ways: for example, there has been recent work positing that for continuous groups it is perhaps preferable to always operate in Fourier space; to enforce locality one then needs an appropriate notion of wavelets. The two approaches are equivalent, but I find the approach presented in the paper more transparent vis-a-vis jointly enforcing locality and equivariance. Appropriate equivariant non-linearities are also described. Lastly, useful examples are given regarding spherical CNNs and SE(3)-steerable CNNs, which do a good job of making the discussion a bit more concrete (although still in the abstract space :).
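As a concrete instance of "equivariant linear map = correlation with a constrained kernel", one can sketch the lifting layer of a discrete rotation-equivariant CNN (in the spirit of group-equivariant CNNs; this is my own illustration, not code from the paper): correlating a planar image with all four rotated copies of a filter produces a feature map with an extra C4 group axis, and rotating the input permutes that axis while rotating each spatial plane. Names and shapes are illustrative choices; numpy is assumed.

```python
import numpy as np

def correlate2d(image, kernel):
    """Plain 'valid'-mode cross-correlation."""
    H, W = image.shape
    h, w = kernel.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

def lifting_correlation(image, kernel):
    """Correlate with all four 90-degree rotations of the kernel; the
    output gains a C4 'group' axis of size 4 (one plane per rotation)."""
    return np.stack([correlate2d(image, np.rot90(kernel, k)) for k in range(4)])

rng = np.random.default_rng(1)
image = rng.standard_normal((6, 6))
kernel = rng.standard_normal((3, 3))
out = lifting_correlation(image, kernel)
out_rot = lifting_correlation(np.rot90(image), kernel)
# Equivariance: rotating the input cyclically shifts the group axis
# and rotates each spatial plane.
for k in range(4):
    assert np.allclose(out_rot[k], np.rot90(out[(k - 1) % 4]))
```

Roughly, the group axis plays the role of the fiber coordinate in the induced-representation picture, and subsequent layers must use kernels constrained over the full group rather than just the plane.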

### Reviewer 4

This is an "emergency review," so it might be shorter than the norm.

There is a substantial literature on steerability in the classical image processing domain. Recently, it has become clear that generalizing steerability to the actions of groups other than SE(2) is important for constructing certain classes of neural networks. Steerability can be described in different ways, some more abstract than others. This paper uses the language of fiber bundles, which is beautiful and enlightening but somewhat abstract. The paper makes no apologies about being abstract, and I can understand that somebody who comes to ML more from the applications side than from the mathematical side might find it difficult to digest. On the other hand, the paper also states that "This paper does not contain fundamentally new mathematics (in the sense that a professional mathematician with expertise in the relevant subjects would not be surprised by our results)." I like this honesty. In actual fact, I don't think that either of the above detracts from the value of the paper.

This paper forms an important bridge between the neural nets literature and certain branches of algebra. I found it very enlightening. I appreciate the effort that the authors have made to show how a wide range of other works fit into their framework. I also think that the exposition is very straightforward: it does not attempt to gloss over or hide any of the underlying mathematical concepts, yet at the same time it avoids getting bogged down in mathematical minutiae or a long list of definitions. The authors have clearly made an attempt to say things in a way that is "as simple as possible but not simpler." It is quite an achievement to expose all the concepts they need in 8 pages.