Review for NeurIPS paper: Spin-Weighted Spherical CNNs

NeurIPS 2020

Spin-Weighted Spherical CNNs

Review 1

Summary and Contributions: This paper presents a novel and efficient spherical CNN. The idea is to use spin-weighted spherical functions, which were introduced in physics for the study of gravitational waves. The new spherical CNNs are built on a novel convolution between spin-weighted functions, enabling CNNs on non-Euclidean domains that are both expressive and efficient. Spin-weighted spherical harmonics have applications in gravitational radiation and electromagnetic theory. They generalize the standard spherical harmonics (recovered by setting $s$ to zero) and are more expressive. To handle rotations easily, the equivalent convolution definition from [5] is used. The proposed approach is novel but also a little overcomplicated. Update: After reading the authors' feedback, I think my concerns are well addressed. I have therefore raised my score and vote for acceptance.

Strengths: (1) Leveraging spin-weighted spherical functions to build spherical CNNs is novel and interesting. The proposed approach can achieve the expressivity of SO(3) convolutions while keeping the efficiency of CNNs between spherical functions. (2) Using spin-weighted spherical functions can achieve equivariance over vector fields. (3) A detailed and useful background introduction is given. (4) Experiments on spherical image classification, spherical vector field classification, and spherical vector field prediction show promising performance and efficiency. (5) Examples are provided for intuitive explanation.

Weaknesses: The proposed approach is slower than [15] because of the forward and backward Fourier transforms, though this is acceptable. The running time of [7] is not shown in Table 1. Source code is not provided. Could the authors show results on the same tasks as [15], such as 3D object classification (ModelNet40) and 3D object retrieval (ShapeNet Core55), so that the methods are directly comparable?

Correctness: I did not make a full check on the maths.

Clarity: The paper is well written and organized. It is easy to follow overall but I think it would be better to give more intuitive explanations.

Relation to Prior Work: Relations to previous contributions such as equivariant CNNs, spherical CNNs, and equivariant vector fields are discussed. One related work is missing: "Spherical CNNs on Unstructured Grids" (ICLR 2019).

Reproducibility: Yes

Additional Feedback: 1. Are there possible ways to further improve the efficiency to match that of [15]? The authors claim several times that the proposed approach is efficient, but according to the complexity analysis it is slower than [15], correct? 2. The running time of some baselines is missing. 3. In equation (2), the number of bands is usually limited in practice, right? So it is not an infinite sum. 4. The authors use the equivariant convolution definition (12) from Boyle, but this definition does not hold when the transformation is more complicated, as stated in [5]. Will this cause any issues for the model? 5. I would also like to see results on the same tasks as in [15]. 6. What are the benefits of the spin basis compared to Wigner D-matrices?
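
Point 3 can be illustrated with a short worked equation. This is only a sketch: it assumes that equation (2) of the paper is the usual spherical harmonic expansion (the equation is not reproduced in this review), and the bandlimit symbol $B$ is introduced here purely for illustration.

```latex
% Assumed form of the expansion: an infinite sum over degrees \ell
f(\theta,\phi) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} \hat{f}_\ell^m \, Y_\ell^m(\theta,\phi)

% Truncated form used in practice, with a hypothetical bandlimit B
f(\theta,\phi) \approx \sum_{\ell=0}^{B-1} \sum_{m=-\ell}^{\ell} \hat{f}_\ell^m \, Y_\ell^m(\theta,\phi)
```

The truncated sum has $B^2$ coefficients and is exact only when $f$ is band-limited, i.e. when $\hat{f}_\ell^m = 0$ for all $\ell \ge B$.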

Review 2

Summary and Contributions: This work designs a spin-weighted spherical CNN kernel with several properties: it has more expressive representations than scalar CNNs, enables anisotropic filters, and is well suited to vector field-related tasks.

Strengths: The paper defines a novel convolution that handles vector fields and outperforms existing methods on traditional tasks.

Weaknesses: It is hard for me to fully understand the mathematics. I feel that the authors should give more details on SWSHs. If space is limited, consider shortening the part on spherical harmonics, since those are more familiar to readers. Many variables and notations are used without a clear definition, which makes the paper hard for readers to access. From my understanding, the authors define a new and powerful kernel for rotationally equivariant CNNs. However, rotation equivariance is not a desirable property for many kinds of data; e.g., spherical images usually have a fixed up/gravity vector, and vector fields are commonly studied on spheres such as the Earth, where the poles are clearly defined. These issues are properly addressed by Jiang, Chiyu, et al. "Spherical CNNs on unstructured grids." arXiv preprint arXiv:1901.02039 (2019). However, this important related work is missing, and there is no comparison with it. Another weakness concerns the experiments. The paper only experiments with MNIST, which is insufficient to demonstrate the usefulness of the method. Therefore, I strongly recommend that the authors perform experiments on real data (as in "Spherical CNNs on unstructured grids") and compare with that work.

Correctness: Seemingly correct, after I spent some effort checking the mathematical background.

Clarity: The introduction is fine. The treatment of the fundamental mathematics is too brief for readers to understand.

Relation to Prior Work: Missing related works: 1. Jiang, Chiyu, et al. "Spherical CNNs on unstructured grids." arXiv preprint arXiv:1901.02039 (2019). 2. Huang, Jingwei, et al. "Texturenet: Consistent local parametrizations for learning from high-resolution signals on meshes." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.

Reproducibility: Yes

Additional Feedback: The authors mentioned climate data. Please provide experiments (check Jiang et al.) to make it more convincing.

Review 3

Summary and Contributions: Recent work on spherical CNNs can be broadly classified into two categories: one that lifts the functions on the sphere (both the input image and the filter) onto the group SO(3), where the convolution is performed, and another that operates fully on the sphere, although such networks do not allow for anisotropic filters and are thus of limited expressivity. The main contribution of this paper is to propose spherical CNNs that allow for anisotropic filters and never leave the spherical domain, and are thus much more efficient to implement while giving good experimental performance. To this end, the authors employ spin-weighted spherical harmonics (a generalization of spherical harmonics that undergo a phase change when the sphere is rotated) to perform the convolution in the spectral domain. Spin-weighted spherical functions (SWSFs) allow for more expressive representations than the usual scalar spherical functions. In fact, SWSFs are gauge fields that take values in a line bundle over the sphere. To give a complete specification of the CNN, analogs of non-linearities, batch normalization, and pooling layers are defined. Finally, as mentioned above, since SWSFs can be interpreted as equivariant vector fields, the proposed method is useful when the inputs or outputs are vector fields. One could think of the proposed method as the analog of the "Harmonic Networks" of Worrall et al., but with functions defined on a sphere. Section 3 covers the background. It first covers the usual spherical harmonics formulation, followed by spin-weighted spherical harmonics (SWSHs). From the formulation it is also apparent that SWSHs can be viewed as functions on the rotation group with a sparse spectrum. This also gives good intuition for how the proposed method lies somewhere between the two approaches described above. Section 4 defines the usual notion of cross-correlation, now using SWSHs, and clearly shows its equivariance properties.
The following section provides an analog of batch normalization and gives equivariant non-linearities. The experimental results are compared with existing spherical CNNs and show superior performance. In summary, the main contribution of the paper is to formulate spherical convolution using SWSHs, which allows working with inputs and outputs that are vector fields, using anisotropic filters, and always operating on the sphere. The result is improved accuracy and reduced computational cost.
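
To make the idea of an equivariant non-linearity concrete, here is a minimal sketch. It assumes a magnitude-based non-linearity in the style of Harmonic Networks, which is not necessarily the one used in the paper, and it assumes that a frame rotation by an angle multiplies a spin-$s$ value by $e^{-is\psi}$ (sign conventions vary). The function names `magnitude_relu` and `rotate_phase` are hypothetical.

```python
import cmath


def magnitude_relu(z: complex, bias: float) -> complex:
    """Apply ReLU to |z| + bias and keep the phase of z unchanged."""
    r = abs(z)
    if r == 0.0:
        return 0j  # the phase is undefined at zero, so return zero
    return max(r + bias, 0.0) * (z / r)


def rotate_phase(z: complex, spin: int, angle: float) -> complex:
    """Assumed transformation of a spin-weighted value: multiply by exp(-i*spin*angle)."""
    return z * cmath.exp(-1j * spin * angle)


if __name__ == "__main__":
    z, spin, angle, bias = 0.8 + 0.6j, 2, 0.7, -0.3
    # Non-linearity after the phase change vs. phase change after the non-linearity.
    first = magnitude_relu(rotate_phase(z, spin, angle), bias)
    second = rotate_phase(magnitude_relu(z, bias), spin, angle)
    print(abs(first - second))  # expected to be at the level of rounding error
```

Because the non-linearity only rescales the magnitude, it commutes with any pure phase change, which is the property needed for the output to remain a spin-weighted quantity of the same spin.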

Strengths: The main claim of the paper is that using spin-weighted spherical harmonics in place of spherical harmonics allows for more powerful spherical CNNs, in particular because they allow anisotropic filters while always operating on the sphere, which also makes the spherical CNN more efficient. In this sense, the contribution of the paper is only a change in how the convolution in spherical CNNs is performed. However, the formulation is elegant and gives an immediate boost in performance.
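
The efficiency argument can be made more concrete by counting spectral coefficients. The following is a minimal sketch under stated assumptions: degrees run from $|s|$ up to a hypothetical bandlimit (exclusive), a spin-weighted function on the sphere has $2\ell+1$ coefficients per degree, and a function on SO(3) has $(2\ell+1)^2$. The function names are illustrative and not taken from the paper.

```python
def num_sphere_coeffs(bandlimit: int, spin: int = 0) -> int:
    """Coefficients of a spin-weighted spherical function with |spin| <= l < bandlimit."""
    return sum(2 * l + 1 for l in range(abs(spin), bandlimit))


def num_so3_coeffs(bandlimit: int) -> int:
    """Coefficients of a function on SO(3): one (2l+1) x (2l+1) block per degree l."""
    return sum((2 * l + 1) ** 2 for l in range(bandlimit))


if __name__ == "__main__":
    for b in (8, 16, 32):
        # bandlimit, scalar sphere, spin-1 sphere, SO(3)
        print(b, num_sphere_coeffs(b), num_sphere_coeffs(b, spin=1), num_so3_coeffs(b))
```

Under these assumptions the spherical counts grow quadratically with the bandlimit while the SO(3) count grows cubically, which is consistent with the "sparse spectrum" view of spin-weighted functions as a subset of functions on the rotation group.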

Weaknesses: There are three sets of experiments in the paper, each of which operates on a variation of spherical MNIST. This can be seen as a negative, since it makes it impossible to fully assess the computational benefits that are used as motivation at the beginning. Some examples of non-trivial data would be: CMB (Cosmic Microwave Background) data, such as predicting polarization (since that is a vector field), or publicly available geophysical data whose inputs are magnetic fields over the Earth.

Correctness: Yes.

Clarity: Yes, overall it is well written. Some minor comments: Section 2, line 75: "Cohen and Welling [11] formalize these models and name them group convolutional neural networks (G-CNNs)". It would be preferable to write "formalized these models and named them...". The same applies at line 96: "Cohen et al. introduce...".

Relation to Prior Work: Relation to prior work is well documented. One suggestion: the authors might want to cite https://openreview.net/forum?id=HJeYSxHFDS, which also allows for anisotropic filters on the sphere and is gauge equivariant. However, this work is not available on arXiv and does not have a DOI. Moreover, OpenReview is not indexed. Therefore, it is entirely at the authors' discretion whether to cite it. I just wanted to bring it to their notice.

Reproducibility: Yes

Additional Feedback: Update after feedback: Thanks to the authors for addressing some of the concerns raised. The additional experiments are welcome and will strengthen the paper.

Review 4

Summary and Contributions: This paper introduces spin-weighted spherical CNNs, a new type of spherical CNN that allows anisotropic filters in an efficient way, without ever leaving the spherical domain. The key idea is to consider spin-weighted spherical functions, which were introduced in physics in the study of gravitational waves. These are complex-valued functions on the sphere whose phases change upon rotation. The method uses sets of spin-weighted spherical functions as features and filters, and employs layers of a newly introduced spin-weighted spherical convolution to process spherical images or spherical vector fields. Experiments show that the proposed method outperforms isotropic spherical CNNs while still being much more efficient than using SO(3) convolutions. The model achieves superior performance on the tasks attempted, at a reasonable computational cost.

Strengths: Overall, this is a strong paper with a very solid technical foundation. Although the idea originally comes from physics, its application in computer vision is novel. The method appears to outperform the existing state of the art.

Weaknesses: As a reviewer not very familiar with the technical part, I found it hard to understand all the mathematical formulas in the paper. It would be better if the paper provided a more self-contained description so that ordinary readers can follow it more easily. For example, in Table 3, NR and R are not explained. Also, the three examples given in the paper seem relatively idealized. It would be better to show some more realistic applications. I do understand that this paper is more focused on the theoretical foundations.

Correctness: Yes

Clarity: Yes

Relation to Prior Work: Yes

Reproducibility: Yes

Additional Feedback: I will keep my ratings after the rebuttal.