Review for NeurIPS paper: On the equivalence of molecular graph convolution and molecular wave function with poor basis set

NeurIPS 2020


Review 1

Summary and Contributions: This paper makes three contributions: 1. An opinion piece that can be summarized as a call for ML researchers tackling application problems to familiarize themselves with the application domain. 2. A comparison between an LCAO wave function model of a molecule and its representation in a graph convolutional network. 3. An energy-prediction neural network called QDF, based on LCAO evaluated at the atom locations and followed by two fully connected neural networks, one predicting the energy and one predicting the potential, in order to encourage a Hohenberg-Kohn-type one-to-one density-to-potential mapping.
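
For reference, the LCAO ansatz mentioned in item 2 is conventionally written as below. This is the standard textbook form, given only as a reading aid; the symbols (coefficients c_{ij}, atom positions R_j, occupation numbers n_i) are chosen here and may differ from the manuscript's notation.

```latex
% Molecular orbital i as a linear combination of N atom-centered basis functions
\psi_i(\mathbf{r}) = \sum_{j=1}^{N} c_{ij}\, \phi_j(\mathbf{r} - \mathbf{R}_j)

% Electron density built from the molecular orbitals with occupation numbers n_i
\rho(\mathbf{r}) = \sum_{i} n_i \, \lvert \psi_i(\mathbf{r}) \rvert^{2}
```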

Strengths: It is true that the field of chemistry does not benefit from chemistry ML research performed in a vacuum, which leads, e.g., to molecular generative models producing 18-atom aromatic rings. It is good for ML researchers to know the language and objectives of their field of application. The LCAO-based QDF does a good job of predicting molecular properties for molecules whose atom counts lie outside the range of molecule sizes in the training set, compared to a previously published GCN.

Weaknesses: There are worrying imprecisions in many places in the manuscript. LCAO is not really a basic assumption (l38) but a first-order approximation; notably, it is the solution to the Born-Oppenheimer-approximated Schrödinger equation at infinite distance between atoms. LCAO being an approximation means that it has limits and requires corrections, which end up being nonlinear. Criticizing GCN for being too nonlinear seems out of place in that regard: LCAO is not nonlinear enough in the same sense. l57 states that one can conclude from the fact that the Hohenberg-Kohn map is nonlinear that one can model it with a simple feedforward DNN, which is not an adequate conclusion without evidence. l61/62 state that the proposed QDF model competes with SchNet on the QM9 energy prediction task, but no results on full QM9 are listed anywhere. The comparison of LCAO to GCN seems somewhat forced. The proposed model could be presented completely without this comparison and not much information would be lost. The descriptions of the LCAO and QDF network approaches conspicuously lack a harmonic term. Is this actually omitted in this work? The polynomial term is radial, so it cannot absorb spherical harmonics into a Cartesian polynomial. Omitting spherical harmonics does not do the orbital functions justice and will result in a lack of precision. Consider adding these if they are not present. If they are present, why are they omitted in the manuscript? The present work is not the first to show that simple functionals of atom position involving orbital-like functions followed by a simple regression can compete with state-of-the-art neural network approaches for energy prediction. See e.g. [Eickenberg 2018, Solid Harmonic Wavelet Scattering for Predictions of Molecule Properties]. While it is laudable that the present work proposes a new task on QM9, namely to split the dataset according to molecule sizes, it neglects several subtleties. QM9 is called QM9 because it is a dataset of molecules with up to 9 heavy atoms (counting atoms excluding hydrogen). Splitting the dataset by total atom count instead of heavy-atom count will lead to a diverse mix of heavy-atom counts on either side of the split. For the purposes of many property prediction tasks in non-ionized and stable molecules, the hydrogen atoms are subsumed and thus irrelevant for the prediction of the property, so in effect the proposed split does not actually lead to extrapolation. An evaluation of splitting according to heavy atoms was proposed in [Eickenberg 2018] and suggested as a general benchmark. Additionally and unfortunately, the QM9 dataset does not represent all heavy-atom counts with the same frequency. It would be a good idea to propose several dataset splits and not only one.
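
To make the difference between the two splitting criteria concrete, here is a minimal sketch. It assumes each molecule is given simply as a list of element symbols; the helper names `heavy_atom_count` and `split_by_size` and the toy molecules are hypothetical and are not taken from the manuscript or from the QM9 files.

```python
def heavy_atom_count(symbols):
    """Number of non-hydrogen atoms in a molecule given as element symbols."""
    return sum(1 for s in symbols if s != "H")


def split_by_size(molecules, max_train_size, size_fn):
    """Molecules with size_fn(mol) <= max_train_size go to train, the rest to test."""
    train, test = [], []
    for mol in molecules:
        (train if size_fn(mol) <= max_train_size else test).append(mol)
    return train, test


# Toy molecules (illustrative only).
methane = ["C", "H", "H", "H", "H"]   # 5 atoms, 1 heavy atom
ethane = ["C", "C"] + ["H"] * 6       # 8 atoms, 2 heavy atoms
co2 = ["C", "O", "O"]                 # 3 atoms, 3 heavy atoms
hcn = ["H", "C", "N"]                 # 3 atoms, 2 heavy atoms
molecules = [methane, ethane, co2, hcn]

# Split by total atom count (hydrogens included).
train_total, test_total = split_by_size(molecules, 5, len)

# Split by heavy-atom count.
train_heavy, test_heavy = split_by_size(molecules, 2, heavy_atom_count)

# Largest heavy-atom count seen in training under each criterion.
max_heavy_in_train_total = max(heavy_atom_count(m) for m in train_total)
max_heavy_in_train_heavy = max(heavy_atom_count(m) for m in train_heavy)
```

In this toy case the total-atom split sends ethane (2 heavy atoms) to the test set although the training set already contains CO2 with 3 heavy atoms, so the test molecule is not larger in the heavy-atom sense. Under the heavy-atom split, the only test molecule (CO2) exceeds every training molecule in heavy-atom count.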

Correctness: See weaknesses. Several statements are imprecise.

Clarity: The paper lapses into imprecision at several points, often under the guise of simplifying things for the ML researcher.

Relation to Prior Work: See weaknesses. Some prior work modeling molecule properties using simple Gaussian-type orbitals followed by regression was not taken into account. The paper also avoids comparison to previous methods on any of the original QM9 tasks.

Reproducibility: No

Additional Feedback: It is questionable whether the repeated insinuations that ML researchers have no understanding at all of the subject matter they are working on are well-placed in this context. For ML researchers it is generally advisable to be informed about their domain of application, and I believe most ML researchers take this to heart. If there is a pervasive feeling that this is not the case for computational chemistry, then it might be a good idea to publish an opinion piece in a visible location. Admittedly, the NeurIPS community might be exactly the right addressee for this (but a NeurIPS paper might not be the right vehicle). However, it is entirely unclear to me who exactly is meant by “the ML community” from the point of view of the authors. The authors explicitly exclude [24, 23] (see line 250), which to me are central machine learning works in chemistry. An elucidation of who is meant by “the ML community” would be helpful. Minor: “Jones” should be cited as “Lennard-Jones”.

Review 2

Summary and Contributions: This paper compares different deep learning approaches to modeling the quantum mechanical properties of molecules, and presents a model that incorporates multiple ideas from physics, including (1) the linear combination of atomic orbitals to obtain molecular orbitals and (2) a constraint on the external potential function. While none of these ideas are new in themselves, the authors provide a clear and useful analysis describing the differences between modeling approaches (i.e., graph convolution networks and density functional theory models from physics), and provide a particular model that captures important physics intuition and does well in experiments.

Strengths: There are many ways to use deep learning to model data with complex structure, such as graphs and molecules, and understanding the relevant issues is important to the NeurIPS community, especially considering how much of the work on graph neural networks has been motivated by applications to small molecules. This work attempts to tease apart some of the subtle (but important) differences in these models for a particular physics-oriented application, but I think the lessons are relevant beyond the particular scope of this application. The authors provide compelling theoretical grounding and empirical evaluation.

Weaknesses: The proposed method is only compared to alternatives on a single task.

Correctness: I had no complaints.

Clarity: The paper is clear and well written.

Relation to Prior Work: Yes.

Reproducibility: Yes

Additional Feedback:

Review 3

Summary and Contributions: The paper studies the link between graph convolutional networks (GCNs) trained on molecules for the prediction of molecular energies and the linear combination of atomic orbitals (LCAO) method from quantum chemistry. In doing so, the authors contribute the following: 1. Parallels between the two approaches. 2. Differences between the two approaches, and potential shortcomings of GCNs for molecules. 3. Based on the two previous points, a new method which the authors refer to as quantum deep field (QDF). 4. Empirical evidence of the efficacy of the QDF model based on an extrapolation task, not merely interpolation. 5. The authors also advocate for new benchmarks in machine learning for physical sciences in which models are judged by their extrapolation capabilities, which the authors suggest is a better metric to determine whether the machine learned model learned underlying physical principles or simply fit non-physical patterns within the training set.

Strengths: 1. The link between GCNs and the LCAO method provides insight into what GCNs for molecules get right, and what they get wrong. In light of the referenced work on the difficulty of improving GCN performance with depth, the comparison and subsequent arguments presented by the authors are compelling. For anyone in machine learning designing GCNs specifically with molecules in mind (seemingly a non-trivial subset of graph neural network papers given their use in chemistry and drug design, among other areas), the insights provided in this paper are highly relevant. 2. Based on these arguments, the architecture for the QDF is well motivated. I found particularly interesting the way in which the physical constraints are imposed via the Hohenberg-Kohn theorem/map, which seemed to have significant novelty over standard physical loss constraints in many other papers. 3. The empirical results, while constrained to QM9, are compelling, particularly the extrapolation capabilities of the QDF method compared to the GCN. 4. I am sympathetic to the authors' advocacy for judging machine-learned models for physical sciences by their extrapolation capability, and think this point is valid for anyone in the NeurIPS community applying machine learning to physical problems.

Weaknesses: 1. In the larger field of GCNs generally, not specifically for molecules, the insights presented in this paper are perhaps of limited interest. 2. I would have liked to see further discussion of the computational complexity of the proposed QDF. While the QDF has significantly fewer learned parameters than, for example, SchNet, how easy or hard is it to train the QDF model and, once trained, to evaluate it on a new molecule? Two particular areas for clarification come to mind: (i) the “alternating” training of the energy DNN versus the HK DNN; and (ii) the computational complexity associated with the need to use a grid with G grid points to evaluate the vector-format atomic orbital.
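
As a rough illustration of where the grid cost in point (ii) comes from, the sketch below evaluates one radial Gaussian-type function per atom on every grid point. It is not the authors' implementation: the function name `orbital_matrix`, the unnormalized radial form r^(q-1) exp(-zeta r^2), and all numerical values are assumptions made for the example. With G grid points and N basis functions, the resulting array has G x N entries, so time and memory for this step both grow as O(G * N) per molecule.

```python
import numpy as np


def orbital_matrix(grid, centers, zetas, qs):
    """Evaluate one radial Gaussian-type function per center on every grid point.

    grid:    (G, 3) array of field points
    centers: (N, 3) array of atom positions
    zetas:   (N,) array of exponents (assumed values)
    qs:      (N,) integer array, each >= 1 (assumed values)
    Returns a (G, N) array.
    """
    # Pairwise distances between grid points and centers, shape (G, N).
    dist = np.linalg.norm(grid[:, None, :] - centers[None, :, :], axis=-1)
    # Unnormalized radial part r^(q-1) * exp(-zeta * r^2), broadcast over columns.
    return dist ** (qs - 1) * np.exp(-zetas * dist ** 2)


rng = np.random.default_rng(0)
centers = rng.normal(size=(5, 3))    # N = 5 atoms
grid = rng.normal(size=(200, 3))     # G = 200 grid points
phi = orbital_matrix(grid, centers, np.full(5, 0.5), np.array([1, 1, 2, 2, 1]))

# One orbital with unit coefficients, squared to give a density-like field on the grid.
density_like = (phi @ np.ones(5)) ** 2
```

Here `phi` has shape (200, 5); doubling either the number of grid points or the number of atoms doubles the size of this array, which is the scaling the question about G refers to.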

Correctness: Everything appears to be correct, and the supplementary material provides sufficient detail on the numerical experiments.

Clarity: I found the paper to be very well written. The only exception to this was the last paragraph of Section 2 that contains equation (8). The reasoning for this correct description of the LCAO versus what came before in Section 2 was not quite clear to me, and given that it is an important aspect of the arguments in Section 3.5 some additional clarification would be beneficial. Specifically, why is the dimensionality the same as the number of basis functions, and what do these dimensions represent?

Relation to Prior Work: With regard to GCNs for molecules, the paper very clearly describes how the authors' work compares to such methods, as indeed this is the main point of the paper. However, the QDF method can be more broadly placed in the genre of “machine learning for quantum chemistry,” of which not all methods are GCNs. Comparison to such methods, particularly those which constrain their models through physical considerations (invariants, laws, rules, etc.), and in particular the extrapolation capabilities of those models (or lack thereof), would have further strengthened the arguments presented in this paper.

Reproducibility: Yes

Additional Feedback: Regarding the broader impacts, I marked partially. Here are my comments: Broader impacts to drug design, materials science, and chemistry are briefly mentioned. The broader impact to GCN design for molecules is clearly discussed. Ethical and societal implications are not discussed, but that would seemingly be beyond the scope of this work.

UPDATE TO REVIEW: I have read the other reviews and the authors’ feedback, and after much discussion among the reviewers, I am updating my review as follows. Overall my view of the paper is still favorable. Review #2 does raise some good points, though, and the author feedback raises some additional concerns. In particular:

* Review #2 pointed out that the extrapolation task is not really chemical extrapolation. To the authors’ credit, they did new numerical experiments during the rebuttal period in which the molecules were split by the number of heavy atoms, not the total number of atoms. The results the authors get are good; however, they are about the same as the results reported in [Eickenberg 2018] (referenced in Reviewer #2’s review) for the same task.
* The fact that the computational complexity of the method prohibits training on the full QM9 dataset is worrisome for the long-term outlook of the proposed method. Indeed, there are competing methods, such as the 2017 paper introducing ANI-1 [https://doi.org/10.1039/C6SC05720A], that are capable of absorbing millions of molecular conformations into their training set, and which are thus capable of learning complex patterns in chemical compound space.

Having said that, I still think the paper has these positive points:

* A novel approach, as illustrated in Figure 1.
* An interesting contrast between the proposed approach and GCN approaches.
* Compelling numerical results in comparison to a standard GCN approach, as illustrated in Figure 4. Even if this task is not chemical extrapolation, it still shows the benefit of the proposed approach over GCNs.
* Discussion of extrapolation metrics in the ML for chemistry field, and the desire to be able to learn physical models from data.

Balancing these considerations, I am revising my overall score from 8 to 7. While this slightly lowers my overall score, I am still solidly in favor of accepting the paper.