{"title": "Cormorant: Covariant Molecular Neural Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 14537, "page_last": 14546, "abstract": "We propose Cormorant, a rotationally covariant neural network architecture for learning the behavior and properties of complex many-body physical systems. We apply these networks to molecular systems with two goals: learning atomic potential energy surfaces for use in Molecular Dynamics simulations, and learning ground state properties of molecules calculated by Density Functional Theory. Some of the key features of our network are that (a) each neuron explicitly corresponds to a subset of atoms; (b) the activation of each neuron is covariant to rotations, ensuring that overall the network is fully rotationally invariant. Furthermore, the non-linearity in our network is based upon tensor products and the Clebsch-Gordan decomposition, allowing the network to operate entirely in Fourier space. Cormorant significantly outperforms competing algorithms in learning molecular Potential Energy Surfaces from conformational geometries in the MD-17 dataset, and is competitive with other methods at learning geometric, energetic, electronic, and thermodynamic properties of molecules on the GDB-9 dataset.", "full_text": "Cormorant: Covariant Molecular Neural Networks

Brandon Anderson∗‡, Truong-Son Hy∗ and Risi Kondor∗†♮
∗Department of Computer Science, †Department of Statistics, The University of Chicago
♮Center for Computational Mathematics, Flatiron Institute
‡Atomwise
{hytruongson,risi}@uchicago.edu, brandona@jfi.uchicago.edu

Abstract

We propose Cormorant, a rotationally covariant neural network architecture for learning the behavior and properties of complex many-body physical systems. We apply these networks to molecular systems with two goals: learning atomic potential energy surfaces for use in Molecular Dynamics 
simulations, and learning ground state properties of molecules calculated by Density Functional Theory. Some of the key features of our network are that (a) each neuron explicitly corresponds to a subset of atoms; (b) the activation of each neuron is covariant to rotations, ensuring that overall the network is fully rotationally invariant. Furthermore, the non-linearity in our network is based upon tensor products and the Clebsch-Gordan decomposition, allowing the network to operate entirely in Fourier space. Cormorant significantly outperforms competing algorithms in learning molecular Potential Energy Surfaces from conformational geometries in the MD-17 dataset, and is competitive with other methods at learning geometric, energetic, electronic, and thermodynamic properties of molecules on the GDB-9 dataset.

1 Introduction

In principle, quantum mechanics provides a perfect description of the forces governing the behavior of atoms, molecules and crystalline materials such as metals. However, for systems larger than a few dozen atoms, solving the Schrödinger equation explicitly at every timestep is not a feasible proposition on present day computers. Even Density Functional Theory (DFT) [Hohenberg and Kohn, 1964], a widely used approximation to the equations of quantum mechanics, has trouble scaling to more than a few hundred atoms.

Consequently, the majority of practical work in molecular dynamics today falls back on fundamentally classical models, where the atoms are essentially treated as solid balls and the forces between them are given by pre-defined formulae called atomic force fields or empirical potentials, such as the CHARMM family of models [Brooks et al., 1983, 2009]. 
There has been a widespread realization that this approach has inherent limitations, so in recent years a burgeoning community has formed around trying to use machine learning to learn more descriptive force fields directly from DFT computations [Behler and Parrinello, 2007, Bartók et al., 2010, Rupp et al., 2012, Shapeev, 2015, Chmiela et al., 2016, Zhang et al., 2018, Schütt et al., 2017, Hirn et al., 2017]. More broadly, there is considerable interest in using ML methods not just for learning force fields, but also for predicting many other physical/chemical properties of atomic systems across different branches of materials science, chemistry and pharmacology [Montavon et al., 2013, Gilmer et al., 2017, Smith et al., 2017, Yao et al., 2018].

At the same time, there have been significant advances in our understanding of the equivariance and covariance properties of neural networks, starting with [Cohen and Welling, 2016a,b] in the

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

context of traditional convolutional neural nets (CNNs). Similar ideas underlie generalizations of CNNs to manifolds [Masci et al., 2015, Monti et al., 2016, Bronstein et al., 2017] and graphs [Bruna et al., 2014, Henaff et al., 2015]. In the context of CNNs on the sphere, Cohen et al. [2018] realized the advantage of using “Fourier space” activations, i.e., expressing the activations of neurons in a basis defined by the irreducible representations of the underlying symmetry group (see also [Esteves et al., 2017]), and these ideas were later generalized to the entire SE(3) group [Weiler et al., 2018]. Kondor and Trivedi [2018] gave a complete characterization of what operations are allowable in Fourier space neural networks to preserve covariance, and Cohen et al. generalized the framework even further to arbitrary gauge fields [Cohen et al., 2019]. 
There have also been some recent works where even the nonlinear part of the neural network’s operation is performed in Fourier space: independently of each other, [Thomas et al., 2018] and [Kondor, 2018] were the first to use the Clebsch–Gordan transform inside rotationally covariant neural networks for learning physical systems, while [Kondor et al., 2018] showed that in spherical CNNs the Clebsch–Gordan transform is sufficient to serve as the sole source of nonlinearity.

The Cormorant neural network architecture proposed in the present paper combines some of the insights gained from the various force field and potential learning efforts with the emerging theory of Fourier space covariant/equivariant neural networks. The important point that we stress in the following pages is that by setting up the network in such a way that each neuron corresponds to an actual set of physical atoms, and that each activation is covariant to symmetries (rotation and translation), we get a network in which the “laws” that individual neurons learn resemble known physical interactions. Our experiments show that this generality pays off in terms of performance on standard benchmark datasets.

2 The nature of physical interactions in molecules

Ultimately, interactions in molecular systems arise from the quantum structure of electron clouds around constituent atoms. However, from a chemical point of view, effective atom-atom interactions break down into a few simple classes based upon symmetry. Here we review a few of these classes in the context of the multipole expansion, whose structure will inform the design of our neural network.

Scalar interactions. 
The simplest type of physical interaction is that between two particles that are pointlike and have no internal directional degrees of freedom, such as spin or dipole moments. A classical example is the electrostatic attraction/repulsion between two charges described by the Coulomb energy

V_C = -\frac{1}{4\pi\epsilon_0}\,\frac{q_A q_B}{|r_{AB}|}.    (1)

Here q_A and q_B are the charges of the two particles, r_A and r_B are their position vectors, r_{AB} = r_A - r_B, and \epsilon_0 is a universal constant. Note that this equation already reflects symmetries: the fact that (1) only depends on the length of r_{AB} and not its direction or the position vectors individually guarantees that the potential is invariant under both translations and rotations.

Dipole/dipole interactions. One step up from the scalar case is the interaction between two dipoles. In general, the electrostatic dipole moment of a set of N charged particles relative to their center of mass r is just the first moment of their position vectors weighted by their charges:

\mu = \sum_{i=1}^{N} q_i\,(r_i - r).

The dipole/dipole contribution to the electrostatic potential energy between two sets of particles A and B separated by a vector r_{AB} is then given by

V_{d/d} = \frac{1}{4\pi\epsilon_0} \left[ \frac{\mu_A \cdot \mu_B}{|r_{AB}|^3} - 3\,\frac{(\mu_A \cdot r_{AB})(\mu_B \cdot r_{AB})}{|r_{AB}|^5} \right].    (2)

One reason why dipole/dipole interactions are indispensable for capturing the energetics of molecules is that most chemical bonds are polarized. However, dipole/dipole interactions also occur in other contexts, such as the interaction between the magnetic spins of electrons.

Quadrupole/quadrupole interactions. One more step up the multipole hierarchy is the interaction between quadrupole moments. 
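Before moving up the hierarchy, the scalar and dipole terms just discussed can be sketched numerically. The following is our own illustration (not code from the paper), working in units where the 1/(4πε₀) prefactor is 1 and omitting the overall sign convention; the point is that both energies depend only on rotation- and translation-invariant combinations of the inputs.

```python
import numpy as np

def coulomb_energy(qA, qB, rA, rB):
    """Scalar charge/charge term: depends only on the distance |r_AB|."""
    return qA * qB / np.linalg.norm(rA - rB)

def dipole_moment(q, r, center):
    """First moment of the charges about `center`: mu = sum_i q_i (r_i - center)."""
    return (q[:, None] * (r - center)).sum(axis=0)

def dipole_dipole_energy(muA, muB, rAB):
    """Dipole/dipole term: built from the invariants mu_A.mu_B, mu.r_AB, |r_AB|."""
    d = np.linalg.norm(rAB)
    return muA @ muB / d**3 - 3 * (muA @ rAB) * (muB @ rAB) / d**5
```

Rotating the whole system (charges, positions, dipoles, and the separation vector together) leaves both energies unchanged, since they are assembled from dot products and norms only.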
In the electrostatic case, the quadrupole moment is the second moment of the charge density (corrected to remove the trace), described by the matrix

\Theta = \sum_{i=1}^{N} q_i\,(3\,r_i r_i^\top - |r_i|^2 I).

Quadrupole/quadrupole interactions appear for example when describing the interaction between benzene rings, but the general formula for the corresponding potential is quite complicated. As a simplification, let us only consider the special case when, in some coordinate system aligned with the structure of A, and at polar angle (θ_A, φ_A) relative to the vector r_{AB} connecting A and B, Θ_A can be transformed into a form that is diagonal, with [Θ_A]_{zz} = ϑ_A and [Θ_A]_{xx} = [Θ_A]_{yy} = -ϑ_A/2 [Stone, 1997]. We make a similar assumption about the quadrupole moment of B. In this case the interaction energy becomes

V_{q/q} = \frac{3}{4}\,\frac{\vartheta_A \vartheta_B}{4\pi\epsilon_0\,|r_{AB}|^5} \bigl[ 1 - 5\cos^2\theta_A - 5\cos^2\theta_B - 15\cos^2\theta_A \cos^2\theta_B + 2\,(4\cos\theta_A \cos\theta_B - \sin\theta_A \sin\theta_B \cos(\varphi_A - \varphi_B))^2 \bigr].    (3)

Higher order interactions involve moment tensors of order 3, 4, 5, and so on. One can appreciate that the corresponding formulae, especially when considering not just electrostatics but other types of interactions as well (dispersion, exchange interaction, etc.), quickly become very involved.

3 Spherical tensors and representation theory

Fortunately, there is an alternative formalism for expressing molecular interactions, that of spherical tensors, which makes the general form of physically allowable interactions more transparent. 
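As a small numerical check of the quadrupole moment just defined (our own sketch, not the authors' code): Θ is symmetric and traceless by construction, and rotating every particle position r_i ↦ R r_i transforms it as Θ ↦ R Θ R⊤.

```python
import numpy as np

def quadrupole_moment(q, r):
    """Traceless second moment: Theta = sum_i q_i (3 r_i r_i^T - |r_i|^2 I)."""
    return sum(qi * (3.0 * np.outer(ri, ri) - (ri @ ri) * np.eye(3))
               for qi, ri in zip(q, r))

def rotation_z(a):
    """Rotation by angle a about the z axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
```

Because R I R⊤ = I, the identity part is unaffected and the whole matrix transforms covariantly, which is exactly the behavior summarized in the transformation rules below.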
This formalism also forms the basis of our Cormorant networks described in the next section. The key to spherical tensors is understanding how physical quantities transform under rotations. Specifically, in our case, under a rotation R:

q ↦ q,    µ ↦ Rµ,    Θ ↦ RΘR⊤,    r_{AB} ↦ R r_{AB}.

Flattening Θ into a vector Θ ∈ R⁹, its transformation rule can equivalently be written as Θ ↦ (R ⊗ R) Θ, showing its similarity to the other three cases. In general, a k’th order Cartesian moment tensor T^{(k)} ∈ R^{3×3×...×3} (or its flattened equivalent T^{(k)} ∈ R^{3^k}) transforms as T^{(k)} ↦ (R ⊗ R ⊗ ... ⊗ R) T^{(k)}.

Recall that given a group G, a representation ρ of G is a matrix valued function ρ : G → C^{d×d} obeying ρ(xy) = ρ(x)ρ(y) for any two group elements x, y ∈ G. It is easy to see that R, and consequently R ⊗ ... ⊗ R, are representations of the three dimensional rotation group SO(3). We also know that because SO(3) is a compact group, it has a countable sequence of unitary so-called irreducible representations (irreps), and, up to a similarity transformation, any representation can be reduced to a direct sum of irreps. In the specific case of SO(3), the irreps are called Wigner D-matrices, and for any non-negative integer ℓ = 0, 1, 2, ... there is a single corresponding irrep D_ℓ(R), which is a (2ℓ+1) dimensional representation (i.e., as a function, D_ℓ : SO(3) → C^{(2ℓ+1)×(2ℓ+1)}). The ℓ = 0 irrep is the trivial irrep D_0(R) = (1).

The above imply that there is a fixed unitary transformation matrix C^{(k)} which reduces the k’th order rotation operator into a direct sum of irreducible representations:

R ⊗ R ⊗ ... 
⊗ R (k factors) = C^{(k)} \Bigl[ \bigoplus_{\ell} \bigoplus_{i=1}^{\tau_\ell} D_\ell(R) \Bigr] C^{(k)\dagger}.

Note that the transformation R ⊗ R ⊗ ... ⊗ R contains redundant copies of D_ℓ(R), which we denote as the multiplicities τ_ℓ. For our present purposes knowing the actual values of the τ_ℓ is not that important, except that τ_k = 1 and that for any ℓ > k, τ_ℓ = 0. What is important is that T^{(k)}, the vectorized form of the Cartesian moment tensor, has a corresponding decomposition

T^{(k)} = C^{(k)} \Bigl[ \bigoplus_{\ell} \bigoplus_{i=1}^{\tau_\ell} Q_{\ell,i} \Bigr].    (4)

This is nice, because using the unitarity of C^{(k)}, it shows that under rotations the individual Q_{ℓ,i} components transform independently as Q_{ℓ,i} ↦ D_ℓ(R) Q_{ℓ,i}.

What we have just described is a form of generalized Fourier analysis applied to the transformation of Cartesian tensors under rotations. For the electrostatic multipole problem it is particularly relevant, because it turns out that in that case, due to symmetries of T^{(k)}, the only nonzero Q_{ℓ,i} component of (4) is the single one with ℓ = k. Furthermore, for a set of N charged particles, Q_ℓ (indexing its components -ℓ, ..., ℓ) has the simple form

[Q_\ell]_m = \Bigl( \frac{4\pi}{2\ell+1} \Bigr)^{1/2} \sum_{i=1}^{N} q_i\,(r_i)^\ell\, Y_\ell^m(\theta_i, \varphi_i),    m = -\ell, ..., \ell,    (5)

where (r_i, θ_i, φ_i) are the coordinates of the i’th particle in spherical polars, and the Y_ℓ^m(θ, φ) are the well known spherical harmonic functions. Q_ℓ is called the ℓ’th spherical moment of the charge distribution. 
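The spherical moments of Eq. (5) are straightforward to evaluate with standard libraries. The sketch below is our own illustration (variable names are ours); note that SciPy's `sph_harm` takes its angles in the order (azimuthal, polar), opposite to the physics convention Y_ℓ^m(θ, φ) used in the text.

```python
import numpy as np
from scipy.special import sph_harm

def spherical_moment(q, xyz, ell):
    """Eq. (5): [Q_l]_m = sqrt(4pi/(2l+1)) sum_i q_i r_i^l Y_l^m(theta_i, phi_i)."""
    Q = np.zeros(2 * ell + 1, dtype=complex)
    for qi, (x, y, z) in zip(q, xyz):
        r = np.sqrt(x * x + y * y + z * z)
        theta = np.arccos(z / r)      # polar angle
        phi = np.arctan2(y, x)        # azimuthal angle
        for k, m in enumerate(range(-ell, ell + 1)):
            # scipy's argument order is (m, l, azimuthal, polar)
            Q[k] += qi * r**ell * sph_harm(m, ell, phi, theta)
    return np.sqrt(4 * np.pi / (2 * ell + 1)) * Q
```

Two sanity checks follow directly from the definition: the ℓ = 0 moment is the total charge, and the m = 0 component of the ℓ = 1 moment is the z component of the Cartesian dipole moment, Σᵢ qᵢ zᵢ.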
Note that while T^{(ℓ)} and Q_ℓ convey exactly the same information, T^{(ℓ)} is a tensor with 3^ℓ components, while Q_ℓ is just a (2ℓ+1) dimensional vector.

Somewhat confusingly, in physics and chemistry any quantity U that transforms under rotations as U ↦ D_ℓ(R) U is often called an (ℓ’th order) spherical tensor, despite the fact that in terms of its presentation Q_ℓ is just a vector of 2ℓ+1 numbers. Also note that since D_0(R) = (1), a zeroth order spherical tensor is just a scalar. A first order spherical tensor, on the other hand, can be used to represent a spatial vector r = (r, θ, φ) by setting [U_1]_m = r\,Y_1^m(θ, φ).

3.1 The general form of interactions

The benefit of the spherical tensor formalism is that it makes it very clear how each part of a given physical equation transforms under rotations. For example, if Q_ℓ and Q̃_ℓ are two ℓ’th order spherical tensors, then Q_ℓ^† Q̃_ℓ is a scalar, since under a rotation R, by the unitarity of the Wigner D-matrices,

Q_\ell^\dagger \tilde Q_\ell \;\mapsto\; (D_\ell(R)\,Q_\ell)^\dagger (D_\ell(R)\,\tilde Q_\ell) = Q_\ell^\dagger\,(D_\ell(R))^\dagger D_\ell(R)\,\tilde Q_\ell = Q_\ell^\dagger \tilde Q_\ell.

Even the dipole/dipole interaction (2) requires a more sophisticated way of coupling spherical tensors than this, since it involves non-trivial interactions between not just two, but three different quantities: the two dipole moments µ_A and µ_B and the relative position vector r_{AB}. Representing interactions of this type requires taking tensor products of the constituent variables. For example, in the dipole/dipole case we need terms of the form Q_{ℓ₁}^A ⊗ Q_{ℓ₂}^B. 
Naturally, these will transform according to the tensor product of the corresponding irreps:

Q_{\ell_1}^A \otimes Q_{\ell_2}^B \;\mapsto\; (D_{\ell_1}(R) \otimes D_{\ell_2}(R))\,(Q_{\ell_1}^A \otimes Q_{\ell_2}^B).

In general, D_{ℓ₁}(R) ⊗ D_{ℓ₂}(R) is not an irreducible representation. However it does have a well studied decomposition into irreducibles, called the Clebsch–Gordan decomposition:

D_{\ell_1}(R) \otimes D_{\ell_2}(R) = C_{\ell_1,\ell_2}^\dagger \Bigl[ \bigoplus_{\ell=|\ell_1-\ell_2|}^{\ell_1+\ell_2} D_\ell(R) \Bigr] C_{\ell_1,\ell_2}.

Letting C_{ℓ₁,ℓ₂,ℓ} ∈ C^{(2ℓ+1)×(2ℓ₁+1)(2ℓ₂+1)} be the block of 2ℓ+1 rows in C_{ℓ₁,ℓ₂} corresponding to the ℓ component of the direct sum, we see that C_{ℓ₁,ℓ₂,ℓ}(Q_{ℓ₁}^A ⊗ Q_{ℓ₂}^B) is an ℓ’th order spherical tensor. In particular, given some other spherical tensor quantity U_ℓ,

U_\ell^\dagger \cdot C_{\ell_1,\ell_2,\ell} \cdot (Q_{\ell_1}^A \otimes Q_{\ell_2}^B)

is a scalar, and hence it is a candidate for being a term in the potential energy. Note the similarity of this expression to the bispectrum [Kakarala, 1992, Bendory et al., 2018], which is an already established tool in the force field learning literature [Bartók et al., 2013].

Almost any rotation invariant interaction potential can be expressed in terms of iterated Clebsch–Gordan products between spherical tensors. 
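The matrix C_{ℓ₁,ℓ₂} above can be assembled directly from tabulated Clebsch–Gordan coefficients. The following sketch (ours, using SymPy's `CG` class) builds it for given ℓ₁, ℓ₂, with the rows grouped into the (2ℓ+1)-row blocks C_{ℓ₁,ℓ₂,ℓ} used in the text; the key property that makes the decomposition work is that the resulting matrix is orthogonal.

```python
import numpy as np
from sympy.physics.quantum.cg import CG

def cg_matrix(l1, l2):
    """Stack the blocks C_{l1,l2,l} for l = |l1-l2| .. l1+l2 into one
    (2l1+1)(2l2+1) x (2l1+1)(2l2+1) change-of-basis matrix."""
    rows = []
    for L in range(abs(l1 - l2), l1 + l2 + 1):
        for M in range(-L, L + 1):
            row = [float(CG(l1, m1, l2, m2, L, M).doit())
                   for m1 in range(-l1, l1 + 1)
                   for m2 in range(-l2, l2 + 1)]
            rows.append(row)
    return np.array(rows)
```

For ℓ₁ = ℓ₂ = 1 this yields a 9 × 9 orthogonal matrix whose three blocks (ℓ = 0, 1, 2) carry exactly the scalar, vector, and quadrupole parts of a product of two vectors.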
In particular, the full electrostatic energy between two sets of charges A and B separated by a vector r = (r, θ, φ), expressed in multipole form [Jackson, 1999], is

V_{AB} = \frac{1}{4\pi\epsilon_0} \sum_{\ell=0}^{\infty} \sum_{\ell'=0}^{\infty} \sqrt{\binom{2\ell+2\ell'}{2\ell}}\, \sqrt{\frac{4\pi}{2\ell+2\ell'+1}}\; r^{-(\ell+\ell'+1)}\; Y_{\ell+\ell'}(\theta,\varphi)\; C_{\ell,\ell',\ell+\ell'}\,(Q_\ell^A \otimes Q_{\ell'}^B).    (6)

Note the generality of this formula: the ℓ = ℓ' = 1 case covers the dipole/dipole interaction (2), the ℓ = ℓ' = 2 case covers the quadrupole/quadrupole interaction (3), while the other terms cover every other possible type of multipole/multipole interaction. Magnetic and other types of interactions, including interactions that involve 3-way or higher order terms, can also be recovered from appropriate combinations of tensor products and Clebsch–Gordan decompositions.

We emphasize that our discussion of electrostatics is only intended to illustrate the algebraic structure of interatomic interactions, and that this structure is not restricted to electrostatics. In what follows, we will not explicitly specify what interactions the network will learn. 
Nevertheless, there are physical constraints on the interactions arising from symmetries, which we explicitly impose in our design of Cormorant.

4 CORMORANT: COvaRiant MOleculaR Artificial Neural neTworks

The goal of using ML in molecular problems is not to encode known physical laws, but to provide a platform for learning interactions from data that cannot easily be captured in a simple formula. Nonetheless, the mathematical structure of known physical laws, like those discussed in the previous sections, gives strong hints about how to represent physical interactions in algorithms. In particular, when using machine learning to learn molecular potentials or similar rotation and translation invariant physical quantities, it is essential to make sure that the algorithm respects these invariances. Our Cormorant neural network has invariance to rotations baked into its architecture in a way that is similar to the physical equations of the previous section: the internal activations are all spherical tensors, which are then combined at the top of the network in such a way as to guarantee that the final output is a scalar (i.e., is invariant). However, to allow the network to learn interactions that are more complicated than classical interatomic forces, we allow each neuron to output not just a single spherical tensor, but a combination of spherical tensors of different orders. We will call an object consisting of τ₀ scalar components, τ₁ components transforming as first order spherical tensors, τ₂ components transforming as second order spherical tensors, and so on, an SO(3)-covariant vector of type (τ₀, τ₁, τ₂, ...). The output of each neuron in Cormorant is an SO(3)-vector of a fixed type.

Definition 1. We say that F is an SO(3)-covariant vector of type τ = (τ₀, τ₁, τ₂, ..., τ_L) if it can be written as a collection of complex matrices F₀, F₁, ... 
, F_L, called its isotypic parts, where each F_ℓ is a matrix of size (2ℓ+1) × τ_ℓ and transforms under rotations as F_ℓ ↦ D_ℓ(R) F_ℓ.

The second important feature of our architecture is that each neuron corresponds to either a single atom or a set of atoms forming a physically meaningful subset of the system at hand, for example all atoms in a ball of a given radius. This condition helps encourage the network to learn physically meaningful and interpretable interactions. The high level definition of Cormorant nets is as follows.

Definition 2. Let S be a molecule or other physical system consisting of N atoms. A “Cormorant” covariant molecular neural network for S is a feed forward neural network consisting of m neurons n₁, ..., n_m, such that

C1. Every neuron n_i corresponds to some subset S_i of the atoms. In particular, each input neuron corresponds to a single atom. Each output neuron corresponds to the entire system S.
C2. The activation of each n_i is an SO(3)-vector of a fixed type τ_i.
C3. The type of each output neuron is τ_out = (1), i.e., a scalar.¹

Condition (C3) guarantees that whatever function a Cormorant network learns will be invariant to global rotations. Translation invariance is easier to enforce, simply by making sure that the interactions represented by individual neurons only involve relative distances.

4.1 Covariant neurons

The neurons in our network must be such that if each of their inputs is an SO(3)-covariant vector then so is their output. Classically, neurons perform a simple linear operation such as x ↦ Wx + b, followed by a nonlinearity like a ReLU. In convolutional neural nets the weights are tied together in

¹Cormorant can learn data of arbitrary SO(3)-vector outputs. 
We restrict to scalars here to simplify the exposition.

a specific way which guarantees that the activation of each layer is covariant to the action of global translations. Kondor and Trivedi [2018] discuss the generalization of convolution to the action of compact groups (such as, in our case, rotations) and prove that the only possible linear operation that is covariant with the group action is what, in terms of SO(3)-vectors, corresponds to multiplying each F_ℓ matrix from the right by some matrix W of learnable weights.

For the nonlinearity, one option would be to express each spherical tensor as a function on SO(3) using an inverse SO(3) Fourier transform, apply a pointwise nonlinearity, and then transform the resulting function back into spherical tensors. This is the approach taken in, e.g., [Cohen et al., 2018]. However, in our case this would be prohibitively costly, as well as introducing quadrature errors by virtue of having to interpolate on the group, ultimately degrading the network's covariance. Instead, taking yet another hint from the structure of physical interactions, we use the Clebsch–Gordan transform introduced in Section 3.1 as a nonlinearity. The general rule for taking the CG product of two SO(3)-parts F_{ℓ₁} ∈ C^{(2ℓ₁+1)×n₁} and G_{ℓ₂} ∈ C^{(2ℓ₂+1)×n₂} gives a collection of parts [F_{ℓ₁} ⊗_cg G_{ℓ₂}]_{|ℓ₁-ℓ₂|}, ..., [F_{ℓ₁} ⊗_cg G_{ℓ₂}]_{ℓ₁+ℓ₂} with columns

\bigl[ [F_{\ell_1} \otimes_{cg} G_{\ell_2}]_\ell \bigr]_{*,(i_1,i_2)} = C_{\ell_1,\ell_2,\ell}\,([F_{\ell_1}]_{*,i_1} \otimes [G_{\ell_2}]_{*,i_2}),    (7)

i.e., every column of F_{ℓ₁} is separately CG-multiplied with every column of G_{ℓ₂}. 
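The all-pairs-of-columns product of Eq. (7) can be sketched as follows. This is our own illustration (not the Cormorant implementation): `cg_block` builds the (2ℓ+1) × (2ℓ₁+1)(2ℓ₂+1) block C_{ℓ₁,ℓ₂,ℓ} from SymPy's Clebsch–Gordan coefficients, and `cg_product_part` applies it to the Kronecker product of every pair of columns.

```python
import numpy as np
from sympy.physics.quantum.cg import CG

def cg_block(l1, l2, l):
    """The (2l+1) x (2l1+1)(2l2+1) block C_{l1,l2,l} of CG coefficients."""
    C = np.zeros((2 * l + 1, (2 * l1 + 1) * (2 * l2 + 1)))
    for i, M in enumerate(range(-l, l + 1)):
        for j1, m1 in enumerate(range(-l1, l1 + 1)):
            for j2, m2 in enumerate(range(-l2, l2 + 1)):
                C[i, j1 * (2 * l2 + 1) + j2] = float(CG(l1, m1, l2, m2, l, M).doit())
    return C

def cg_product_part(F, G, l1, l2, l):
    """Eq. (7): [F (x)_cg G]_l, one output column per pair of input columns."""
    C = cg_block(l1, l2, l)
    cols = [C @ np.kron(F[:, i1], G[:, i2])
            for i1 in range(F.shape[1]) for i2 in range(G.shape[1])]
    return np.stack(cols, axis=1)
```

A quick check: for ℓ₁ = ℓ₂ = 1, the ℓ = 0 output of two single-column parts u, v reduces to the known closed form Σₘ (-1)^{1-m} u_m v_{-m} / √3, the rotation-invariant pairing of two first order spherical tensors.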
The ℓ’th part of the CG-product of two SO(3)-vectors consists of the concatenation of all SO(3)-part matrices with index ℓ coming from multiplying each part of F with each part of G:

[F \otimes_{cg} G]_\ell = \bigoplus_{\ell_1} \bigoplus_{\ell_2} [F_{\ell_1} \otimes_{cg} G_{\ell_2}]_\ell.

Here and in the following, ⊕ denotes the appropriate concatenation of vectors and matrices. In Cormorant, however, as a slight departure from (7), to reduce the quadratic blow-up in the number of columns, we always have n₁ = n₂ and use the restricted “channel-wise” CG-product,

\bigl[ [F_{\ell_1} \otimes_{cg} G_{\ell_2}]_\ell \bigr]_{*,i} = C_{\ell_1,\ell_2,\ell}\,([F_{\ell_1}]_{*,i} \otimes [G_{\ell_2}]_{*,i}),

where each column of F_{ℓ₁} is only mixed with the corresponding column of G_{ℓ₂}. We note that similar Clebsch–Gordan nonlinearities were used in [Kondor et al., 2018], and that the Clebsch–Gordan product is also an essential part of Tensor Field Networks [Thomas et al., 2018].

4.2 One-body and two-body interactions

As stated in Definition 2, the covariant neurons in a Cormorant net correspond to different subsets of the atoms making up the physical system to be modeled. For simplicity, in our present architecture there are only two types of neurons: those that correspond to individual atoms and those that correspond to pairs. For a molecule consisting of N atoms, each layer s = 0, 1, ..., S of the covariant part of the network has N neurons corresponding to the atoms and N² neurons corresponding to the (i, j) atom pairs. 
By loose analogy with graph neural networks, we call the corresponding F^s_i and g^s_{i,j} activations vertex and edge activations, respectively.

In accordance with the foregoing, each F^s_i activation is an SO(3)-vector consisting of L+1 distinct parts (F^{s,0}_i, F^{s,1}_i, ..., F^{s,L}_i), i.e., each F^{s,ℓ}_i is a (2ℓ+1) × τ^s_ℓ dimensional complex matrix that transforms under rotations as F^{s,ℓ}_i ↦ D_ℓ(R) F^{s,ℓ}_i. The different columns of these matrices are regarded as the different channels of the network, because they fulfill a similar role to channels in conventional convolutional nets. The g^s_{i,j} edge activations also break down into parts (g^{s,0}_{i,j}, g^{s,1}_{i,j}, ..., g^{s,L}_{i,j}), but these are invariant under rotations. Again for simplicity, in the version of Cormorant that we used in our experiments L is the same in every layer (specifically L = 3), and the number of channels is also independent of both s and ℓ, specifically τ^s_ℓ ≡ n_c = 16.

The actual form of the vertex activations captures “one-body interactions” propagating information from the previous layer related to the same atom and (indirectly, via the edge activations) “two-body interactions” capturing interactions between pairs of atoms:

F_i^s = \Bigl[ \underbrace{F_i^{s-1} \oplus \bigl( F_i^{s-1} \otimes_{cg} F_i^{s-1} \bigr)}_{\text{one-body part}} \;\oplus\; \underbrace{\Bigl( \sum_j G_{i,j}^s \otimes_{cg} F_j^{s-1} \Bigr)}_{\text{two-body part}} \Bigr] \cdot W^{\text{vertex}}_{s,\ell}.    (8)

Here G^s_{i,j} are SO(3)-vectors arising from the edge network. Specifically, G^{s,ℓ}_{i,j} = g^{s,ℓ}_{i,j} Y^ℓ(r̂_{i,j}), where the Y^ℓ(r̂_{i,j}) are the spherical harmonic vectors capturing the relative position of atoms i and j. The edge activations, in turn, are defined

g_{i,j}^{s,\ell} = \mu^s(r_{i,j}) \Bigl[ \bigl( g_{i,j}^{s-1,\ell} \oplus \bigl( F_i^{s-1} \cdot F_j^{s-1} \bigr) \oplus \eta^{s,\ell}(r_{i,j}) \bigr)\, W^{\text{edge}}_{s,\ell} \Bigr],    (9)

where we made the ℓ = 0, 1, ..., L irrep index explicit. As before, in these formulae ⊕ denotes concatenation over the channel index c, the η^{s,ℓ}_c(r_{i,j}) are learnable radial functions, and the µ^s_c(r_{i,j}) are learnable cutoff functions limiting the influence of atoms that are farther away from atom i. The learnable parameters of the network are the {W^{vertex}_{s,ℓ}} and {W^{edge}_{s,ℓ}} weight matrices.

Note that the F^{s-1}_i · F^{s-1}_j dot product term is the only term in these formulae responsible for the interaction between different atoms, and that this term always appears in conjunction with the η^{s,ℓ}_c(r_{i,j}) radial basis functions and µ^s_c(r_{i,j}) cutoff functions (as well as the SO(3)-covariant spherical harmonic vector), making sure that the interaction scales with the distance between the atoms. 
More\ndetails of these activation rules are given in the Supplement.\n\nc (ri,j) are learnable radial functions, and \u00b5s\n\ns,(cid:96) } weight matrices.\n\ns,(cid:96) } and {W edge\n\n\u00b7 F s\u22121\n\nj\n\ni\n\n4.3 Overall structure and comparison with other architectures\n\nj\n\nfrom the activations F s\n\n} \u2190 CGNet({F s\n\ni\nSO(3)-vector of type \u03c4 s\ni .\n\nIn addition to the covariant neurons described above, our network also needs neurons to compute\nthe input featurization and the the \ufb01nal output after the covariant layers. Thus, in total, a Cormorant\nnetworks consists of three distinct parts:\n1. An input featurization network {F s=0\n2. An S-layer network {F s+1\n\ncharges/identities and (optionally) a scalar function of relative positions ri,j.\ni })of covariant activations F s\n\n} \u2190 INPUT({Zi, ri,j}) that operates only on atomic\n\n3. A rotation invariant network at the top y \u2190 OUTPUT((cid:76)S\n\ns=0{F s\ni , and uses them to predict a regression target y.\nWe leave the details of the input and output featurization to the Supplement.\nA key difference between Cormorant and other recent covariant networks (Tensor Field Net-\nworks [Thomas et al., 2018] and SE(3)-equivariant networks [Weiler et al., 2018]) is the use of\nClebsch-Gordan non-linearities. The Clebsch-Gordan non-linearity results in a complete interac-\ntion of every degree of freedom in an activation. This comes at the cost of increased dif\ufb01culty in\ntraining, as discussed in the Supplement. We further note that SE(3)-equivariant networks use a\nthree-dimensional grid of points to represent data, and ensure both translational and rotational co-\nvariance (equivariance) of each layer. 
Cormorant, on the other hand, uses activations that are covariant to rotations and strictly invariant to translations.

5 Experiments

We present experimental results on two datasets of interest to the computational chemistry community: MD-17 for learning molecular force fields and potential energy surfaces, and QM-9 for learning the ground state properties of a set of molecules. The Supplement provides a detailed summary of all hyperparameters, our training algorithm, and the details of the input/output levels used in both cases. Our code is available at https://github.com/risilab/cormorant.

QM9 [Ramakrishnan et al., 2014] is a dataset of approximately 134k small organic molecules containing the atoms H, C, N, O, F. For each molecule, the ground state configuration is calculated using DFT, along with a variety of molecular properties. We use the ground state configuration as the input to our Cormorant, and use a common subset of properties in the literature as regression targets. Table 1(a) presents our results averaged over three training runs, compared with SchNet [Schütt et al., 2017], MPNNs [Gilmer et al., 2017], and wavelet scattering networks [Hirn et al., 2017]. Of the twelve regression targets considered, we achieve leading or competitive results on six ($\alpha$, $\Delta\varepsilon$, $\varepsilon_{\mathrm{HOMO}}$, $\varepsilon_{\mathrm{LUMO}}$, $\mu$, $C_v$). The remaining targets are within 40% of the best result, with the exception of $R^2$.

MD-17 [Chmiela et al., 2016] is a dataset of eight small organic molecules (see Table 1(b)) containing up to 17 total atoms composed of the atoms H, C, N, O, F. For each molecule, an ab initio molecular dynamics simulation was run using DFT to calculate the ground state energy and forces. At intermittent timesteps, the energy, forces, and configuration (positions of each atom) were recorded. For each molecule we use a train/validation/test split of 50k/10k/10k configurations, respectively. The results of these experiments are presented in Table 1(b), which reports the mean absolute error (MAE) on the test set for each molecule. (All units are in kcal/mol, consistent with the dataset and the literature.)

Table 1: Mean absolute error of various prediction targets on QM-9 (a) and conformational energies (in units of kcal/mol) on MD-17 (b). The best results within a standard deviation of three Cormorant training runs (in parentheses) are indicated in bold.

(a) QM-9:

                    Cormorant        SchNet   NMP     WaveScatt
  α (bohr³)         0.085 (0.001)    0.235    0.092   0.160
  Δε (eV)           0.061 (0.005)    0.063    0.069   0.118
  εHOMO (eV)        0.034 (0.002)    0.041    0.043   0.085
  εLUMO (eV)        0.038 (0.008)    0.034    0.038   0.076
  μ (D)             0.038 (0.009)    0.033    0.030   0.340
  Cv (cal/mol K)    0.026 (0.000)    0.033    0.040   0.049
  G (eV)            0.020 (0.000)    0.014    0.019   0.022
  H (eV)            0.021 (0.001)    0.014    0.017   0.022
  R² (bohr²)        0.961 (0.019)    0.073    0.180   0.410
  U (eV)            0.021 (0.000)    0.019    0.020   0.022
  U0 (eV)           0.022 (0.003)    0.014    0.020   0.022
  ZPVE (meV)        2.027 (0.042)    1.700    1.500   2.000

(b) MD-17:

                   Cormorant   DeepMD   DTNN    SchNet   GDML    sGDML
  Aspirin           0.098      0.201    –       0.120    0.270   0.190
  Benzene           0.023      0.065    0.040   0.070    0.070   0.100
  Ethanol            0.027      0.055    –       0.050    0.150   0.070
  Malonaldehyde     0.041      0.092    0.190   0.080    0.160   0.100
  Naphthalene       0.029      0.095    –       0.110    0.120   0.120
  Salicylic Acid    0.066      0.106    0.410   0.100    0.120   0.120
  Toluene           0.034      0.085    0.180   0.090    0.120   0.100
  Uracil            0.023      0.085    –       0.100    0.110   0.110
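The bolding criterion stated in the caption of Table 1 can be made precise. The sketch below encodes one natural reading of that rule (the function name and this formalization are ours, not the paper's), checked against two rows of Table 1(a):

```python
def bold_set(maes, cormorant_std):
    """One reading of Table 1's rule: bold every method whose MAE lies
    within one Cormorant standard deviation of the best (lowest) MAE."""
    best = min(maes.values())
    return {name for name, mae in maes.items() if mae <= best + cormorant_std}

# alpha (bohr^3) row of Table 1(a): only Cormorant is within 0.001 of the best
alpha = {"Cormorant": 0.085, "SchNet": 0.235, "NMP": 0.092, "WaveScatt": 0.160}
assert bold_set(alpha, cormorant_std=0.001) == {"Cormorant"}

# mu (D) row: three methods fall within one standard deviation (0.009) of 0.030
mu = {"Cormorant": 0.038, "SchNet": 0.033, "NMP": 0.030, "WaveScatt": 0.340}
assert bold_set(mu, cormorant_std=0.009) == {"Cormorant", "SchNet", "NMP"}
```

This explains why, for targets such as $\mu$, several columns can be bolded simultaneously: the run-to-run variability of Cormorant makes the leading results statistically indistinguishable.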
To the best of our knowledge, the current state-of-the-art algorithms on this dataset are DeepMD [Zhang et al., 2018], DTNN [Schütt et al., 2017], SchNet [Schütt et al., 2017], GDML [Chmiela et al., 2016], and sGDML [Chmiela et al., 2018]. Since training and testing set sizes were not consistent across the literature, we used a training set of 50k molecules to compare with all neural network based approaches. As can be seen from the table, our Cormorant network outperforms all competitors.

6 Conclusions

To the best of our knowledge, Cormorant is the first neural network architecture in which the operations implemented by the neurons are directly motivated by the form of known physical interactions. Rotation and translation invariance are explicitly "baked into" the network by the fact that all activations are represented in spherical tensor form (SO(3)-vectors), and the neurons combine Clebsch-Gordan products, concatenation of parts, and mixing with learnable weights, all of which are covariant operations. In future work we envisage the potentials learned by Cormorant being directly integrated into MD simulation frameworks. In this regard, it is very encouraging that on MD-17, which is the standard benchmark for force field learning, Cormorant outperforms all other competing methods. Learning from derivatives (forces) and generalizing to other compact symmetry groups are natural extensions of the present work.

Acknowledgements

This project was supported by DARPA "Physics of AI" grant number HR0011837139, and used computational resources acquired through NSF MRI 1828629.

References

Albert P. Bartók, Michael C. Payne, Risi Kondor, and Gábor Csányi. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett., 104(13):136403, 2010.

Albert P. Bartók, Risi Kondor, and Gábor Csányi. On representing chemical environments. Phys. Rev.
B, 87:184115, May 2013.

Jörg Behler and Michele Parrinello. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett., 98(14):146401, 2007.

Tamir Bendory, Nicolas Boumal, Chao Ma, Zhizhen Zhao, and Amit Singer. Bispectrum inversion with application to multireference alignment. IEEE Trans. Signal Process., 66(4):1037–1050, February 2018.

M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst. Geometric deep learning: Going beyond Euclidean data. IEEE Signal Process. Mag., 34(4):18–42, July 2017.

B. R. Brooks, C. L. Brooks, A. D. Mackerell, L. Nilsson, R. J. Petrella, B. Roux, Y. Won, G. Archontis, C. Bartels, S. Boresch, et al. CHARMM: the biomolecular simulation program. Journal of Computational Chemistry, 30(10):1545–1614, Jul 2009.

Bernard R. Brooks, Robert E. Bruccoleri, Barry D. Olafson, David J. States, S. Swaminathan, and Martin Karplus. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. Journal of Computational Chemistry, 4(2):187–217, Jun 1983.

J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun. Spectral networks and locally connected networks on graphs. In International Conference on Learning Representations (ICLR), 2014.

Stefan Chmiela, Alexandre Tkatchenko, Huziel E. Sauceda, Igor Poltavsky, Kristof T. Schütt, and Klaus-Robert Müller. Machine learning of accurate energy-conserving molecular force fields, 2016.

Stefan Chmiela, Huziel E. Sauceda, Klaus-Robert Müller, and Alexandre Tkatchenko. Towards exact molecular dynamics simulations with machine-learned force fields. Nature Communications, 9(1):3887, 2018.

Taco S. Cohen and Max Welling. Group equivariant convolutional networks.
CoRR, abs/1602.07576, 2016a.

Taco S. Cohen and Max Welling. Steerable CNNs. CoRR, abs/1612.08498, 2016b.

Taco S. Cohen, Mario Geiger, Jonas Köhler, and Max Welling. Spherical CNNs. CoRR, abs/1801.10130, 2018.

Taco S. Cohen, Maurice Weiler, Berkay Kicanaoglu, and Max Welling. Gauge equivariant convolutional networks and the icosahedral CNN. CoRR, abs/1902.04615, 2019.

Carlos Esteves, Christine Allen-Blanchette, Ameesh Makadia, and Kostas Daniilidis. 3D object classification and retrieval with spherical CNNs. CoRR, abs/1711.06721, 2017.

Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, and George E. Dahl. Neural message passing for quantum chemistry. CoRR, abs/1704.01212, 2017.

Mikael Henaff, Joan Bruna, and Yann LeCun. Deep convolutional networks on graph-structured data. CoRR, abs/1506.05163, 2015.

M. Hirn, S. Mallat, and N. Poilvert. Wavelet scattering regression of quantum chemical energies. Multiscale Modeling & Simulation, 15(2):827–863, Jan 2017.

P. Hohenberg and W. Kohn. Inhomogeneous electron gas. Phys. Rev., 136:864–871, 1964.

John David Jackson. Classical Electrodynamics. Wiley, New York, NY, 3rd edition, 1999.

Ramakrishna Kakarala. Triple Correlation on Groups. PhD thesis, Department of Mathematics, UC Irvine, 1992.

R. Kondor and S. Trivedi. On the generalization of equivariance and convolution in neural networks to the action of compact groups. International Conference on Machine Learning (ICML), 2018.

Risi Kondor.
N-body networks: a covariant hierarchical neural network architecture for learning atomic potentials. CoRR, abs/1803.01588, 2018.

Risi Kondor, Zhen Lin, and Shubhendu Trivedi. Clebsch-Gordan nets: a fully Fourier space spherical convolutional neural network. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 10117–10126. Curran Associates, Inc., 2018.

Jonathan Masci, Davide Boscaini, Michael M. Bronstein, and Pierre Vandergheynst. Geodesic convolutional neural networks on Riemannian manifolds. CoRR, abs/1501.06297, 2015.

G. Montavon, M. Rupp, V. Gobre, A. Vazquez-Mayagoitia, K. Hansen, A. Tkatchenko, K.-R. Müller, and O. A. von Lilienfeld. Machine learning of molecular electronic properties in chemical compound space. New J. Phys., 15, 2013.

Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda, and Michael M. Bronstein. Geometric deep learning on graphs and manifolds using mixture model CNNs. CoRR, abs/1611.08402, 2016.

Raghunathan Ramakrishnan, Pavlo O. Dral, Matthias Rupp, and O. Anatole von Lilienfeld. Quantum chemistry structures and properties of 134 kilo molecules. Scientific Data, 1, 2014.

M. Rupp, A. Tkatchenko, K. R. Müller, and O. A. von Lilienfeld. Fast and accurate modeling of molecular atomization energies with machine learning. Phys. Rev. Lett., 108, 2012.

Kristof Schütt, Pieter-Jan Kindermans, Huziel Enoc Sauceda Felix, Stefan Chmiela, Alexandre Tkatchenko, and Klaus-Robert Müller. SchNet: A continuous-filter convolutional neural network for modeling quantum interactions. In Advances in Neural Information Processing Systems 30, 2017.

Kristof T. Schütt, Farhad Arbabzadah, Stefan Chmiela, Klaus R. Müller, and Alexandre Tkatchenko.
Quantum-chemical insights from deep tensor neural networks. Nature Communications, 8:13890, Jan 2017.

Alexander V. Shapeev. Moment tensor potentials: a class of systematically improvable interatomic potentials. arXiv, December 2015.

J. S. Smith, O. Isayev, and A. E. Roitberg. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci., 8:3192–3203, 2017.

A. J. Stone. The Theory of Intermolecular Forces. International Series of Monographs on Chemistry. Clarendon Press, 1997.

Nathaniel Thomas, Tess Smidt, Steven M. Kearnes, Lusann Yang, Li Li, Kai Kohlhoff, and Patrick Riley. Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds. CoRR, abs/1802.08219, 2018.

Maurice Weiler, Mario Geiger, Max Welling, Wouter Boomsma, and Taco Cohen. 3D steerable CNNs: Learning rotationally equivariant features in volumetric data. CoRR, abs/1807.02547, 2018.

Kun Yao, John E. Herr, David W. Toth, Ryker Mckintyre, and John Parkhill. The TensorMol-0.1 model chemistry: a neural network augmented with long-range physics. Chem. Sci., 9:2261–2269, 2018.

Linfeng Zhang, Jiequn Han, Han Wang, Roberto Car, and Weinan E. Deep potential molecular dynamics: A scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett., 120:143001, Apr 2018.