{"title": "A General Theory of Equivariant CNNs on Homogeneous Spaces", "book": "Advances in Neural Information Processing Systems", "page_first": 9145, "page_last": 9156, "abstract": "We present a general theory of Group equivariant Convolutional Neural Networks (G-CNNs) on homogeneous spaces such as Euclidean space and the sphere. Feature maps in these networks represent fields on a homogeneous base space, and layers are equivariant maps between spaces of fields. The theory enables a systematic classification of all existing G-CNNs in terms of their symmetry group, base space, and field type. We also answer a fundamental question: what is the most general kind of equivariant linear map between feature spaces (fields) of given types? We show that such maps correspond one-to-one with generalized convolutions with an equivariant kernel, and characterize the space of such kernels.", "full_text": "A General Theory of Equivariant CNNs on\n\nHomogeneous Spaces\n\nTaco S. Cohen\n\nQualcomm AI Research\u2217\n\nMario Geiger\n\nPCSL Research Group\n\nQualcomm Technologies Netherlands B.V.\n\nEPFL\n\ntacos@qti.qualcomm.com\n\nmario.geiger@epfl.ch\n\nMaurice Weiler\n\nQUVA Lab\n\nU. of Amsterdam\nm.weiler@uva.nl\n\nAbstract\n\nWe present a general theory of Group equivariant Convolutional Neural Networks\n(G-CNNs) on homogeneous spaces such as Euclidean space and the sphere. Feature\nmaps in these networks represent \ufb01elds on a homogeneous base space, and layers\nare equivariant maps between spaces of \ufb01elds. The theory enables a systematic\nclassi\ufb01cation of all existing G-CNNs in terms of their symmetry group, base\nspace, and \ufb01eld type. We also consider a fundamental question: what is the most\ngeneral kind of equivariant linear map between feature spaces (\ufb01elds) of given\ntypes? 
Following Mackey, we show that such maps correspond one-to-one with convolutions using equivariant kernels, and characterize the space of such kernels.

1 Introduction

Through the use of convolution layers, Convolutional Neural Networks (CNNs) have a built-in understanding of locality and translational symmetry that is inherent in many learning problems. Because convolutions are translation equivariant (a shift of the input leads to a shift of the output), convolution layers preserve the translation symmetry. This is important, because it means that further layers of the network can also exploit the symmetry.

Motivated by the success of CNNs, many researchers have worked on generalizations, leading to a growing body of work on Group equivariant CNNs (G-CNNs) for signals on Euclidean space and the sphere [1-7] as well as graphs [8, 9]. With the proliferation of equivariant network layers, it has become difficult to see the relations between the various approaches. Furthermore, when faced with a new modality (diffusion tensor MRI, say), it may not be immediately obvious how to create an equivariant network for it, or whether a given kind of equivariant layer is the most general one.

In this paper we present a general theory of homogeneous G-CNNs. Feature spaces are modelled as spaces of fields on a homogeneous space. They are characterized by a group of symmetries G, a subgroup H ≤ G that together with G determines a homogeneous space B ≃ G/H, and a representation ρ of H that determines the type of field (vector, tensor, etc.). Related work is classified by (G, H, ρ). The main theorems say that equivariant linear maps between fields over B can be written as convolutions with an equivariant kernel, and that the space of equivariant kernels can be realized in three equivalent ways.
We will assume some familiarity with groups, cosets, quotients, representations and related notions (see Appendix A).

This paper does not contain truly new mathematics (in the sense that a professional mathematician with expertise in the relevant subjects would not be surprised by our results), but instead provides a new formalism for the study of equivariant convolutional networks. This formalism turns out to be a remarkably good fit for describing real-world G-CNNs. Moreover, by describing G-CNNs in a language used throughout modern physics and mathematics (fields, fiber bundles, etc.), it becomes possible to apply knowledge gained over many decades in those domains to machine learning.

*Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

1.1 Overview of the Theory

This paper has two main parts. First, in Sec. 2, we introduce a mathematical model for convolutional feature spaces. The basic idea is that feature maps represent fields over a homogeneous space. As it turns out, defining the notion of a field is quite a bit of work. So in order to motivate the introduction of each of the required concepts, we will in this section provide an overview of the relevant concepts and their relations, using the example of a Spherical CNN with vector field feature maps.

The second part of this paper (Section 3) is about maps between the feature spaces. We require these to be equivariant, and focus in particular on the linear layers. The main theorems (3.1-3.4) show that linear equivariant maps between the feature spaces are in one-to-one correspondence with equivariant convolution kernels (i.e.
convolution is all you need), and that the space of equivariant kernels can be realized as a space of matrix-valued functions on a group, coset space, or double coset space, subject to linear constraints.

In order to specify a convolutional feature space, we need to specify two things: a homogeneous space B over which the field is defined, and the type of field (e.g. vector field, tensor field, etc.). A homogeneous space for a group G is a space B where for any two x, y ∈ B there is a transformation g ∈ G that relates them via gx = y. Here we consider the example of a vector field on the sphere B = S2 with symmetry group G = SO(3), the group of 3D rotations. The sphere is a homogeneous space for SO(3) because we can map any point on the sphere to any other via a rotation.

Formally, a field is defined as a section of a vector bundle associated to a principal bundle. In order to understand what this means, we must first know what a fiber bundle is (Sec. 2.1), and understand how the group G can be viewed as a principal bundle (Sec. 2.2). Briefly, a fiber bundle formalizes the idea of parameterizing a set of identical spaces called fibers by another space called the base space.

The first way in which fiber bundles play a role in the theory is that the action of G on B allows us to think of G as a "bundle of groups" or principal bundle. Roughly speaking, this works as follows: if we fix an origin o ∈ B, we can consider the stabilizer subgroup H ≤ G of transformations that leave o unchanged: H = {g ∈ G | go = o}. For example, on the sphere the stabilizer is SO(2), the group of rotations around the axis through o (e.g. the north pole). As we will see in Section 2.2, this allows us to view G as a bundle with base space B ≃ G/H and a fiber H. This is shown for the sphere in Fig.
1 (cartoon). In this case, we can think of SO(3) as a bundle of circles (H = SO(2)) over the sphere, which itself is the quotient S2 ≃ SO(3)/SO(2).

Figure 1: SO(3) as a principal SO(2) bundle over S2.

To define the associated bundle (Sec. 2.3) we take the principal bundle G and replace the fiber H by a vector space V on which H acts linearly via a group representation ρ. This yields a vector bundle with the same base space B and a new fiber V. For example, the tangent bundle of S2 (Fig. 2) is obtained by replacing the circular SO(2) fibers in Fig. 1 by 2D planes. Under the action of H = SO(2), a tangent vector at the north pole is rotated (even though the north pole itself is fixed by SO(2)), so we let ρ(h) be a 2 × 2 rotation matrix. In a general convolutional feature space with n channels, V would be an n-dimensional vector space. Finally, fields are defined as sections of this bundle, i.e. an assignment to each point x of an element in the fiber over x (see Fig. 3).

Figure 2: Tangent bundle of S2.

Having defined the feature space, we need to specify how it transforms (e.g. say how a vector field on S2 is rotated). The natural way to transform a ρ-field is via the induced representation π = Ind_H^G ρ of G (Section 2.4), which combines the action of G on the base space B and the action of ρ on the fiber V to produce an action on sections of the associated bundle (see Figure 3). Finally, having defined the feature spaces and their transformation laws, we can study equivariant linear maps between them (Section 3). In Sec.
4-6 we cover implementation aspects, related work, and concrete examples, respectively.

Figure 3: Φ maps scalar fields to vector fields, and is equivariant to the induced representation πi = Ind_SO(2)^SO(3) ρi.

2 Convolutional Feature Spaces

2.1 Fiber Bundles

Intuitively, a fiber bundle is a parameterization of a set of isomorphic spaces (the fibers) by another space (the base). For example, we can think of a feature space in a classical CNN as a set of vector spaces Vx ≃ Rn (n being the number of channels), one per position x in the plane [2]. This is an example of a trivial bundle, because it is simply the Cartesian product of the plane and Rn. General fiber bundles are only locally trivial, meaning that they locally look like a product while having a different global topological structure.

The simplest example of a non-trivial bundle is the Möbius strip, which locally looks like a product of the circle (the base) with a line segment (the fiber), but is globally distinct from a cylinder (see Fig. 4). A more practically relevant example is given by the tangent bundle of the sphere (Fig. 2), which has as base space S2 and fibers that look like R2, but is topologically distinct from S2 × R2 as a bundle.

Figure 4: Cylinder and Möbius strip.

Formally, a bundle consists of topological spaces E (total space), B (base space), F (canonical fiber), and a projection map p : E → B, satisfying a local triviality condition. Basically, this condition says that locally, the bundle looks like a product U × F of a piece U ⊆ B of the base space, and F the canonical fiber. Formally, the condition is that for every a ∈ E, there is an open neighbourhood U ⊆ B of p(a) and a homeomorphism ϕ : p−1(U) → U × F so that the composite proj1 ◦ ϕ : p−1(U) → U agrees with p : p−1(U) → U (where proj1(u, f) = u). The homeomorphism ϕ is said to locally trivialize the bundle above the trivializing neighbourhood U.

Considering that for any x ∈ U the preimage proj1^−1(x) is F, and ϕ is a homeomorphism, we see that the preimage Fx = p−1(x) for x ∈ B is also homeomorphic to F. Thus, we call Fx the fiber over x, and see that all fibers are homeomorphic. Knowing this, we can denote a bundle by its projection map p : E → B, leaving the canonical fiber F implicit.

Various more refined notions of fiber bundle exist, each corresponding to a different kind of fiber. In this paper we will work with principal bundles (bundles of groups) and vector bundles (bundles of vector spaces).

A section s of a fiber bundle is an assignment to each x ∈ B of an element s(x) ∈ Fx. Formally, it is a map s : B → E that satisfies p ◦ s = idB. If the bundle is trivial, a section is equivalent to a function f : B → F, but for a non-trivial bundle we cannot continuously align all the fibers simultaneously, and so we must keep each s(x) in its own fiber Fx. Nevertheless, on a trivializing neighbourhood U ⊆ B, we can describe the section as a function sU : U → F, by setting ϕ(s(x)) = (x, sU(x)).

2.2 G as a Principal H-Bundle

Recall (Sec. 1.1) that with every feature space of a G-CNN is associated a homogeneous space B (e.g. the sphere, projective space, hyperbolic space, Grassmann & Stiefel manifolds, etc.), and recall further that such a space has a stabilizer subgroup H = {g ∈ G | go = o} (this group being independent of origin o up to isomorphism). As discussed in Appendix A, the cosets gH of H (e.g. the circles in Fig. 1) partition G, and the set of cosets, denoted G/H (e.g. the sphere in Fig.
1), can be identified with B (up to a choice of origin).

It is this partitioning of G into cosets that induces a special kind of bundle structure on G. The projection map that defines the bundle structure sends an element g ∈ G to the coset gH it belongs to. Thus, it is a map p : G → G/H, and we have a bundle with total space G, base space G/H and canonical fiber H. Intuitively, this allows us to think of G as a base space G/H with a copy of H attached at each point x ∈ G/H. The copies of H are glued together in a potentially twisted manner.

This bundle is called a principal H-bundle, because we have a transitive and fixed-point free group action G × H → G that preserves the fibers. This action is given by right multiplication, g ↦ gh, which preserves fibers because p(gh) = ghH = gH = p(g). That is, by right-multiplying an element g ∈ G by h ∈ H, we get an element gh that is in general different from g but is still within the same coset (i.e. fiber). That the action is transitive and free on cosets follows immediately from the group axioms.

One can think of a principal bundle as a bundle of generalized frames or gauges relative to which geometrical quantities can be expressed numerically. Under this interpretation the fiber at x is a space of generalized frames, and the action by H is a change of frame. For instance, each point on the circles in Fig. 1 can be identified with a right-handed orthogonal frame, and the action of SO(2) corresponds to a rotation of this frame. The group H may also include internal symmetries, such as color space rotations, which do not relate in any way to the spatial dimensions of B.

In order to numerically represent a field on some neighbourhood U ⊆ G/H, we need to choose a frame for each x ∈ U in a continuous manner.
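The bundle structure just described is easy to check numerically. The following sketch is our own illustration (not from the paper): it realizes G = SO(3) as 3 × 3 rotation matrices, takes the north pole as the origin o, identifies the projection p : G → G/H ≃ S2 with g ↦ g·o, and verifies that right multiplication by the stabilizer H = SO(2) (Z-axis rotations) preserves fibers.

```python
import numpy as np

def rot_z(a):  # rotation about the Z axis: an element of H = SO(2) inside SO(3)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(b):  # rotation about the Y axis
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

o = np.array([0.0, 0.0, 1.0])        # origin of the base space: the north pole

def p(g):
    return g @ o                      # projection p : G -> G/H ~ S^2, g |-> g.o

g = rot_z(0.3) @ rot_y(1.1) @ rot_z(2.0)   # a generic rotation (ZYZ Euler angles)
h = rot_z(0.7)                              # an element of the stabilizer H

# H is exactly the subgroup fixing the origin, while a Y-rotation moves it
assert np.allclose(h @ o, o)
assert not np.allclose(rot_y(1.0) @ o, o)

# right multiplication by H preserves fibers: p(gh) = p(g)
assert np.allclose(p(g @ h), p(g))

# the right action is free: gh differs from g even though both lie in the coset gH
assert not np.allclose(g @ h, g)
```

The same check works for any closed subgroup H ≤ G once p is written as g ↦ g·o.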
This is formalized as a section of the principal bundle. Recall that a section of p : G → G/H is a map s : G/H → G that satisfies p ◦ s = idG/H. Since p projects g to its coset gH, the section chooses a representative s(gH) ∈ gH for each coset gH. Non-trivial principal bundles do not have continuous global sections, but we can always use a local section on U ⊆ G/H, and represent a field on overlapping local patches covering G/H.

Aside from the right action of H, which turns G into a principal H-bundle, we also have a left action of G on itself, as well as an action of G on the base space G/H. In general, the action of G on G/H does not agree with the action on G, in that gs(x) ≠ s(gx), because the action on G includes a twist of the fiber. This twist is described by the function h : G/H × G → H defined by gs(x) = s(gx)h(x, g) (whenever both s(x) and s(gx) are defined). This function will be used in various calculations below. We note for the interested reader that h satisfies the cocycle condition h(x, g1g2) = h(g2x, g1)h(x, g2).

2.3 The Associated Vector Bundle

Feature spaces are defined as spaces of sections of the associated vector bundle, which we will now define. In physics, a section of an associated bundle is simply called a field.

To define the associated vector bundle, we start with the principal H-bundle p : G → G/H, and essentially replace the fibers (cosets) by vector spaces V. The space V ≃ Rn carries a group representation ρ of H that describes the transformation behaviour of the feature vectors in V under a change of frame. These features could for instance transform as a scalar, a vector, a tensor, or some other geometrical quantity [2, 6, 8].
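The twist function h(x, g) defined above can also be computed explicitly. In the following sketch (our own illustration, under the S2 = SO(3)/SO(2) running example) we fix the section s(α, β) = Z(α)Y(β), solve the defining equation for h(x, g) = s(gx)−1 g s(x), and check that it lands in H and satisfies the cocycle condition.

```python
import numpy as np

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(b):
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

o = np.array([0.0, 0.0, 1.0])  # north pole, the chosen origin of S^2

def section(x):  # s : S^2 -> SO(3) with s(x).o = x, defined away from the poles
    alpha = np.arctan2(x[1], x[0])
    beta = np.arccos(np.clip(x[2], -1.0, 1.0))
    return rot_z(alpha) @ rot_y(beta)

def twist(x, g):  # h(x, g), defined by g s(x) = s(gx) h(x, g)
    return section(g @ x).T @ g @ section(x)

g1 = rot_z(0.4) @ rot_y(0.8) @ rot_z(1.5)
g2 = rot_z(2.1) @ rot_y(0.3) @ rot_z(0.9)
v = np.array([0.3, -0.5, 0.6])
x = v / np.linalg.norm(v)                  # a generic point on the sphere

h = twist(x, g1)
assert np.allclose(h @ o, o)                              # h(x, g) lies in H = SO(2)
assert np.allclose(g1 @ section(x), section(g1 @ x) @ h)  # the defining equation
# cocycle condition: h(x, g1 g2) = h(g2 x, g1) h(x, g2)
assert np.allclose(twist(x, g1 @ g2), twist(g2 @ x, g1) @ twist(x, g2))
```

The checks break down only where the section itself is undefined (the poles), matching the remark that non-trivial bundles admit only local sections.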
Figure 3 shows an example of a vector field (ρ(h) being a 2 × 2 rotation matrix in this case) and a scalar field (ρ(h) = 1).

The first step in constructing the associated vector bundle is to take the product G × V. In the context of representation learning, we can think of an element (g, v) of G × V as a feature vector v ∈ V and an associated pose variable g ∈ G that describes how the feature detector was steered to obtain v. For instance, in a Spherical CNN [10] one would rotate a filter bank by g ∈ SO(3) and match it with the input to obtain v. If we apply a transformation h ∈ H to g and simultaneously apply its inverse to v, we get an equivalent element (gh, ρ(h−1)v). In a Spherical CNN, this would correspond to a change in orientation of the filters by h ∈ SO(2).

So in order to create the associated bundle, we take the quotient of the product G × V by this action: A = G ×ρ V = (G × V)/H. In other words, the elements of A are orbits, defined as [g, v] = {(gh, ρ(h−1)v) | h ∈ H}. The projection pA : A → G/H is defined as pA([g, v]) = gH. One may check that this is well defined, i.e. independent of the orbit representative g of [g, v] = [gh, ρ(h−1)v]. Thus, the associated bundle has base G/H and fiber V, meaning that locally it looks like G/H × V. We note that the associated bundle construction works for any principal H-bundle, not just p : G → G/H, which suggests a direction for further generalization [11].

A field ("stack of feature maps") is a section of the associated bundle, meaning that it is a map s : G/H → A such that pA ◦ s = idG/H. We will refer to the space of sections of the associated vector bundle as I.
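The quotient construction can be exercised numerically. The sketch below (our own illustration, with V = R2 and ρ the standard rotation action of SO(2)) checks that two representatives of the same orbit [g, v] project to the same base point, and that the feature value expressed in the frame chosen by a section is representative-independent.

```python
import numpy as np

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(b):
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

o = np.array([0.0, 0.0, 1.0])             # origin: the north pole

def section(x):                            # s : S^2 -> SO(3), s(x).o = x (away from poles)
    return rot_z(np.arctan2(x[1], x[0])) @ rot_y(np.arccos(np.clip(x[2], -1.0, 1.0)))

def rho(h):                                # rep of H = SO(2) on V = R^2 (angle read off h)
    t = np.arctan2(h[1, 0], h[0, 0])
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

def h1(g):                                 # H-part of the decomposition g = s(gH) h1(g)
    return section(g @ o).T @ g

g = rot_z(0.4) @ rot_y(0.9) @ rot_z(1.3)   # a pose
v = np.array([1.0, 2.0])                   # a feature vector
h = rot_z(0.6)
g2, v2 = g @ h, rho(h).T @ v               # equivalent representative (gh, rho(h^-1) v)

assert np.allclose(g2 @ o, g @ o)          # p_A([g, v]) = gH is well defined on orbits
assert np.allclose(rho(h1(g2)) @ v2, rho(h1(g)) @ v)  # gauge-fixed value is orbit invariant
```

The second assertion is the numerical content of "locally it looks like G/H × V": once a frame is fixed, each orbit has a unique coordinate vector.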
Concretely, we have two ways to encode a section: as functions f : G → V subject to a constraint, and as local functions from U ⊆ G/H to V. We will now define both.

2.3.1 Sections as Mackey Functions

The construction of the associated bundle as a product G × V subject to an equivalence relation suggests a way to describe sections concretely: a section can be represented by a function f : G → V subject to the equivariance condition

f(gh) = ρ(h−1)f(g). (1)

Such functions are called Mackey functions. They provide a redundant encoding of a section of A, by encoding the value of the section relative to any choice of frame / section of the principal bundle simultaneously, with the equivariance constraint ensuring consistency.

A linear combination of Mackey functions is a Mackey function, so they form a vector space, which we will refer to as IG. Mackey functions are easy to work with because they allow a concrete and global description of a field, but their redundancy makes them unsuitable for computer implementation.

2.3.2 Local Sections as Functions on G/H

The associated bundle has base G/H and fiber V, so locally, we can describe a section as an unconstrained function f : U → V where U ⊆ G/H is a trivializing neighbourhood (see Sec. 2.1). We refer to the space of such sections as IC. Given a local section f ∈ IC, we can encode it as a Mackey function through the following lifting isomorphism Λ : IC → IG:

[Λf](g) = ρ(h(g)−1)f(gH),    [Λ−1f′](x) = f′(s(x)), (2)

where h(g) = h(H, g) = s(gH)−1g ∈ H and s(x) ∈ G is a coset representative for x ∈ G/H. This map is analogous to the lifting defined by [12] for scalar fields (i.e.
ρ(h) = I), and can be defined more generally for any principal / associated bundle [13].

2.4 The Induced Representation

The induced representation π = Ind_H^G ρ describes the action of G on fields. In IG, it is defined as:

[πG(g)f](k) = f(g−1k). (3)

In IC, we can define the induced representation πC on a local neighbourhood U as

[πC(g)f](x) = ρ(h(x, g−1)−1)f(g−1x). (4)

Here we have assumed that h is defined at (x, g−1). If it is not, one would need to change to a different section of G → G/H. One may verify, using the composition law for h (Sec. 2.2), that Eq. 4 does indeed define a representation of G. Moreover, one may verify that πG(g) ◦ Λ = Λ ◦ πC(g), i.e. they define isomorphic representations.

We can interpret Eq. 4 as follows. To transform a field, we move the fiber at g−1x to x, and we apply a transformation to the fiber itself using ρ. This is visualized in Fig. 5 for a planar vector field. Some other examples include an RGB image (ρ(h) = I3), a field of wind directions on earth (ρ(h) a 2 × 2 rotation matrix), a diffusion tensor MRI image (ρ(h) a representation of SO(3) acting on 2-tensors), and a regular G-CNN on Z3 [14, 15] (ρ a regular representation of H).

Figure 5: The rotation of a planar vector field in two steps: moving each vector to its new position without changing its orientation, and then rotating the vectors.

3 Equivariant Maps and Convolutions

Each feature space in a G-CNN is defined as the space of sections of some associated vector bundle, defined by a choice of base G/H and representation ρ of H that describes how the fibers transform. A layer in a G-CNN is a map between these feature spaces that is equivariant to the induced representations acting on them.
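The definitions of Sec. 2.4 can be tested end to end on a small finite group, where all the maps become finite arrays. The following sketch is our own toy example (not from the paper): G = Z8 written additively, H = {0, 4}, ρ the sign representation, and the section s(x) = x for x ∈ {0, 1, 2, 3}. It checks the Mackey condition, the intertwining relation πG(g) ◦ Λ = Λ ◦ πC(g), and that πC is a representation.

```python
import numpy as np

n = 8                                   # G = Z_8 (additive), H = {0, 4}
def rho(h):                             # sign representation of H on V = R
    return -1.0 if h % n == 4 else 1.0

def coset(g):                           # G/H identified with {0, 1, 2, 3}
    return g % 4

def twist(x, g):                        # h(x, g) in {0, 4}: g + s(x) = s(g.x) + h(x, g)
    return (g + x - coset(g + x)) % n   # with the section s(x) = x

def lift(fc):                           # Lambda: local function on G/H -> Mackey function on G
    return np.array([(1.0 / rho(twist(0, g))) * fc[coset(g)] for g in range(n)])

def pi_G(g, F):                         # induced representation on Mackey functions (Eq. 3)
    return np.array([F[(k - g) % n] for k in range(n)])

def pi_C(g, fc):                        # induced representation on local functions (Eq. 4)
    ginv = (-g) % n
    return np.array([(1.0 / rho(twist(x, ginv))) * fc[coset(ginv + x)] for x in range(4)])

fc = np.array([1.0, 2.0, -0.5, 3.0])
F = lift(fc)
# Mackey condition f(g + h) = rho(h)^-1 f(g), here f(g + 4) = -f(g)
assert all(np.isclose(F[(g + 4) % n], -F[g]) for g in range(n))
# the two realizations are isomorphic: pi_G(g) . Lambda = Lambda . pi_C(g)
for g in range(n):
    assert np.allclose(lift(pi_C(g, fc)), pi_G(g, F))
# pi_C is a representation: pi_C(g1 + g2) = pi_C(g1) pi_C(g2)
for g1 in range(n):
    for g2 in range(n):
        assert np.allclose(pi_C((g1 + g2) % n, fc), pi_C(g1, pi_C(g2, fc)))
```

Because the group is abelian and the section is global, no change of local section is ever needed here; on the sphere the analogous check only works away from the poles.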
In this section we will show that equivariant linear maps can always be written as a convolution-like operation using an equivariant kernel. We will first derive this result for the induced representation realized in the space IG of Mackey functions, and then convert the result to local sections of the associated vector bundle in Section 3.2. We will assume that G is locally compact and unimodular.

Consider adjacent feature spaces i = 1, 2 with a representation (ρi, Vi) of Hi ≤ G. Let πi = Ind_{Hi}^G ρi be the representation acting on I^i_G. A bounded linear operator I^1_G → I^2_G can be written as

[κ · f](g) = ∫_G κ(g, g′)f(g′)dg′, (5)

using a two-argument linear operator-valued kernel κ : G × G → Hom(V1, V2), where Hom(V1, V2) denotes the space of linear maps V1 → V2. Choosing bases, we get a matrix-valued kernel.

We are interested in the space of equivariant linear maps between induced representations, defined as H = HomG(I^1, I^2) = {Φ ∈ Hom(I^1, I^2) | Φπ1(g) = π2(g)Φ, ∀g ∈ G}. In order for Eq. 5 to define an equivariant map Φ ∈ H, the kernel κ must satisfy a constraint. By (partially) resolving this constraint, we will show that Eq. 5 can always be written as a cross-correlation.¹

Theorem 3.1. (convolution is all you need) An equivariant map Φ ∈ H can always be written as a convolution-like integral.

Proof. Since we are only interested in equivariant maps, we get a constraint on κ.
For all u, g ∈ G:

[κ · [π1(u)f]](g) = [π2(u)[κ · f]](g)
⇔ ∫_G κ(g, g′)f(u−1g′)dg′ = ∫_G κ(u−1g, g′)f(g′)dg′
⇔ ∫_G κ(g, ug′)f(g′)dg′ = ∫_G κ(u−1g, g′)f(g′)dg′
⇔ κ(g, ug′) = κ(u−1g, g′)
⇔ κ(ug, ug′) = κ(g, g′) (6)

Hence, without loss of generality, we can define the two-argument kernel κ(·, ·) in terms of a one-argument kernel: κ(g−1g′) ≡ κ(e, g−1g′) = κ(ge, gg−1g′) = κ(g, g′).

The application of κ to f thus reduces to a cross-correlation:

[κ · f](g) = ∫_G κ(g, g′)f(g′)dg′ = ∫_G κ(g−1g′)f(g′)dg′ = [κ ⋆ f](g). (7)

3.1 The Space of Equivariant Kernels

The constraint Eq. 6 implies a constraint on the one-argument kernel κ. The space of admissible kernels is in one-to-one correspondence with the space of equivariant maps. Here we give three different characterizations of this space of kernels. Detailed proofs can be found in Appendix B.

Theorem 3.2. H is isomorphic to the space of bi-equivariant kernels on G, defined as:

KG = {κ : G → Hom(V1, V2) | κ(h2gh1) = ρ2(h2)κ(g)ρ1(h1), ∀g ∈ G, h1 ∈ H1, h2 ∈ H2}. (8)

Proof. It is easily verified (see supp. mat.) that right equivariance follows from the fact that f ∈ I^1_G is a Mackey function, and left equivariance follows from the requirement that κ ⋆ f ∈ I^2_G should be a Mackey function.
The isomorphism is given by ΓG : KG → H defined as [ΓGκ]f = κ ⋆ f.

The analogous result for the two-argument kernel is that κ(gh2, g′h1) should be equal to ρ2(h2−1)κ(g, g′)ρ1(h1) for g, g′ ∈ G, h1 ∈ H1, h2 ∈ H2. This has the following interesting interpretation: κ is a section of a certain associated bundle. We define a right action of H1 × H2 on G × G by setting (g, g′) · (h1, h2) = (gh1, g′h2) and a representation ρ12 of H1 × H2 on Hom(V1, V2) by setting ρ12(h1, h2)Ψ = ρ2(h2)Ψρ1(h1−1) for Ψ ∈ Hom(V1, V2). Then the constraint on κ(·, ·) can be written as κ((g, g′) · (h1, h2)) = ρ12((h1, h2)−1)κ((g, g′)). We recognize this as the condition of being a Mackey function (Eq. 1) for the bundle (G × G) ×ρ12 Hom(V1, V2).

There is another way to characterize the space of equivariant kernels:

Theorem 3.3. H is isomorphic to the space of left-equivariant kernels on G/H1, defined as:

KC = {←κ : G/H1 → Hom(V1, V2) | ←κ(h2x) = ρ2(h2)←κ(x)ρ1(h1(x, h2)−1), ∀h2 ∈ H2, x ∈ G/H1} (9)

¹As in most of the CNN literature, we will not be precise about distinguishing convolution and correlation.

Proof. Using the decomposition g = s(gH1)h1(g) (see Appendix A), we can define

κ(g) = κ(s(gH1)h1(g)) = κ(s(gH1))ρ1(h1(g)) ≡ ←κ(gH1)ρ1(h1(g)). (10)

This defines the lifting isomorphism for kernels, ΛK : KC → KG. It is easy to verify that when defined in this way, κ satisfies right H1-equivariance. We still have the left H2-equivariance constraint from Eq. 8, which translates to ←κ as follows (details in supp. mat.).
For g ∈ G, h2 ∈ H2 and x ∈ G/H1,

κ(h2g) = ρ2(h2)κ(g) ⇔ ←κ(h2x) = ρ2(h2)←κ(x)ρ1(h1(x, h2)−1). (11)

Theorem 3.4. H is isomorphic to the space of H2^{γ(x)H1}-equivariant kernels on H2\G/H1:

KD = {κ̄ : H2\G/H1 → Hom(V1, V2) | κ̄(x) = ρ2(h)κ̄(x)ρ^x_1(h)−1, ∀x ∈ H2\G/H1, h ∈ H2^{γ(x)H1}}, (12)

where γ : H2\G/H1 → G is a choice of double coset representatives, and ρ^x_1 is a representation of the stabilizer H2^{γ(x)H1} = {h ∈ H2 | hγ(x)H1 = γ(x)H1} ≤ H2, defined as

ρ^x_1(h) = ρ1(h1(γ(x)H1, h)) = ρ1(γ(x)−1hγ(x)). (13)

Proof. In supplementary material. For examples, see Section 6.

3.2 Local Sections on G/H

We have seen that an equivariant map between spaces of Mackey functions can always be realized as a cross-correlation on G, and we have studied the properties of the kernel, which can be encoded as a kernel on G or G/H1 or H2\G/H1, subject to the appropriate constraints. When implementing a G-CNN, it would be wasteful to use a Mackey function on G, so we need to understand what this map looks like for fields realized by local functions f : U → V for U ⊆ G/H1. This is done by sandwiching the cross-correlation κ⋆ : I^1_G → I^2_G with the lifting isomorphisms Λi : I^i_C → I^i_G:

[Λ2−1[κ ⋆ [Λ1f]]](x) = ∫_{G/H1} κ(s2(x)−1s1(y))f(y)dy = ∫_{G/H1} ←κ(s2(x)−1y)ρ1(h1(s2(x)−1s1(y)))f(y)dy (14)

We refer to this as the ρ1-twisted cross-correlation on G/H1. We note that for semidirect product groups, the ρ1 factor disappears and we are left with a standard cross-correlation on G/H1 with an equivariant kernel ←κ ∈ KC.
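Theorem 3.2 can likewise be exercised on a finite group, where the integral in Eq. 7 becomes a sum. The sketch below is our own toy setup (G = Z8, H1 = H2 = {0, 4}, ρ1 = ρ2 the sign representation): since the constraint in Eq. 8 is linear, averaging an arbitrary kernel over H1 × H2 projects it into KG, and correlating with the result maps Mackey functions to Mackey functions.

```python
import numpy as np

n, H = 8, (0, 4)                       # G = Z_8 (additive), H1 = H2 = {0, 4}
def rho(h):                            # sign representation, used for both rho_1 and rho_2
    return -1.0 if h % n == 4 else 1.0

rng = np.random.default_rng(0)
kappa = rng.normal(size=n)             # an arbitrary one-argument kernel on G

# project onto K_G by group averaging (Eq. 8; abelian, so h2 g h1 = h2 + g + h1)
kt = np.array([np.mean([rho(h2) * kappa[(h2 + g + h1) % n] * rho(h1)
                        for h1 in H for h2 in H]) for g in range(n)])
assert not np.allclose(kt, 0.0)        # the equivariant kernel space is non-trivial here
for g in range(n):
    for h1 in H:
        for h2 in H:                   # bi-equivariance constraint of Eq. 8
            assert np.isclose(kt[(h2 + g + h1) % n], rho(h2) * kt[g] * rho(h1))

def corr(k, f):                        # [k * f](g) = sum_{g'} k(g^-1 g') f(g')
    return np.array([sum(k[(gp - g) % n] * f[gp] for gp in range(n)) for g in range(n)])

base = rng.normal(size=4)
f1 = np.concatenate([base, -base])     # a Mackey function: f(g + 4) = -f(g)
out = corr(kt, f1)
# the output again satisfies the Mackey condition, i.e. it lands in I^2_G
assert all(np.isclose(out[(g + 4) % n], -out[g]) for g in range(n))
# and the map intertwines the induced representations: kt * [pi_1(u) f] = pi_2(u) [kt * f]
for u in range(n):
    fu = np.array([f1[(k - u) % n] for k in range(n)])
    assert np.allclose(corr(kt, fu), np.array([out[(k - u) % n] for k in range(n)]))
```

With this choice of representations the projection reduces to kt(g) = (κ(g) − κ(g + 4))/2, a one-dimensional constraint per pair of group elements, illustrating how the linear constraint carves out the admissible kernel space.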
We note the similarity of this expression to gauge equivariant convolution as defined in [11].

3.3 Equivariant Nonlinearities

The network as a whole is equivariant if all of its layers are equivariant. So our theory would not be complete without a discussion of equivariant nonlinearities and other kinds of layers. In a regular G-CNN [1], ρ is the regular representation of H, which means that it can be realized by permutation matrices. Since permutations and pointwise nonlinearities commute, any such nonlinearity can be used. For other kinds of representations ρ, special equivariant nonlinearities must be used. Some choices include norm nonlinearities [3] for unitary representations, tensor product nonlinearities [8], or gated nonlinearities where a scalar field is normalized by a sigmoid and then multiplied by another field [6]. Other constructions, such as batchnorm and ResNets, can also be made equivariant [1, 2]. A comprehensive overview and comparison of equivariant nonlinearities can be found in [7].

4 Implementation

Several different approaches to implementing group equivariant CNNs have been proposed in the literature. The implementation details depend on the specific choice of symmetry group G, the homogeneous space G/H, its discretization and the representation ρ. In any case, since the equivariance constraints on convolution kernels are linear, the space of H-equivariant kernels is a linear subspace of the unrestricted kernel space. This implies that it is sufficient to solve for a basis of H-equivariant kernels, in terms of which any equivariant kernel can be expanded using learned weights.

A case of high practical importance is that of equivariant CNNs on Euclidean spaces Rd. Implementations mostly operate on discrete pixel grids.
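The "solve for a basis" recipe can be made concrete. The following sketch is our own minimal instance (not taken from any specific implementation): H = C4 (90-degree rotations) acting on 3 × 3 kernels with trivial ρ1, ρ2, so the constraint reads κ(rx) = κ(x). Writing the constraint as a matrix equation, the null space computed by SVD gives a 3-dimensional basis (center, edge orbit, corner orbit), and any learned combination of basis elements is automatically equivariant.

```python
import numpy as np

# constraint kappa(r x) = kappa(x) for 90-degree rotations r, linearized as A kappa = 0
idx = np.arange(9).reshape(3, 3)
perm = np.rot90(idx).flatten()         # permutation of kernel entries under the rotation r
A = np.eye(9) - np.eye(9)[perm]        # (A kappa)[c] = kappa[c] - kappa[perm[c]]

# the equivariant kernel basis is the null space of A
_, s, vt = np.linalg.svd(A)
basis = vt[s < 1e-10]                  # rows span the space of admissible kernels
assert basis.shape[0] == 3             # center + edge orbit + corner orbit

# any learned combination of basis kernels satisfies the constraint ...
rng = np.random.default_rng(0)
K = (rng.normal(size=3) @ basis).reshape(3, 3)
assert np.allclose(K, np.rot90(K))

# ... and correlating with it is rotation equivariant (zero padding, square input)
def corr2(f, k):
    fp = np.pad(f, 1)
    return np.array([[np.sum(fp[i:i + 3, j:j + 3] * k)
                      for j in range(f.shape[1])] for i in range(f.shape[0])])

img = rng.normal(size=(6, 6))
assert np.allclose(corr2(np.rot90(img), K), np.rot90(corr2(img, K)))
```

For non-trivial ρ1, ρ2 the same procedure applies with a larger constraint matrix acting on matrix-valued kernels; only the null-space dimension changes.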
In this case, the steerable kernel basis is typically pre-sampled on a small grid, linearly combined during the forward pass, and then used in a standard convolution routine. The sampling procedure requires particular attention since it might introduce aliasing artifacts [4, 6]. A more in-depth discussion of an implementation of equivariant CNNs operating on Euclidean pixel grids is provided in [7]. As an alternative to processing signals on a pixel grid, signals on Euclidean spaces might be sampled on an irregular point cloud. In this case the steerable kernel space is typically implemented as an analytical function, which is subsequently sampled on the cloud [5].

Implementations of spherical CNNs depend on the choice of signal representation as well. In [10], the authors choose a spectral approach and represent the signal and kernels in Fourier space. The equivariant convolution is performed by exploiting the convolution theorem. Other approaches define the convolution spatially. In these cases, some grid on the sphere is chosen on which the signal is sampled. As in the Euclidean case, the convolution is performed by matching the signal with an H-equivariant kernel, which is expanded in terms of a pre-computed basis.

5 Related Work

In Appendix D, we provide a systematic classification of equivariant CNNs on homogeneous spaces, according to the theory presented in this paper. Besides these references, several papers deserve special mention. Most closely related is the work of [12], whose theory is analogous to ours, but only covers scalar fields (corresponding to using a trivial representation ρ(h) = I in our theory). A proper treatment of general fields as we do here is more difficult, as it requires the use of fiber bundles and induced representations.
The first use of induced representations and fields in CNNs is [2], and the first CNN on a non-trivial homogeneous space (the sphere) is [16].

A framework for (non-convolutional) networks equivariant to finite groups was presented by [17], and equivariant set and graph networks are analyzed by [18–21]. Our use of fields (with ρ block-diagonal) can be viewed as a formalization of convolutional capsules [22, 23]. Other related work includes [24–31]. A preliminary version of this paper appeared as [32].

For mathematical background, we recommend [13, 33–37]. The study of induced representations and equivariant maps between them was pioneered by Mackey [38–41], who rigorously proved results essentially similar to the ones in this paper, though in a more abstract form whose relevance to the theory of equivariant CNNs may not be easy to recognize.

6 Concrete Examples

6.1 The rotation group SO(3) and spherical CNNs

The group of 3D rotations SO(3) is a three-dimensional manifold that can be parameterized by ZYZ Euler angles α ∈ [0, 2π), β ∈ [0, π] and γ ∈ [0, 2π), i.e. g = Z(α)Y(β)Z(γ) (where Z and Y denote rotations around the Z and Y axes). For this example we choose H = H1 = H2 = SO(2) = {Z(α) | α ∈ [0, 2π)} as the group of rotations around the Z-axis, i.e. the stabilizer subgroup of the north pole of the sphere. A left H-coset is then a subset of SO(3) of the form

gH = {Z(α)Y(β)Z(γ)Z(α′) | α′ ∈ [0, 2π)} = {Z(α)Y(β)Z(α′) | α′ ∈ [0, 2π)}.

Thus, the coset space G/H is the sphere S^2, parameterized by spherical coordinates α and β.
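This coset structure can be checked numerically (a small sketch of our own, using the ZYZ conventions above): since Z(γ) fixes the north pole n = (0, 0, 1), every element of a coset gH maps n to the same point of the sphere, determined by (α, β) alone.

```python
import numpy as np

def Z(a):  # rotation around the Z axis
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def Y(b):  # rotation around the Y axis
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

n = np.array([0.0, 0.0, 1.0])  # north pole, stabilized by H = SO(2)
alpha, beta = 0.7, 1.2

# All elements Z(alpha) Y(beta) Z(gamma) of one coset gH move n to the same point:
points = [Z(alpha) @ Y(beta) @ Z(gamma) @ n
          for gamma in np.linspace(0, 2 * np.pi, 10)]
assert all(np.allclose(p, points[0]) for p in points)

# That point has the spherical coordinates (alpha, beta):
expected = np.array([np.sin(beta) * np.cos(alpha),
                     np.sin(beta) * np.sin(alpha),
                     np.cos(beta)])
assert np.allclose(points[0], expected)
```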
As expected, the stabilizer Hx of a point x ∈ S^2 is the set of rotations around the axis through x, which is isomorphic to H = SO(2).

Figure 6: Quotients of SO(3) and SE(3).

What about the double coset space (Appendix A.1)? The orbit of a point x(α, β) ∈ S^2 under H is a circle around the Z axis at latitude β, so the double coset space H\G/H, which indexes these orbits, is the segment [0, π] (see Fig. 6).

The section s : G/H → G may be defined (almost everywhere) as s(α, β) = Z(α)Y(β) ∈ SO(3), and γ(β) = Y(β) ∈ SO(3). Then the stabilizer H_2^{γ(β)H_1} for β ∈ H\G/H is the set of Z-axis rotations that leave the point γ(β)H_1 = (0, β) ∈ S^2 invariant. For the north and south pole (β = 0 or β = π), this stabilizer is all of H = SO(2), but for other points it is the trivial subgroup {e}.

Thus, according to Theorem 3.4, the equivariant kernels are matrix-valued functions on the segment [0, π] that are mostly unconstrained (except at the poles). As functions on G/H1 (Theorem 3.3), they are matrix-valued functions satisfying ←κ(rx) = ρ2(r)←κ(x)ρ1(h1(x, r)^{-1}) for r ∈ SO(2) and x ∈ S^2. This says that, as a function on the sphere, ←κ is determined on SO(2)-orbits {rx | r ∈ SO(2)} (latitudinal circles around the Z axis) by its value at one point of the orbit. Indeed, if ρ(h) = 1 is the trivial representation, we see that ←κ is constant on these orbits, in agreement with [42], who use isotropic filters. For ρ2 a regular representation of SO(2), we recover the non-isotropic method of [10]. For segmentation tasks, one can use a trivial representation for ρ2 in the output layer to obtain a scalar feature map on S^2, analogous to [43].
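As a numeric illustration of this orbit structure (our own sketch, not from the cited works): discretize one latitudinal circle into N samples, take ρ1 trivial and ρ2(r) the standard 2D rotation representation, and impose the constraint ←κ(rx) = ρ2(r)←κ(x) between neighbouring samples. The resulting linear system has a two-dimensional solution space: the value of ←κ at a single point of the orbit is free and determines it on the whole circle.

```python
import numpy as np

N = 12                                   # samples on one latitudinal circle
theta = 2 * np.pi / N
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # rho_2: standard 2D rep of SO(2)

# Unknowns: kernel values kappa_i in R^2 at the N orbit samples (rho_1 trivial).
# Equivariance between neighbouring samples: kappa_{i+1} - R kappa_i = 0.
A = np.zeros((2 * N, 2 * N))
for i in range(N):
    j = (i + 1) % N
    A[2*i:2*i+2, 2*j:2*j+2] = np.eye(2)
    A[2*i:2*i+2, 2*i:2*i+2] = -R

# The nullspace is two-dimensional: the value at one orbit point is free.
sv = np.linalg.svd(A, compute_uv=False)
dim = int(np.sum(sv < 1e-8))
print(dim)  # 2

# Explicitly: pick kappa_0 freely and propagate it around the circle; the
# orbit closes up consistently because rho_2(2 pi) is the identity.
kappa = [np.array([0.3, -1.1])]
for _ in range(N):
    kappa.append(R @ kappa[-1])
assert np.allclose(kappa[N], kappa[0])
```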
Other choices, such as taking ρ to be the standard 2D representation of SO(2), would make it possible to build spherical CNNs that can process vector fields, but this has not been done yet.

6.2 The roto-translation group SE(3) and 3D Steerable CNNs

The group of rigid body motions SE(3) is a 6D manifold R^3 ⋊ SO(3). We choose H = H1 = H2 = SO(3) (rotations around the origin). A left H-coset is a set of the form gH = trH = {trr′ | r′ ∈ SO(3)} = {tr | r ∈ SO(3)}, where t is the translation component of g. Thus, the coset space G/H is R^3. The stabilizer Hx of a point x ∈ R^3 is the set of rotations around x, which is isomorphic to SO(3). The orbit of a point x ∈ R^3 is a spherical shell of radius ‖x‖, so the double coset space H\G/H, which indexes these orbits, is the set of radii [0, ∞).

Since SE(3) is a trivial principal SO(3) bundle, we can choose a global section s : G/H → G by taking s(x) to be the translation by x. As double coset representatives we can choose γ(‖x‖) to be the translation by (0, 0, ‖x‖). Then the stabilizer H_2^{γ(‖x‖)H_1} for ‖x‖ ∈ H\G/H is the set of rotations around Z, i.e. SO(2), except for ‖x‖ = 0, where it is SO(3).

For any representations ρ1, ρ2, the equivariant maps between sections of the associated vector bundle are given by convolutions with matrix-valued kernels on R^3 that satisfy ←κ(rx) = ρ2(r)←κ(x)ρ1(r^{-1}) for r ∈ SO(3) and x ∈ R^3. This follows from Theorem 3.3 with the simplification h1(x, r) = r for all r ∈ H, because SE(3) is a semidirect product (Appendix A.2). Alternatively, we can define ←κ in terms of κ̄, a kernel on H\G/H = [0, ∞) satisfying κ̄(x) = ρ2(r)κ̄(x)ρ1(r^{-1}) for r ∈ SO(2) and x ∈ [0, ∞).
This is in agreement with the results obtained by [6].

7 Conclusion

In this paper we have developed a general theory of equivariant convolutional networks on homogeneous spaces using the formalism of fiber bundles and fields. Field theories are the de facto standard formalism for modern physical theories, and this paper shows that the same formalism can elegantly describe the de facto standard learning machine: the convolutional network and its generalizations. By connecting this very successful class of networks to modern theories in mathematics and physics, our theory provides many opportunities for the development of new theoretical insights about deep learning, and the development of new equivariant network architectures.

References

[1] Taco S Cohen and Max Welling. Group equivariant convolutional networks. In Proceedings of The 33rd International Conference on Machine Learning (ICML), volume 48, pages 2990–2999, 2016.

[2] Taco S Cohen and Max Welling. Steerable CNNs. In International Conference on Learning Representations (ICLR), 2017.

[3] Daniel E Worrall, Stephan J Garbin, Daniyar Turmukhambetov, and Gabriel J Brostow. Harmonic networks: Deep translation and rotation equivariance. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.

[4] Maurice Weiler, Fred A Hamprecht, and Martin Storath. Learning steerable filters for rotation equivariant CNNs. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.

[5] Nathaniel Thomas, Tess Smidt, Steven Kearnes, Lusann Yang, Li Li, Kai Kohlhoff, and Patrick Riley. Tensor field networks: Rotation- and Translation-Equivariant neural networks for 3D point clouds. arXiv:1802.08219 [cs.LG], 2018.

[6] Maurice Weiler, Mario Geiger, Max Welling, Wouter Boomsma, and Taco Cohen. 3D steerable CNNs: Learning rotationally equivariant features in volumetric data.
In Advances in Neural Information Processing Systems (NeurIPS), 2018.

[7] Maurice Weiler and Gabriele Cesa. General E(2)-Equivariant Steerable CNNs. In Advances in Neural Information Processing Systems (NeurIPS), 2019.

[8] Risi Kondor. N-body networks: a covariant hierarchical neural network architecture for learning atomic potentials. arXiv:1803.01588 [cs.LG], 2018.

[9] Risi Kondor, Hy Truong Son, Horace Pan, Brandon Anderson, and Shubhendu Trivedi. Covariant compositional networks for learning graphs. arXiv:1801.02144 [cs.LG], January 2018.

[10] Taco S Cohen, Mario Geiger, Jonas Koehler, and Max Welling. Spherical CNNs. In International Conference on Learning Representations (ICLR), 2018.

[11] Taco S Cohen, Maurice Weiler, Berkay Kicanaoglu, and Max Welling. Gauge Equivariant Convolutional Networks and the Icosahedral CNN. In International Conference on Machine Learning (ICML), 2019.

[12] Risi Kondor and Shubhendu Trivedi. On the generalization of equivariance and convolution in neural networks to the action of compact groups. In International Conference on Machine Learning (ICML), 2018.

[13] Mark Hamilton. Mathematical Gauge Theory: With Applications to the Standard Model of Particle Physics. Universitext. Springer International Publishing, 2017. ISBN 978-3-319-68438-3. doi: 10.1007/978-3-319-68439-0.

[14] Marysia Winkels and Taco S Cohen. 3D G-CNNs for pulmonary nodule detection. In International Conference on Medical Imaging with Deep Learning (MIDL), 2018.

[15] Daniel Worrall and Gabriel Brostow. CubeNet: Equivariance to 3D rotation and translation. In European Conference on Computer Vision (ECCV), 2018.

[16] Taco S Cohen, Mario Geiger, Jonas Koehler, and Max Welling. Convolutional Networks for Spherical Signals. In ICML Workshop on Principled Approaches to Deep Learning, 2017.

[17] Siamak Ravanbakhsh, Jeff Schneider, and Barnabas Poczos. Equivariance through Parameter-Sharing.
In International Conference on Machine Learning (ICML), 2017.

[18] Haggai Maron, Heli Ben-Hamu, Nadav Shamir, and Yaron Lipman. Invariant and Equivariant Graph Networks. In International Conference on Learning Representations (ICLR), 2019.

[19] Haggai Maron, Ethan Fetaya, Nimrod Segol, and Yaron Lipman. On the Universality of Invariant Networks. In International Conference on Machine Learning (ICML), 2019.

[20] Nimrod Segol and Yaron Lipman. On Universal Equivariant Set Networks. arXiv:1910.02421 [cs, stat], October 2019.

[21] Nicolas Keriven and Gabriel Peyré. Universal Invariant and Equivariant Graph Neural Networks. In Neural Information Processing Systems (NeurIPS), 2019.

[22] Sara Sabour, Nicholas Frosst, and Geoffrey E Hinton. Dynamic routing between capsules. In I Guyon, U V Luxburg, S Bengio, H Wallach, R Fergus, S Vishwanathan, and R Garnett, editors, Advances in Neural Information Processing Systems 30, pages 3856–3866. Curran Associates, Inc., 2017.

[23] Geoffrey Hinton, Nicholas Frosst, and Sara Sabour. Matrix capsules with EM routing. In International Conference on Learning Representations (ICLR), 2018.

[24] Chris Olah. Groups and group convolutions. https://colah.github.io/posts/2014-12-Groups-Convolution/, 2014.

[25] R Gens and P Domingos. Deep symmetry networks. In Advances in Neural Information Processing Systems (NIPS), 2014.

[26] Laurent Sifre and Stephane Mallat. Rotation, scaling and deformation invariant scattering for texture discrimination. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.

[27] E Oyallon and S Mallat. Deep Roto-Translation scattering for object classification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2865–2873, 2015.

[28] Stéphane Mallat. Understanding deep convolutional networks. Philos. Trans. A Math. Phys. Eng.
Sci., 374(2065):20150203, April 2016.

[29] Jan J Koenderink. The brain a geometry engine. Psychol. Res., 52(2-3):122–127, 1990.

[30] Jan Koenderink and Andrea van Doorn. The structure of visual spaces. J. Math. Imaging Vis., 31(2):171, April 2008.

[31] Jean Petitot. The neurogeometry of pinwheels as a sub-Riemannian contact structure. J. Physiol. Paris, 97(2-3):265–309, 2003.

[32] Taco S Cohen, Mario Geiger, and Maurice Weiler. Intertwiners between induced representations (with applications to the theory of equivariant neural networks). arXiv:1803.10743 [cs.LG], March 2018.

[33] R W Sharpe. Differential Geometry: Cartan's Generalization of Klein's Erlangen Program. 1997.

[34] Adam Marsh. Gauge theories and fiber bundles: Definitions, pictures, and results. July 2016.

[35] G B Folland. A Course in Abstract Harmonic Analysis. CRC Press, 1995.

[36] T Ceccherini-Silberstein, A Machí, F Scarabotti, and F Tolli. Induced representations and Mackey theory. J. Math. Sci., 156(1):11–28, January 2009.

[37] David Gurarie. Symmetries and Laplacians: Introduction to Harmonic Analysis, Group Representations and Applications. Elsevier B.V., 1992.

[38] George W Mackey. On induced representations of groups. Amer. J. Math., 73(3):576–592, July 1951.

[39] George W Mackey. Induced representations of locally compact groups I. Ann. Math., 55(1):101–139, 1952.

[40] George W Mackey. Induced representations of locally compact groups II: The Frobenius reciprocity theorem. Ann. Math., 58(2):193–221, 1953.

[41] George W Mackey. Induced Representations of Groups and Quantum Mechanics. W.A. Benjamin Inc., New York-Amsterdam, 1968.

[42] Carlos Esteves, Christine Allen-Blanchette, Ameesh Makadia, and Kostas Daniilidis. 3D object classification and retrieval with spherical CNNs.
In European Conference on Computer Vision (ECCV), 2018.

[43] Jim Winkens, Jasper Linmans, Bastiaan S Veeling, Taco S Cohen, and Max Welling. Improved Semantic Segmentation for Histopathology using Rotation Equivariant Convolutional Networks. In International Conference on Medical Imaging with Deep Learning (MIDL workshop), 2018.

[44] Michael M Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre Vandergheynst. Geometric deep learning: going beyond Euclidean data. IEEE Signal Processing Magazine, 34, 2017.

[45] Yann LeCun, Bernhard E Boser, John S Denker, Donnie Henderson, R E Howard, Wayne E Hubbard, and Lawrence D Jackel. Handwritten digit recognition with a Back-Propagation network. In D S Touretzky, editor, Advances in Neural Information Processing Systems 2, pages 396–404. Morgan-Kaufmann, 1990.

[46] S Dieleman, J De Fauw, and K Kavukcuoglu. Exploiting cyclic symmetry in convolutional neural networks. In International Conference on Machine Learning (ICML), 2016.

[47] Emiel Hoogeboom, Jorn W T Peters, Taco S Cohen, and Max Welling. HexaConv. In International Conference on Learning Representations (ICLR), 2018.

[48] Yanzhao Zhou, Qixiang Ye, Qiang Qiu, and Jianbin Jiao. Oriented response networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[49] Erik J Bekkers, Maxime W Lafarge, Mitko Veta, Koen A J Eppenhof, and Josien P W Pluim. Roto-Translation covariant convolutional networks for medical image analysis. In Medical Image Computing and Computer Assisted Intervention (MICCAI), 2018.

[50] Diego Marcos, Michele Volpi, Nikos Komodakis, and Devis Tuia. Rotation equivariant vector field networks. In International Conference on Computer Vision (ICCV), 2017.

[51] Rohan Ghosh and Anupam K Gupta. Scale steerable filters for locally scale-invariant convolutional neural networks. arXiv preprint arXiv:1906.03861, 2019.

[52] Daniel E Worrall and Max Welling. Deep scale-spaces: Equivariance over scale.
In International Conference on Machine Learning (ICML), 2019.

[53] Ivan Sosnovik, Michał Szmaja, and Arnold Smeulders. Scale-equivariant steerable networks, 2019.

[54] Risi Kondor, Zhen Lin, and Shubhendu Trivedi. Clebsch–Gordan Nets: A Fully Fourier Space Spherical Convolutional Neural Network. In Conference on Neural Information Processing Systems (NeurIPS), 2018.

[55] Brandon Anderson, Truong-Son Hy, and Risi Kondor. Cormorant: Covariant molecular neural networks. arXiv preprint arXiv:1906.04015, 2019.

[56] Nathanaël Perraudin, Michaël Defferrard, Tomasz Kacprzak, and Raphael Sgier. DeepSphere: Efficient spherical Convolutional Neural Network with HEALPix sampling for cosmological applications. Astronomy and Computing, 27, 2018.

[57] Chiyu Jiang, Jingwei Huang, Karthik Kashinath, Prabhat, Philip Marcus, and Matthias Niessner. Spherical CNNs on unstructured grids. In International Conference on Learning Representations (ICLR), 2019.