{"title": "From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction", "book": "Advances in Neural Information Processing Systems", "page_first": 8537, "page_last": 8547, "abstract": "Recently, deep feedforward neural networks have achieved considerable success in modeling biological sensory processing, in terms of reproducing the input-output map of sensory neurons. However, such models raise profound questions about the very nature of explanation in neuroscience. Are we simply replacing one complex system (a biological circuit) with another (a deep network), without understanding either? Moreover, beyond neural representations, are the deep network's computational mechanisms for generating neural responses the same as those in the brain? Without a systematic approach to extracting and understanding computational mechanisms from deep neural network models, it can be difficult both to assess the degree of utility of deep learning approaches in neuroscience, and to extract experimentally testable hypotheses from deep networks. We develop such a systematic approach by combining dimensionality reduction and modern attribution methods for determining the relative importance of interneurons for specific visual computations. We apply this approach to deep network models of the retina, revealing a conceptual understanding of how the retina acts as a predictive feature extractor that signals deviations from expectations for diverse spatiotemporal stimuli. For each stimulus, our extracted computational mechanisms are consistent with prior scientific literature, and in one case yield a new mechanistic hypothesis.
Thus overall, this work not only yields insights into the computational mechanisms underlying the striking predictive capabilities of the retina, but also places the framework of deep networks as neuroscientific models on firmer theoretical foundations, by providing a new roadmap to go beyond comparing neural representations to extracting and understanding computational mechanisms.", "full_text": "From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction

Hidenori Tanaka1,2,†, Aran Nayebi3, Niru Maheswaranathan3,5, Lane McIntosh3, Stephen A. Baccus4, and Surya Ganguli2,5,†

1Physics & Informatics Laboratories, NTT Research, Inc., East Palo Alto, CA, USA
2Department of Applied Physics, Stanford University, Stanford, CA, USA
3Neurosciences PhD Program, Stanford University, Stanford, CA, USA
4Department of Neurobiology, Stanford University, Stanford, CA, USA
5Google Brain, Google, Inc., Mountain View, CA, USA
†{tanaka8,sganguli}@stanford.edu

Abstract

Recently, deep feedforward neural networks have achieved considerable success in modeling biological sensory processing, in terms of reproducing the input-output map of sensory neurons. However, such models raise profound questions about the very nature of explanation in neuroscience. Are we simply replacing one complex system (a biological circuit) with another (a deep network), without understanding either? Moreover, beyond neural representations, are the deep network's computational mechanisms for generating neural responses the same as those in the brain? Without a systematic approach to extracting and understanding computational mechanisms from deep neural network models, it can be difficult both to assess the degree of utility of deep learning approaches in neuroscience, and to extract experimentally testable hypotheses from deep networks. 
We develop such a systematic approach by combining dimensionality reduction and modern attribution methods for determining the relative importance of interneurons for specific visual computations. We apply this approach to deep network models of the retina, revealing a conceptual understanding of how the retina acts as a predictive feature extractor that signals deviations from expectations for diverse spatiotemporal stimuli. For each stimulus, our extracted computational mechanisms are consistent with prior scientific literature, and in one case yield a new mechanistic hypothesis. Thus overall, this work not only yields insights into the computational mechanisms underlying the striking predictive capabilities of the retina, but also places the framework of deep networks as neuroscientific models on firmer theoretical foundations, by providing a new roadmap to go beyond comparing neural representations to extracting and understanding computational mechanisms.

1 Introduction

Deep convolutional neural networks (CNNs) have emerged as state of the art models of a variety of visual brain regions in sensory neuroscience, including the retina [1, 2], primary visual cortex (V1) [3, 4, 5, 6], area V4 [3], and inferotemporal cortex (IT) [3, 4]. Their success has so far been primarily evaluated by their ability to explain reasonably large fractions of variance in biological neural responses across diverse visual stimuli. 
However, fraction of variance explained is of course not the same thing as scientific explanation, as we may simply be replacing one inscrutable black box (the brain) with another (a potentially overparameterized deep network).

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

Figure 1: Deep learning models of the retina trained only on natural scenes reproduce an array of retinal phenomena with artificial stimuli (reproduced from ref. [2]). (A) Training procedure: We analyzed a three-layer convolutional neural network (CNN) model of the retina which takes as input a spatiotemporal natural scene movie and outputs a nonnegative firing rate, corresponding to a retinal ganglion cell response. The first layer consists of eight spatiotemporal convolutional filters (i.e., cell types) of size 15×15×40, the second layer of eight convolutional filters of size 8×11×11, and the fully connected layer predicting the ganglion cells' response. As previously reported in [2], the deep learning model reproduces (B) an omitted stimulus response, (C) latency coding, (D) the motion reversal response, and (E) motion anticipation.

Indeed, any successful scientific model of a biological circuit should succeed along three fundamental axes, each of which goes above and beyond the simple metric of mimicking the circuit's input-output map. First, the intermediate computational mechanisms used by the hidden layers to generate responses should ideally match the intermediate computations in the brain. Second, we should be able to extract conceptual insight into how the neural circuit generates nontrivial responses to interesting stimuli (for example, responses to stimuli that cannot be generated by a linear receptive field). 
And third, such insights should suggest new experimentally testable hypotheses that can drive the next generation of neuroscience experiments.

However, it has been traditionally difficult to systematically extract computational mechanisms, and consequently conceptual insights, from deep CNN models due to their considerable complexity [7, 8]. Here we provide a method to do so based on the idea of model reduction, whose goal is to systematically extract a simple, reduced, minimal subnetwork that is most important in generating a complex CNN's response to any given stimulus. Such subnetworks then both summarize computational mechanisms and yield conceptual insights. We build on ideas from interpretable machine learning, notably methods of input attribution that can decompose a neural response into a sum of contributions either from individual pixels [9] or hidden neurons [10]. To achieve considerable model reduction for responses to spatiotemporal stimuli, we augment and combine such input attribution methods with dimensionality reduction, which, for carefully designed artificial stimuli employed in neurophysiology experiments, often involves simple spatiotemporal averages over stimulus space.

We demonstrate the power of our systematic model reduction procedure to attain mechanistic insights into deep CNNs by applying it to state of the art deep CNN models of the retina [1, 2]. The retina constitutes an ideal first application of our methods because the considerable knowledge (see e.g. [11]) about retinal mechanisms for transducing spatiotemporal light patterns into neural responses enables us to assess whether deep CNNs successfully learn the same computational structure. 
In particular, we obtain deep CNN models from [2] which were trained specifically to mimic the input-output transformation from natural movies to retinal ganglion cell outputs measured in the salamander retina. The model architecture involved a three-layer CNN model of the retina with ReLU nonlinearities (Fig. 1A). This network was previously shown [1, 2] to: (i) yield state of the art models of the retina's response to natural scenes that are almost as accurate as possible given intrinsic retinal stochasticity; (ii) possess internal subunits with similar response properties to those of retinal interneurons, such as bipolar and amacrine cell types; (iii) generalize from natural movies to a wide range of eight different classes of artificially structured stimuli used over decades of neurophysiology experiments to probe retinal response properties. This latter generalization capacity from natural movies to artificially structured stimuli (that were never present in the training data) is intriguing given the vastly different spatiotemporal statistics of the artificial stimuli versus natural stimuli, suggesting the artificial stimuli were indeed well chosen to engage the same retinal mechanisms engaged under natural vision [2]. Here, we focus on understanding the computational mechanisms underlying the deep CNN's ability to reproduce the neural responses to four classes of artificial stimuli (Fig. 
1B-E), each of which, through painstaking experiments and theory, has revealed striking nonlinear retinal computations that advanced our scientific understanding of the retina. The first is the omitted stimulus response (OSR) [12, 13] (Fig. 1B), in which a periodic sequence of full field flashes entrains a retinal ganglion cell to respond periodically, but when a single flash is omitted, the ganglion cell produces an even larger response at the expected time of the response to the omitted flash. Moreover, the timing of this omitted stimulus response occurs at the expected time over a range of frequencies of the periodic flash train, suggesting the retina is somehow retaining a memory trace of the period of the train of flashes. The second is latency encoding [14], in which stronger stimuli yield earlier responses (Fig. 1C). The third is motion reversal [15], in which a bar suddenly reversing its motion near a ganglion cell receptive field generates a much larger response after the motion reversal (Fig. 1D). The fourth is motion anticipation [16], where the neural population responding to a moving bar is advanced in the direction of motion to compensate for propagation delays through the retina (Fig. 1E). These responses are striking because they imply the retina has implicitly built into it a predictive world model codifying simple principles like temporal periodicity and the velocity-based extrapolation of future position. The retina can then use these predictions to improve visual processing (e.g. in motion anticipation), or when these predictions are violated, the retina can generate a large response to signal that deviation (e.g. 
in the OSR and motion reversal).

While experimentally motivated prior theoretical models have been employed to explain the OSR [17, 18], latency encoding [14, 19], motion reversal [20, 21], and motion anticipation [16], to date, no single model other than the deep CNN found in [2] has been able to simultaneously account for retinal ganglion cell responses to both natural scenes and all four of these classes of stimuli, as well as several other classes of artificial stimuli. However, it is difficult to explain the computational mechanisms underlying the deep CNN's ability to generate these responses simply by examining the complex network in Fig. 1A. For example, why does the deep CNN fire more when a stimulus is omitted, or when a bar reverses? How can it anticipate motion to compensate for propagation delays? And why do stronger stimuli cause earlier firing?

These are foundational scientific questions about the retina whose answers require conceptual insights that are not afforded by the existence of a complex but highly predictive CNN alone. And even more importantly, if we could extract conceptual insights into the computational mechanisms underlying CNN responses, would these mechanisms match those used in the biological retina? Or is the deep CNN only accurate at the level of modelling the input-output map of the retina, while being fundamentally inaccurate at the level of underlying mechanisms? 
Adjudicating between these two possibilities is essential for validating whether the deep learning approach to modelling in sensory neuroscience can indeed succeed in elucidating biological neural mechanisms, which has traditionally been the gold standard of circuit-based understanding in systems neuroscience [11, 22, 23, 24].

In the following we will show how a combination of dimensionality reduction and hidden neuron or stimulus attribution can yield simplified subnetwork models of the deep CNN's response to stimuli, finding models that are consistent with prior mechanistic models with experimental support in the case of latency encoding, motion reversal, and motion anticipation. In addition, our analysis yields a new model that cures the inadequacies of previous models of the OSR. Thus our overall approach provides a new roadmap to extract mechanistic insights into deep CNN function, confirms in the case of the retina that deep CNNs do indeed learn computational mechanisms that are similar to those used in biological circuits, and yields a new experimentally testable hypothesis about retinal computation. Moreover, our results in the retina yield hope (to be tested in future combined theory and experiments) that more complex deep CNN models of higher visual cortical regions may not only yield accurate black box models of input-output transformations, but may also yield veridical and testable hypotheses about intermediate computational mechanisms underlying these transformations, thereby potentially placing deep CNN models of sensory brain regions on firmer epistemological foundations.

2 From deep CNNs to neural mechanisms through model reduction

To extract understandable reduced models from the millions of parameters comprising the deep CNN in Fig. 
1A and [2], we first reduce dimensionality by exploiting spatial invariance present in the artificial stimuli carefully designed to specifically probe retinal physiology (Fig. 1B-E), and then carve out important sub-circuits using modern attribution methods [9, 10]. We proceed in 3 steps:

Step (1): Quantify the importance of a model unit with integrated gradients. The nonlinear input-output map of our deep CNN can be expressed as $r(t) = F[s(t)]$, where $r(t) \in \mathbb{R}_+$ denotes the nonnegative firing rate of a ganglion cell at time bin $t$ and $s(t) \in \mathbb{R}^{50 \times 50 \times 40}$ denotes the recent spatiotemporal history of the visual stimulus spanning two dimensions of space $(x, y)$ (with 50 spatial bins in each dimension) as well as 40 preceding time bins parameterized by $\Delta t$. Thus a single component of the vector $s(t)$ is given by $s_{xy\Delta t}(t)$, which denotes the stimulus contrast at position $(x, y)$ at time $t - \Delta t$. We assume a zero contrast stimulus yields no response (i.e. $F[0] = 0$). We can decompose, or attribute, the response $r(t)$ to each preceding spacetime point by considering a straight path in spatiotemporal stimulus space from the zero stimulus to $s(t)$ given by $s(t; \alpha) = \alpha s(t)$, where the path parameter $\alpha$ ranges from 0 to 1 [9]. Using the line integral $F[s(t; 1)] = \int_0^1 d\alpha \, \frac{\partial F}{\partial s}\big|_{s(t,\alpha)} \cdot \frac{\partial s(t,\alpha)}{\partial \alpha}$, we obtain

$$r(t) = F[s(t)] = \sum_{x=1}^{50} \sum_{y=1}^{50} \sum_{\Delta t=1}^{40} s_{xy\Delta t}(t) \int_0^1 d\alpha \, \frac{\partial F}{\partial s_{xy\Delta t}(t)}\bigg|_{\alpha s(t)} \equiv \sum_{x=1}^{50} \sum_{y=1}^{50} \sum_{\Delta t=1}^{40} A_{xy\Delta t}. \quad (1)$$

This equation represents an exact decomposition of the response $r(t)$ into attributions $A_{xy\Delta t}$ from each preceding spacetime stimulus pixel $(x, y, \Delta t)$. Intuitively, the magnitude of $A_{xy\Delta t}$ tells us how important each pixel is in generating the response, and the sign tells us whether or not turning on each pixel from 0 to $s_{xy\Delta t}(t)$ yields a net positive or negative contribution to $r(t)$. When $F$ is linear, this decomposition reduces to a Taylor expansion of $F$ about $s(t) = 0$. However, in the nonlinear case, this decomposition has the advantage that it is exact, while the linear Taylor expansion $r(t) \approx \frac{\partial F}{\partial s(t)}\big|_{s=0} \cdot s(t)$ is only approximate. The coefficient vector $\frac{\partial F}{\partial s(t)}\big|_{s=0}$ of this Taylor expansion is often thought of as the linear spacetime receptive field (RF) of the model ganglion cell, a concept that dominates sensory neuroscience. Thus choosing to employ this attribution method enables us to go beyond the dominant but imperfect notion of an RF, in order to understand nonlinear neural responses to arbitrary spatiotemporal stimuli. In supplementary material, we discuss how this theoretical framework of attribution to input space can be temporally extended to answer different questions about how deep networks process spatiotemporal inputs.

However, since our main focus here is model reduction, we consider instead attributing the ganglion cell response back to the first layer of hidden units, to quantify their importance. We denote by $z^{[1]}_{cxy}(t) = W^{[1]}_{cxy} \circledast s(t) + b_{cxy}$ the pre-nonlinearity activation of the layer 1 hidden units, where $W_{cxy}$ and $b_{cxy}$ are the convolutional filters and biases of a unit in channel $c$ ($c = 1, \dots, 8$) at convolutional position $(x, y)$ (with $x, y = 1, \dots, 36$). Now computing the line integral $F[s(t; 1)] = \int_0^1 d\alpha \, \frac{\partial F}{\partial z^{[1]}}\big|_{s(t,\alpha)} \cdot \frac{\partial z^{[1]}}{\partial \alpha}$ over the same stimulus path employed in (1) yields

$$r(t) = \sum_{x,y,c} \left[ \int_0^1 d\alpha \, \frac{\partial F}{\partial z^{[1]}_{cxy}}\bigg|_{s(t,\alpha)} \right] (W^{[1]}_{cxy} \circledast s) = \sum_{x,y,c} [G_{cxy}(s)] \, (W^{[1]}_{cxy} \circledast s) = \sum_{x,y,c} A_{cxy}. \quad (2)$$

This represents an exact decomposition of the response $r(t)$ into attributions $A_{cxy}$ from each subunit at the same time $t$ (since all CNN filters beyond the first layer are purely spatial). This attribution further splits into a product of $W^{[1]}_{cxy} \circledast s$, reflecting the activity of that subunit originating from spatiotemporal filtering of the preceding stimulus history, and an effective stimulus dependent weight $G_{cxy}(s)$ from each subunit to the ganglion cell, reflecting how variations in subunit activity $z^{[1]}_{cxy}$ as the stimulus is turned on from 0 to $s(t)$ yield a net impact on the response $r(t)$. A positive (negative) effective weight indicates that increasing subunit activity along the stimulus path yields a net excitatory (inhibitory) effect on $r(t)$.

Step (2): Exploiting stimulus invariances to reduce dimensionality. The attribution of the response $r(t)$ to first layer subunits in (2) still involves $8 \times 36 \times 36 = 10{,}368$ attributions. We can, however, leverage the spatial uniformity of artificial stimuli used in neurophysiology experiments to reduce this dimensionality. For example, in the OSR and latency coding, stimuli are spatially uniform, implying $W^{[1]}_{cxy} \circledast s \equiv W^{[1]}_{c} \circledast s$ is independent of spatial indices $(x, y)$. Thus, we can reduce the number of attributions to the number of channels via

$$r(t) = \sum_{c=1}^{8} \left( \sum_{x=1}^{36} \sum_{y=1}^{36} G_{cxy}(s) \right) \cdot (W^{[1]}_{c} \circledast s) \equiv \sum_{c=1}^{8} G_c(s) \cdot (W^{[1]}_{c} \circledast s) \equiv \sum_{c=1}^{8} A_c. \quad (3)$$

For the moving bar in both motion reversal and motion anticipation, $W^{[1]}_{cxy} \circledast s \equiv W^{[1]}_{cx} \circledast s$ is independent of the $y$ index and we can reduce the dimensionality from 10,368 down to 288 by

$$r(t) = \sum_{c=1}^{8} \sum_{x=1}^{36} \left( \sum_{y=1}^{36} G_{cxy}(s) \right) \cdot (W^{[1]}_{cx} \circledast s) \equiv \sum_{c=1}^{8} \sum_{x=1}^{36} G_{cx}(s) \cdot (W^{[1]}_{cx} \circledast s) \equiv \sum_{c=1}^{8} \sum_{x=1}^{36} A_{cx}. \quad (4)$$

More generally, for other stimuli with no obvious spatial invariances, one could still attempt to reduce dimensionality by performing PCA or other dimensionality reduction methods on the space of hidden unit pre-activations or attributions over time. We leave this intriguing direction for future work.

Step (3): Building reduced models from important subunits. Finally, we can construct minimal circuit models by first identifying "important" units, defined as those with large magnitude attributions $A$. We then construct our reduced model as a one hidden layer neural network composed of only the important hidden units, with effective connectivity from each hidden unit to the ganglion cell determined by the effective weights $G$ in (2), (3), or (4).

3 Results: the computational structure of retinal prediction

We now apply the systematic model reduction steps described in the previous section to each of the retinal stimuli in Fig. 1B-E. 
We show that in each case the reduced model yields scienti\ufb01c\nhypotheses to explain the response, often consistent with prior experimental and theoretical work,\nthereby validating deep CNNs as a method for veridical scienti\ufb01c hypothesis generation in this setting.\nMoreover, our approach yields integrative conceptual insights into how these diverse computations\ncan all be simultaneously produced by the same set of hidden units.\n\n3.1 Omitted stimulus response\n\nAs shown in Fig. 1B, periodic stimulus \ufb02ashes trigger delayed periodic retinal responses. However,\nwhen this periodicity is violated by omitting a \ufb02ash, the ganglion cell signals the violation with a\nlarge burst of \ufb01ring [25, 26]. This OSR phenomenon is observed across several species including\nsalamander [12, 13]. Interestingly, for periodic \ufb02ashes in the range of 6-12Hz, the latency between\nthe last \ufb02ash before the omitted one, and the burst peak in the response, is proportional to the period\nof the train of \ufb02ashes [12, 13], indicating the retina retains a short memory trace of this period.\nMoreover, pharmacological experiments suggest ON bipolar cells are required to produce the OSR\n[13, 17], which have been shown to correspond to the \ufb01rst layer hidden units in the deep CNN [1, 2].\nThese phenomena raise two fundamental questions: what computational mechanism causes the large\namplitude burst, and how is the timing of the peak sensitive to the period of the \ufb02ashes? There are\ntwo theoretical models in the literature that aim to answer these questions. One proposes that the\nbipolar cell activity responds to each individual \ufb02ash with an oscillatory response whose period\nadapts to the period of the \ufb02ash train [18]. However, recent direct recordings of bipolar cells suggest\nthat such period adaptation is not present [27]. 
The other model claims that dual pathways of ON and OFF bipolar cells are enough to reproduce most aspects of the phenomena observed in experiments [17]. However, that model only reproduces the shift of the onset of the burst, and not a shift in the peak of the burst, which carries the critical predictive latency [18].

Direct model reduction (Fig. 2) of the deep CNN in Fig. 1A using the methods of section 2 yields a more sophisticated model than any prior model, composed of three important pathways that combine one OFF temporal filter with two ON temporal filters. Unlike prior models, the reduced model exhibits a shift in the peak of the OSR burst as a function of the frequency of input flashes.

Figure 2: Omitted stimulus response. (A-1,2) Schematics of the model reduction procedure, keeping only the three most highly contributing units (1 OFF, 2 ON). (B) Attribution for each of the cell types $A_c$ over time. (C) Effective stimulus dependent weight for each of the cell types $G_c$ over time. (D) The combination of the two pathways of filters 2 and 6 reproduces the period dependent latency. (E) Two ON bipolar cells are necessary to capture the predictive latency. Cell 2, with an earlier peak, is only active in the high-frequency regime, while cell 6, with a later peak, is active independent of the frequency.

Fig. 2A presents a schematic of the model reduction steps described in (3). We first attribute a ganglion cell's response to 8 individual channels and then average across both spatial dimensions (Fig. 2A-1) as in steps (1) and (2). Then we build a reduced model from the identified important subunits that capture essential features of the omitted stimulus response phenomenon (Fig. 2A-2). In Fig. 2B, we present the time dependence of the attribution $A_c(s(t))$ in (3) for the eight channels, or cell-types. Red (blue) traces reflect positive (negative) attributions. 
Channel temporal \ufb01lters are\nto the left of each attribution row and the total output response r(t) is on the top row. The stimulus\nconsists of three \ufb02ashes, yielding three small responses, and then one large response after the end of\nthe three \ufb02ashes (grey line). Quantitatively, we identify that cell-type 3 dominantly explains the small\nresponses to preceding \ufb02ashes, while cell types 2 and 6 are necessary to explain the large burst after\nthe \ufb02ash train ends. The \ufb01nal set of units included in the reduced model should be the minimal set\nrequired to capture the de\ufb01ning features of the phenomena of interest. In the case of omitted stimulus\nresponse, the de\ufb01ning feature is the existence of the large amplitude burst whose peak location is\nsensitive to the period of the applied \ufb02ashes. Once we identify the set of essential temporal \ufb01lters,\nwe then proceed to determine the sign and magnitude of contribution (excitatory or inhibitory) of\nthe cell types. In Fig. 2C, we present the time-dependent effective weights from Gc(s(t)) in (3) for\nthe eight cell types, or channels. Red (blue) re\ufb02ects positive (negative) weights. Given the product\nof the temporal \ufb01lters and the weights, cell-types 2 and 6 are effectively ON cells, which cause\npositive ganglion cell responses to contrast increments, while cell-type 3 is an OFF cell, which is a\ncell type that causes positive responses to contrast decrements. Following the prescribed procedures,\ncarving out the 3 important cell-types and effective weights yields a novel, mechanistic three pathway\nmodel of the OSR, with 1 OFF and 2 ON pathways. Unlike prior models, the reduced model exhibits\na shift in the peak of the OSR burst as a function of the frequency of input \ufb02ashes (with dark to\nlight blue indicating high to low frequency variation in the \ufb02ash train) as in Fig. 2D. 
Furthermore, the reduced model is consistent across the frequency range that produces the phenomena. Finally, model reduction yields conceptual insights into how cell-types 2 and 6 enable the timing of the burst peak to remember the period of the flash train (Fig. 2E). The top row depicts the decomposition of the overall burst response $r(t)$ (grey) into time dependent attributions $A_2$ (red) and $A_6$ (blue), obeying the relation $r(t) \approx A_2 + A_6$. Cell-type 2, which has an earlier peak in its temporal filter, preferentially causes ganglion cell responses in high-frequency flash trains (left) compared to low frequency trains (right), while cell-type 6 is equally important in both. The middle row shows the temporal filter $W_{c=2}(\Delta t)$, which has an earlier peak with a long tail, enabling it to integrate across not only the last flash, but also preceding flashes (yellow bars). Time increases into the past from left to right. Thus, the activation of this cell type 2 decreases as the flash train frequency decreases, explaining the decrease in attribution in the top row. The bottom row shows that the temporal filter $W_{c=6}(\Delta t)$ of cell type 6, in contrast, has a later peak with a rapidly decaying tail. Thus the temporal convolution $W_{c=6}(\Delta t) \circledast s(\Delta t)$ of this filter with the flash train is sensitive only to the last flash, and is therefore independent of flash train frequency. The late peak and rapid tail explain why it supports the response at late times independent of frequency in the top row.

Thus, our systematic model reduction approach yields a new model of the OSR that cures important inadequacies of prior models. 
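This frequency dependence can be checked with a toy calculation. The filters below are hand-made caricatures of cell-types 2 and 6 (an early peak with a long tail versus a later, sharp peak), not the fitted CNN filters:

```python
import numpy as np

dt = np.arange(40)                       # Δt: time into the past, in frame bins

# Illustrative temporal filters (not the trained model's): "cell-type 2" has an
# early peak with a long tail, so it integrates over several past flashes;
# "cell-type 6" has a later, sharp peak, so it sees essentially only the last flash.
w2 = np.exp(-dt / 12.0)
w6 = np.exp(-((dt - 7.0) ** 2) / (2 * 1.5 ** 2))

def flash_history(period, last_flash=6, n_bins=40):
    """Binary stimulus history with a flash every `period` bins,
    the most recent flash `last_flash` bins in the past."""
    s = np.zeros(n_bins)
    s[last_flash::period] = 1.0
    return s

high = flash_history(period=6)           # high-frequency flash train
low = flash_history(period=12)           # low-frequency flash train

resp2_high, resp2_low = w2 @ high, w2 @ low
resp6_high, resp6_low = w6 @ high, w6 @ low

# The long-tailed filter's drive grows with flash frequency ...
assert resp2_high > 1.2 * resp2_low
# ... while the sharp, late filter's drive is nearly frequency-independent.
assert abs(resp6_high - resp6_low) < 0.05 * resp6_low
```

The long-tailed filter responds more strongly the more flashes fall inside its integration window, while the sharp late filter is driven almost entirely by the last flash, matching the roles assigned to cell-types 2 and 6 above.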
Moreover, it yields a new, experimentally testable scientific hypothesis that the OSR is an emergent property of three bipolar cell pathways with specific and diverse temporal filtering properties.

3.2 Latency coding

Rapid changes in contrast (which often occur, for example, right after saccades) elicit a burst of firing in retinal ganglion cells with a latency that is shorter for larger contrast changes [14] (Fig. 1C). Moreover, pharmacological studies demonstrate that both ON and OFF bipolar cells (corresponding to first layer hidden neurons in the deep CNN [1, 2]) are necessary to produce this phenomenon [19].

Model reduction via (3) in section 2 reveals that a single pair of slow ON and fast OFF pathways can explain the shift in latency (Fig. 3). First, under a contrast decrement, there is a strong, fast excitatory contribution from the OFF pathway. Second, as the magnitude of the contrast decrement increases, delayed inhibition from the slow ON pathway becomes stronger. This negative delayed contribution truncates excitation from the OFF pathway at late times, thereby causing a shift in the location of the total peak response to earlier times (Fig. 3). The dual pathway mechanism formed by slow ON and fast OFF bipolar cells is consistent with all existing experimental facts. Moreover, it has been previously proposed as a theory of latency coding [14, 19]. Thus this example illustrates the power of a general natural scene based deep CNN training approach, followed by model reduction, to automatically generate veridical scientific hypotheses that were previously discovered only through specialized experiments and analyses requiring significant effort [14, 19].

Figure 3: Latency coding. (A) The decomposition of the overall response $r(t)$ (grey) into dominant attributions $A_3(t)$ (blue) from an OFF pathway, and $A_2(t)$ (red) from an ON pathway, obeying the relation $r(t) \approx A_3 + A_2$. 
Under a contrast decrement, the OFF pathway is activated first, followed by delayed inhibitory input from the ON pathway. (B) As the amount of contrast decrement increases (yellow bars), delayed inhibition from the ON pathway (red) strengthens, which cuts off the total response r(t) at late times more strongly, thereby shifting the location of the peak of r(t) to earlier times.

3.3 Motion reversal

As shown in Fig. 1D and [15], when a moving bar suddenly reverses its direction of motion, ganglion cells near the reversal location exhibit a sharp burst of firing. While a ganglion cell classically responds as the bar moves through its receptive field (RF) center from left to right before the motion reversal, the sharp burst after the motion reversal does not necessarily coincide with the spatial re-entry of the bar into the center of the RF as it moves back from right to left. Instead, the motion-reversal burst response occurs at a fixed temporal latency relative to the time of motion reversal, for a variety of reversal locations within 110 µm of the RF center. These observations raise two fundamental questions: why does the burst occur at all, and why does it occur at a fixed latency?

The classical linear-nonlinear model cannot reproduce the reversal response; it only correctly reproduces the initial peak associated with the initial entry of a bar into the RF center [15]. Thus a nonlinear mechanism is required. Model reduction of the deep CNN obtained via (4) reveals that two input channels arrayed across 1D x space can explain this response through a specific nonlinear mechanism (Fig. 4). Moreover, the second important channel revealed by model reduction yields a cross-cell-type inhibition that explains the fixed latency (Fig. 4D).
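Both the latency-coding mechanism above and the fixed-latency mechanism rest on the same motif: delayed inhibition truncating a slower excitatory drive. A minimal numerical sketch, using hypothetical alpha-function kernels rather than the fitted model's filters, shows that strengthening the delayed inhibition pulls the peak of the summed response to earlier times:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1000)

def alpha(t, tau):
    """Generic smooth kernel that rises and decays, peaking at t = tau."""
    return (t / tau) * np.exp(1.0 - t / tau)

def peak_latency(inhibition_gain):
    """Peak time of fast excitation minus slower, scaled inhibition."""
    excitation = alpha(t, 0.2)                    # fast OFF-like drive
    inhibition = inhibition_gain * alpha(t, 0.5)  # slower, delayed ON-like drive
    r = np.maximum(excitation - inhibition, 0.0)  # rectified summed response
    return t[np.argmax(r)]

# Increasing the delayed inhibition truncates the late response and pulls
# the peak of r(t) to earlier times, as in the latency-coding mechanism.
latencies = [peak_latency(g) for g in (0.0, 0.4, 0.8)]
```

With no inhibition the peak sits at the excitatory kernel's own peak; as the inhibitory gain grows, the late side of the response is clipped and the peak latency monotonically decreases.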
Intriguingly, this reduced model is qualitatively consistent with a recently proposed and experimentally motivated model [20] that points out the crucial role of dual pathways of ON and OFF bipolar cells.

Figure 4: Motion reversal of a moving bar. (A) Schematics of (x, t) spatiotemporal model reduction obtained via (4). By averaging over y we obtain 8 cell types at 36 different x positions, yielding 288 units. Attribution values reveal that only cell types 2 and 3 play a dominant role in motion reversal. (B) The properties of cell type 3 explain the existence of the burst. On the left are time series of pre-nonlinearity activations W[1]_{c=3,x} ∗ s of hidden units whose RF center is at spatial position x. Time t = 0 indicates the time of motion reversal. The boxed region indicates the spatial and temporal extent of the retinal burst in response to motion reversal. The offset of the box from time t = 0 indicates the fixed latency. A fixed linear combination with constant coefficients of this activation cannot explain the existence of the burst, due to cancellations along the vertical x-axis in the boxed region. However, due to downstream nonlinearities, the effective weight coefficients G_{c=3,x} from subunits to ganglion cell responses rapidly flip in sign (middle), generating a burst of motion-reversal response (right). (C) Schematics of the reduced model keeping only important subunits. (D) Attribution contributions from the two dominant cell types, A2 (in pink) and A3 (in blue), where Ac = Σ_{x=1}^{36} A_{cx}. With only cell type 3, the further the reversal location is from a ganglion cell's RF center, the longer we would expect it to take to generate a reversal response. However, the inhibition coming from cell type 2 increases the further away the reversal occurs, truncating the late response and thus fixing the latency of the motion-reversal response.

3.4 Motion anticipation

As shown in Fig. 1E and [16], the retina already starts to compensate for propagation delays by advancing the retinal image of a moving bar along the direction of motion, so that the retinal image does not lag behind the instantaneous location as one might naively expect.

Model reduction of our deep CNN reveals a mechanism for this predictive tracking. First, since ganglion cell RFs have some spatial extent, a moving bar naturally triggers some ganglion cells before entering their RF center, yielding a leading edge of a retinal wave. What is then required for motion anticipation is some additional direction-sensitive inhibition that cuts off the lagging edge of the wave so that its peak activity shifts towards the leading edge. Indeed, model reduction reveals a computational mechanism in which one cell type feeds an excitatory signal to a ganglion cell while the other provides direction-sensitive inhibition that truncates the lagging edge. This model is qualitatively consistent with prior theoretical models that employ such direction-selective inhibition to anticipate motion [16].

Figure 5: Motion anticipation of a moving bar. Contributions from the two dominant cell types: A2 in pink, A3 in blue, r(t) ≈ A2 + A3 in grey, where Ac = Σ_{x=1}^{36} A_{cx}. Depending on the direction of motion of a bar, activity that lags behind the leading edge gets asymmetrically truncated by inhibition from cell type 2 (pink). (A) The bar is moving to the right and the inhibition (pink) is slightly stronger on the left side.
(B) The bar is moving to the left and the inhibition (pink) is stronger on the right side.

4 Discussion

Figure 6: A unified framework to reveal computational structure in the brain. We outlined an automated procedure to go from large-scale neural recordings to mechanistic insights and scientific hypotheses through deep learning and model reduction. We validate our approach on the retina, demonstrating how only three cell types with different ON/OFF and fast/slow spatiotemporal filtering properties can nonlinearly interact to simultaneously generate diverse retinal responses.

In summary, in the case of the retina, we have shown that complex CNN models obtained via machine learning can not only mimic sensory responses to rich natural scene stimuli, but can also serve as a powerful and automatic mechanism for generating valid scientific hypotheses about computational mechanisms in the brain, when combined with our proposed model reduction methods (Fig. 6). Applying this approach to the retina yields conceptual insights into how a single model consisting of multiple nonlinear pathways with diverse spatiotemporal filtering properties can explain decades of painstaking physiological studies of the retina. This suggests, in some sense, an inverse roadmap for experimental design in sensory neuroscience.
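The attribution decomposition r(t) ≈ Σc Ac that runs through these results can be illustrated with a toy one-layer model (a hypothetical stand-in, not the paper's trained CNN). For a single rectified layer with a linear readout, the per-channel attributions decompose the output exactly; the attribution methods the paper builds on [9, 10] generalize this decomposition through deeper nonlinearities.

```python
import numpy as np

rng = np.random.default_rng(1)
T, C, K = 120, 3, 15
s = rng.normal(size=T)              # stimulus time series
W = rng.normal(size=(C, K))         # first-layer temporal filters, one per channel
v = rng.normal(size=C)              # readout weights onto the output cell

# Hidden pre-activations a_c(t) = (w_c * s)(t) and the rectified readout r(t)
a = np.stack([np.convolve(s, W[c], mode="valid") for c in range(C)])
r = v @ np.maximum(a, 0.0)

# Per-channel attributions: for a single rectified layer the effective weight
# is G_c(t) = v_c * 1[a_c(t) > 0], so A_c(t) = G_c(t) * a_c(t), and the
# attributions sum back to the response exactly.
G = v[:, None] * (a > 0.0)
A = G * a
recon = A.sum(axis=0)               # equals r(t)
```

Ranking channels by the magnitude of their attributions A_c is, in miniature, the model-reduction step used throughout: channels with negligible attribution can be pruned while preserving the response.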
Rather than carefully designing special artificial stimuli to probe specific sensory neural responses, and generating individual models tailored to each stimulus, one could instead fit a complex neural network model to neural responses to a rich set of ethologically relevant natural stimuli, and then apply model reduction methods to understand how different parts of a single model can simultaneously account for responses to artificial stimuli across many experiments. The interpretable mechanisms extracted from model reduction then constitute specific hypotheses that can be tested in future experiments. Moreover, the complex model itself can be used to design new stimuli, for example by searching for stimuli that yield divergent responses in the complex model versus a simpler model of the same sensory region. Such stimulus searches could potentially elucidate functional reasons for the existence of model complexity.

In future studies, it will be interesting to conduct a systematic exploration of universality and individuality [28] in the outcomes of model reduction procedures applied to deep learning models that recapitulate desired phenomena but are obtained from different initializations, architectures, and experimental recordings. An intriguing hypothesis is that the reduced models required to explain specific neurobiological phenomena arise as universal computational invariants across the ensemble of deep learning models parameterized by these various design choices, while many other aspects of such deep learning models may vary individually across these choices, reflecting mere accidents of history in initialization, architecture, and training.

It would also be extremely interesting to stack this model reduction procedure to obtain multilayer reduced models that extract computational mechanisms and conceptual insights into deeper CNN models of higher cortical regions.
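The stimulus-search idea above, finding stimuli on which a complex model and a simpler model of the same region disagree, can be sketched as gradient ascent on model disagreement. Everything below is an illustrative toy under assumed stand-in models, not the paper's fitted CNN or its reduced models:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 20
W1 = rng.normal(size=(32, dim)) / np.sqrt(dim)   # toy "complex" model weights
w2 = rng.normal(size=32) / np.sqrt(32)
w_lin = rng.normal(size=dim) / np.sqrt(dim)      # toy "simple" model weights

def complex_model(s):      # stand-in for a trained deep network
    return np.tanh(w2 @ np.maximum(W1 @ s, 0.0))

def simple_model(s):       # stand-in for a reduced model of the same region
    return np.tanh(w_lin @ s)

def divergence(s):
    """Squared disagreement between the two models on stimulus s."""
    return (complex_model(s) - simple_model(s)) ** 2

def num_grad(f, s, eps=1e-5):
    """Finite-difference gradient (fine for this tiny toy problem)."""
    g = np.zeros_like(s)
    for i in range(len(s)):
        e = np.zeros_like(s)
        e[i] = eps
        g[i] = (f(s + e) - f(s - e)) / (2 * eps)
    return g

s = 0.1 * rng.normal(size=dim)                   # random initial stimulus
d0 = divergence(s)
for _ in range(200):                             # gradient ascent on disagreement
    s = np.clip(s + 0.05 * num_grad(divergence, s), -1.0, 1.0)
d_final = divergence(s)
```

The resulting stimulus is, by construction, one on which the two models make maximally different predictions, so presenting it experimentally would test which model better captures the biology.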
The validation of such extracted computational mechanisms would require further experimental probes of higher responses with carefully chosen stimuli, perhaps even stimuli chosen to maximize responses in the deep CNN model itself [29, 30]. Overall, the success of this combined deep learning and model reduction approach to scientific inquiry in the retina, which was itself not at all a priori obvious before this work, sets a foundation for future studies to explore this combined approach deeper in the brain.

Acknowledgments

We thank Daniel Fisher for insightful discussions and support. We thank the Masason Foundation (HT), grants from the NEI (R01EY022933, R01EY025087, P30-EY026877) (SAB), and the Simons and James S. McDonnell Foundations and NSF CAREER 1845166 (SG) for funding.

References

[1] Lane McIntosh, Niru Maheswaranathan, Aran Nayebi, Surya Ganguli, and Stephen Baccus. Deep learning models of the retinal response to natural scenes. Advances in Neural Information Processing Systems, pages 1369–1377, 2016.

[2] Niru Maheswaranathan, Lane T. McIntosh, Hidenori Tanaka, Satchel Grant, David B. Kastner, Josh Melander, Luke Brezovec, Aran Nayebi, Julia Wang, Surya Ganguli, and Stephen A. Baccus. The dynamic neural code of the retina for natural scenes. bioRxiv, 2019. doi: 10.1101/340943.

[3] Daniel L. K. Yamins, Ha Hong, Charles F. Cadieu, Ethan A. Solomon, Darren Seibert, and James J. DiCarlo. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proceedings of the National Academy of Sciences, 111(23):8619–8624, 2014. doi: 10.1073/pnas.1403112111.

[4] Seyed-Mahdi Khaligh-Razavi and Nikolaus Kriegeskorte.
Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Computational Biology, 10(11):e1003915, 2014.

[5] Umut Güçlü and Marcel AJ van Gerven. Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. The Journal of Neuroscience, 35(27):10005–10014, 2015.

[6] Santiago A Cadena, George H Denfield, Edgar Y Walker, Leon A Gatys, Andreas S Tolias, Matthias Bethge, and Alexander S Ecker. Deep convolutional models improve predictions of macaque V1 responses to natural images. bioRxiv, page 201764, 2017.

[7] David GT Barrett, Ari S Morcos, and Jakob H Macke. Analyzing biological and artificial neural networks: challenges with opportunities for synergy? Current Opinion in Neurobiology, 55:55–64, 2019.

[8] Joshua I Glaser, Ari S Benjamin, Roozbeh Farhoodi, and Konrad P Kording. The roles of supervised machine learning in systems neuroscience. Progress in Neurobiology, 2019.

[9] Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning, pages 3319–3328, 2017.

[10] Kedar Dhamdhere, Mukund Sundararajan, and Qiqi Yan. How important is a neuron? In International Conference on Learning Representations, 2019.

[11] Tim Gollisch and Markus Meister. Eye smarter than scientists believed: neural computations in circuits of the retina. Neuron, 65(2):150–164, 2010.

[12] Greg Schwartz, Rob Harris, David Shrom, and Michael J Berry II. Detection and prediction of periodic patterns by the retina. Nature Neuroscience, 10(5):552, 2007.

[13] Greg Schwartz and Michael J Berry II. Sophisticated temporal pattern recognition in retinal ganglion cells. Journal of Neurophysiology, 99(4):1787–1798, 2008.

[14] Tim Gollisch and Markus Meister. Rapid neural coding in the retina with relative spike latencies.
Science, 319(5866):1108–1111, 2008.

[15] Greg Schwartz, Sam Taylor, Clark Fisher, Rob Harris, and Michael J Berry II. Synchronized firing among retinal ganglion cells signals motion reversal. Neuron, 55(6):958–969, 2007.

[16] Michael J Berry II, Iman H Brivanlou, Thomas A Jordan, and Markus Meister. Anticipation of moving stimuli by the retina. Nature, 398(6725):334, 1999.

[17] Birgit Werner, Paul B Cook, and Christopher L Passaglia. Complex temporal response patterns with a simple retinal circuit. Journal of Neurophysiology, 100(2):1087–1097, 2008.

[18] Juan Gao, Greg Schwartz, Michael J Berry, and Philip Holmes. An oscillatory circuit underlying the detection of disruptions in temporally-periodic patterns. Network: Computation in Neural Systems, 20(2):106–135, 2009.

[19] Tim Gollisch and Markus Meister. Modeling convergent ON and OFF pathways in the early visual system. Biological Cybernetics, 99(4-5):263–278, 2008.

[20] Eric Y Chen, Janice Chou, Jeongsook Park, Greg Schwartz, and Michael J Berry. The neural circuit mechanisms underlying the retinal response to motion reversal. Journal of Neuroscience, 34(47):15557–15575, 2014.

[21] Eric Y Chen, Olivier Marre, Clark Fisher, Greg Schwartz, Joshua Levy, Rava Azeredo da Silveira, and Michael J Berry. Alert response to motion onset in the retina. Journal of Neuroscience, 33(1):120–132, 2013.

[22] David I Vaney, Benjamin Sivyer, and W Rowland Taylor. Direction selectivity in the retina: symmetry and asymmetry in structure and function. Nature Reviews Neuroscience, 13(3):194, 2012.

[23] Masakazu Konishi. Coding of auditory space. Annual Review of Neuroscience, 26(1):31–55, 2003.

[24] Eve Marder and Dirk Bucher. Understanding circuit dynamics using the stomatogastric nervous system of lobsters and crabs. Annu. Rev.
Physiol., 69:291–316, 2007.

[25] Theodore H Bullock, Michael H Hofmann, Frederick K Nahm, John G New, and James C Prechtl. Event-related potentials in the retina and optic tectum of fish. Journal of Neurophysiology, 64(3):903–914, 1990.

[26] Theodore H Bullock, Sacit Karamürsel, Jerzy Z Achimowicz, Michael C McClune, and Canan Başar-Eroglu. Dynamic properties of human visual evoked and omitted stimulus potentials. Electroencephalography and Clinical Neurophysiology, 91(1):42–53, 1994.

[27] Nikhil Rajiv Deshmukh. Complex computation in the retina. PhD thesis, Princeton University, 2015.

[28] Niru Maheswaranathan, Alex H Williams, Matthew D Golub, Surya Ganguli, and David Sussillo. Universality and individuality in neural dynamics across large populations of recurrent networks. Advances in Neural Information Processing Systems, 2019.

[29] Pouya Bashivan, Kohitij Kar, and James J DiCarlo. Neural population control via deep image synthesis. Science, 364(6439):eaav9436, 2019.

[30] Edgar Y Walker, Fabian H Sinz, Erick Cobos, Taliah Muhammad, Emmanouil Froudarakis, Paul G Fahey, Alexander S Ecker, Jacob Reimer, Xaq Pitkow, and Andreas S Tolias. Inception loops discover what excites neurons most using deep predictive models. Nature Neuroscience, 22(12):2060–2065, 2019.