{"title": "Comparison Against Task Driven Artificial Neural Networks Reveals Functional Properties in Mouse Visual Cortex", "book": "Advances in Neural Information Processing Systems", "page_first": 5764, "page_last": 5774, "abstract": "Partially inspired by features of computation in visual cortex, deep neural networks compute hierarchical representations of their inputs.  While these networks have been highly successful in machine learning, it is still unclear to what extent they can aid our understanding of cortical function.  Several groups have developed metrics that provide a quantitative comparison between representations computed by networks and representations measured in cortex.  At the same time, neuroscience is well into an unprecedented phase of large-scale data collection, as evidenced by projects such as the Allen Brain Observatory.  Despite the magnitude of these efforts, in a given experiment only a fraction of units are recorded, limiting the information available about the cortical representation.  Moreover, only a finite number of stimuli can be shown to an animal over the course of a realistic experiment.  These limitations raise the question of how and whether metrics that compare representations of deep networks are meaningful on these data sets.  Here, we empirically quantify the capabilities and limitations of these metrics due to limited image and neuron sample spaces.  We find that the comparison procedure is robust to different choices of stimuli set and the level of sub-sampling that one might expect in a large scale brain survey with thousands of neurons.  Using these results, we compare the representations measured in the Allen Brain Observatory in response to natural image presentations.  We show that the visual cortical areas are relatively high order representations (in that they map to deeper layers of convolutional neural networks).  Furthermore, we see evidence of a broad, more parallel organization rather than a sequential hierarchy, with the primary area VisP (V1) being lower order relative to the other areas.", "full_text": "Comparison Against Task Driven Arti\ufb01cial Neural\nNetworks Reveals Functional Organization of Mouse\n\nVisual Cortex\n\nJianghong Shi\n\nDepartment of Applied Mathematics\n\nUniversity of Washington\n\nSeattle, WA 98195\njhshi@uw.edu\n\nEric Shea-Brown\n\nDepartment of Applied Mathematics\n\nUniversity of Washington\n\nSeattle, WA 98195\n\netsb@uw.edu\n\nMichael A. Buice\n\nAllen Institute for Brain Science\n\nSeattle, WA 98109\n\nmichaelbu@alleninstitute.org\n\nAbstract\n\nPartially inspired by features of computation in visual cortex, deep neural networks\ncompute hierarchical representations of their inputs. While these networks have\nbeen highly successful in machine learning, it remains unclear to what extent they\ncan aid our understanding of cortical function. Several groups have developed met-\nrics that provide a quantitative comparison between representations computed by\nnetworks and representations measured in cortex. At the same time, neuroscience\nis well into an unprecedented phase of large-scale data collection, as evidenced\nby projects such as the Allen Brain Observatory. Despite the magnitude of these\nefforts, in a given experiment only a fraction of units are recorded, limiting the\ninformation available about the cortical representation. Moreover, only a \ufb01nite\nnumber of stimuli can be shown to an animal over the course of a realistic ex-\nperiment. 
These limitations raise the question of how and whether metrics that compare representations of deep networks are meaningful on these datasets. Here, we empirically quantify the capabilities and limitations of these metrics due to limited image presentations and neuron samples. We find that the comparison procedure is robust to different choices of stimulus set and the level of subsampling that one might expect in a large-scale brain survey with thousands of neurons. Using these results, we compare the representations measured in the Allen Brain Observatory in response to natural image presentations to deep neural networks. We show that the visual cortical areas are relatively high order representations (in that they map to deeper layers of convolutional neural networks). Furthermore, we see evidence of a broad, more parallel organization rather than a sequential hierarchy, with the primary area VISp (V1) being lower order relative to the other areas.

1 Introduction

Deep neural networks, originally inspired in part by observations of function in visual cortex, have been highly successful in machine learning [14, 6, 21], but it is less clear to what extent they can provide insight into cortical function. Using coarse-grained neural activity from fMRI and MEG, it has been shown that comparing against task-driven DNNs provides insights into the functional organization of primate brain areas [7, 3]. At the single-neuron level, it has been shown that deep neural networks with convolutional structure and a hierarchical architecture outperform simpler models in predicting single-neuron responses in the primate visual pathway [2, 24, 12, 23].

To understand the overall structure and function of cortex, we require models that describe both the population representation and single-cell properties. Artificial neural network models such as convolutional networks discard complexity in individual units (compared to real biological neurons) but provide a useful structure for modeling the large-scale organization of cortex, e.g. by describing the progressive development of specific feature responses through successive layers of computation. Conversely, given an artificial network, we can use its patterns of response as a "yardstick" to assess the nature and complexity of representations in real neural networks. Naturally, such an assessment requires a metric for comparing representations and a suitable model for comparison. Here we choose such models from the family of convolutional networks. We aim to assess the complexity and hierarchical structure of a real cortical system relative to a computational hierarchy originally inspired by biological response.

Additionally, we must choose a metric. While there exist metrics in the literature to compare representations between models or networks, even the largest scale neuroscience experiments record from only a fraction of the population of neurons, and limited imaging or recording time implies that one can cover only a very small portion of stimulus space, raising the question of whether metrics that compare representations of deep networks to those of cortical neurons are meaningful. For example, the Allen Brain Observatory, despite being a massive dataset, includes only a small fraction of the neurons in the mouse visual cortex.
Similarly, despite over three hours of imaging per experiment, only 118 unique natural images are shown, due to the inclusion of a diverse array of stimulus types.

In this work, we empirically investigate the limitations imposed on representational comparison metrics by limited presentations of stimuli and limited sampling of the space of units or neurons. Specifically, given a metric M that computes a similarity score between two representations, we choose a fiducial task-trained network X (such as VGG16 [21]) and ask about the robustness of mapping representations to depth in the network X as a measure of feature complexity; we call this depth the X-pseudo-depth for metric M, d^M_X, of the representation. We use two metrics available in the literature, the similarity-of-similarity matrix (SSM) [5] and singular value canonical correlation analysis (SVCCA) [20, 17].

For both metrics, we compute the effect on VGG16-pseudo-depth and similarity score of the size of the image set and the number of units subsampled (as would happen, e.g., when a measurement precludes access to the entire population). We find that although the similarity score degrades when neurons are subsampled, it is well approximated when the number of sampled neurons is on the order of thousands. The pseudo-depth is also reasonably preserved at this level of sampling.

Using these observations, we find that the data from the Allen Brain Observatory meet criteria that allow us to use the model VGG16 as a comparison model to assess functional organization and feature complexity via the similarity score and VGG16-pseudo-depth. We find that all regions of mouse visual cortex have pseudo-depth close to the midpoint of the network, indicating that the representations as a whole are higher-order than the "simple" type of cell responses typically used to describe early visual layers. The primary area VISp (also called V1) has consistently lower VGG16-pseudo-depth than the other areas, while the higher visual areas have no clear ordering, suggesting that mouse visual cortex is organized in a broader, more parallel structure, a finding consistent with anatomical results [25]. VISam and VISrl have such low similarity scores that this may suggest an alternative function, i.e. a network trained on another task may yield more similar features.

2 Methodology

Problem formalization and definitions. Define a "representation matrix" of a system X, R_X ∈ R^{n×m}, to be the set of responses of m units or neurons to n images. Choosing a set of images, we choose a model network and a similarity metric M ∈ {SSM, SVCCA} and compute the VGG16-pseudo-depth as

    d^M_VGG16 = argmax_{i ∈ layers of VGG16} M(R_X, R_VGG16_i).

We use d* as shorthand notation for d^M_VGG16, and compute the corresponding similarity score as s* = M(R_X, R_VGG16_{d*}). Our goal is to investigate the stability of d* and the similarity score s* under subsampling of the number of neurons m, and of both the number of images n and which images are shown, and to use these quantities to study representations across different mouse cortical areas.
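For concreteness, this procedure can be sketched in a few lines of Python. This is a minimal sketch, not the analysis code used here; the helper names (representation_matrix, pseudo_depth, similarity) and the assumption of a precomputed (trials, images, neurons) event array are illustrative.

    import numpy as np

    def representation_matrix(events):
        # events: hypothetical array of shape (trials, images, neurons) of
        # detected responses; averaging over trials yields the
        # (n images) x (m neurons) representation matrix R_X.
        return events.mean(axis=0)

    def pseudo_depth(R_X, vgg_layer_reps, similarity):
        # vgg_layer_reps: list of (images x units) matrices, one per VGG16
        # layer. Returns d* (index of the most similar layer) and s*
        # (the similarity score at that layer).
        scores = [similarity(R_X, R_layer) for R_layer in vgg_layer_reps]
        d_star = int(np.argmax(scores))
        return d_star, scores[d_star]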
We also provide additional results for other model variants in the appendix.

The Allen Brain Observatory data set. The Allen Brain Observatory data set [4] is a large-scale standardized in vivo survey of physiological activity in the mouse visual cortex, featuring representations of visually evoked calcium responses from GCaMP6f-expressing neurons. It includes cortical activity from nearly 60,000 neurons collected from 6 visual areas, 4 cortical layers, and 12 transgenic mouse Cre lines from 243 adult mice, in response to a range of visual stimuli.

In this work, we use the population neural responses to natural image stimuli, which comprise 118 natural images selected from three different databases (the Berkeley Segmentation Dataset [16], the van Hateren Natural Image Dataset [22], and the McGill Calibrated Colour Image Database [18]). The images were presented for 250 ms each, with no inter-image delay or intervening "gray" image. The neural responses we use are events detected from ΔF/F using an L0-regularized deconvolution algorithm, which deconvolves pointwise events assuming a linear calcium response for each event and penalizes the total number of events included in the trace [10, 11]. Full information about the experiment is given in [4].

Representation matrices for mouse visual cortex areas. To construct the representation matrix for a given mouse visual cortex area, we take the trial-averaged mean responses of the neurons in the first 500 ms after the image is shown. We group the activities of neurons from different experiments for the same brain area and construct the representation matrix. Note that for the Allen Brain Observatory dataset, the number of images (118) is much less than the number of observed neurons.

Representation matrices for DNN layers. Unless explicitly stated, the representation matrices for DNN layers are obtained by feeding the same set of 118 images (resized to 64 by 64; see Section 4.3 below) to the DNN and collecting all the activations from a given layer.

Two similarity metrics for comparing representation matrices. We investigate two metrics suitable for comparing representation matrices with n << m, i.e., many fewer images than neurons. One is the similarity of similarity matrices (SSM) [5]. The other is an extension of the recently developed singular value canonical correlation analysis (SVCCA) [20, 17] to the n << m regime.

For the SSM metric, we calculate the Pearson correlation coefficient between every pair of rows in one representation matrix to get an n by n "similarity matrix" in which each entry describes the similarity of the responses to two images. Importantly, this collapses the data along the neuron dimension, so that representations with different numbers of neurons can be compared. To compare two similarity matrices, we flatten the matrices to vectors and compute the Spearman rank correlation of their elements. Like the Pearson correlation coefficient, the rank correlation lies in the range [-1, 1], indicating how similar (close to 1) or dissimilar (close to -1) the two representations are. A minimal sketch of this metric is given below.
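The following Python sketch follows the SSM description above literally; it is illustrative rather than the implementation used here (one common variant compares only the off-diagonal entries of the similarity matrices).

    import numpy as np
    from scipy.stats import spearmanr

    def ssm_similarity(R_A, R_B):
        # R_A, R_B: (n images) x (units) representation matrices; the
        # number of units may differ between the two systems.
        S_A = np.corrcoef(R_A)   # n x n Pearson correlations across images
        S_B = np.corrcoef(R_B)
        # Flatten both similarity matrices and compare with the Spearman
        # rank correlation, as described above.
        rho, _ = spearmanr(S_A.ravel(), S_B.ravel())
        return rho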
For SVCCA, following the established approaches [20], we first run singular value decomposition (SVD) to reduce the neuron dimension to a fixed number r that is smaller than the dimension of both representations; we fix r = 40, keeping the 40 most important (largest variance) dimensions of each representation. We then perform a canonical correlation analysis (CCA) on the reduced representation matrices. CCA compares two representation matrices by sequentially finding orthogonal directions along which the two representations are most correlated. We can then read out the strength of similarity from the values of the corresponding correlation coefficients. We take the mean of the r correlation coefficients resulting from CCA as the SVCCA similarity value.

Note that SVCCA is invariant to invertible linear transformations of the representations. SSM is invariant to transformations of the representations that induce monotonic transformations of the elements of the similarity matrices. An excellent review of similarity metrics and their properties can be found in [13]. A sketch of the SVCCA computation follows.
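As with SSM above, this is a minimal sketch under stated assumptions, not the reference implementation: it assumes r is at most the number of images, and it obtains the canonical correlations via a QR-plus-SVD route, which is one standard numerical choice rather than something specified by our procedure.

    import numpy as np

    def svcca_similarity(R_A, R_B, r=40):
        # Reduce each (n images) x (units) matrix to its top-r singular
        # directions, then average the r canonical correlations between
        # the reduced matrices. Assumes r <= n.
        def top_r(R):
            R = R - R.mean(axis=0)                  # center each unit
            U, s, _ = np.linalg.svd(R, full_matrices=False)
            return U[:, :r] * s[:r]                 # n x r reduced matrix
        A, B = top_r(R_A), top_r(R_B)
        # CCA via orthonormal bases: the canonical correlations are the
        # singular values of Q_A^T Q_B.
        Q_A, _ = np.linalg.qr(A)
        Q_B, _ = np.linalg.qr(B)
        rho = np.linalg.svd(Q_A.T @ Q_B, compute_uv=False)
        return float(rho.mean())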
3 Robustness of estimates of similarity score and pseudo-depth to subsampling of images and neural units

In this section, we study the robustness of VGG16-pseudo-depth and similarity score estimates in the face of limited stimuli and limited access to neurons in the representation of interest. Recall that we have full access to all neurons in the pretrained VGG network [21] that we are using as a "yardstick." We begin with the simplest possible setting: using this yardstick to measure the VGG16-pseudo-depth and similarity score of another copy of VGG16, but for which we observe only a random subsample of units (neurons).

We will show that (1) the similarity scores are robust to including only the 118 images in the Allen Brain Observatory data set, as well as to the specific images within this set, and (2) the similarity scores decrease with neuron subsampling, whereas the pseudo-depth stays constant given enough neurons.

Figure 1: Testing the self-consistency of d* by varying the number of images included in the dataset. Shown are SSM (top) and SVCCA (bottom) d* computed for several layers of VGG16 (1, 7, 10, 15 from left to right) using different numbers of stimuli from tiny ImageNet. The shaded areas denote the standard deviation computed from different randomly chosen sets of images. The shaded circles denote the layers indistinguishable from d* (highlighted).

3.1 VGG-pseudo-depth and similarity scores can be estimated stably with limited image sets

The Allen Brain Observatory dataset includes neural responses to 118 natural image stimuli. We first study how the number of stimuli influences estimates of VGG16-pseudo-depth and similarity score, and how much variation arises when we present different sets of images.

For this, we randomly select different numbers of images from tiny ImageNet and calculate the similarity values between VGG16 model layers. The results for four representative layers are shown in Figure 1. We see that the VGG16-pseudo-depth identifies the corresponding layer chosen for comparison, and the similarity score is always one for the corresponding layer across different numbers of randomly chosen images. In addition, the variance introduced by the random choice of images is small for sets of around 120 images. Thus the metrics are robust to different choices of stimulus set, presumably including the image set used in the Allen Brain Observatory.

Note that the sharpness of the peak of the similarity curve reflects how well the metric can differentiate one layer from another. We see that for SSM, layers do not become further differentiated when more than 120 images are shown (approximately the number presented in the biological data set), while SVCCA can still differentiate the layers better, with more peaked similarity curves, as more images are added to the data set.

3.2 VGG-pseudo-depth and similarity scores can be estimated stably with sufficient subsampling of neuronal populations

In biological experimental settings, we only observe a small portion of the neurons in a brain area. Here, we investigate how this affects our ability to reliably use the VGG network to estimate pseudo-depth and similarity scores. Recalling that the network we use as a yardstick can be completely observed, we take a subsampled population from a given layer in VGG16 and compare it to the whole population of each VGG16 layer. The results for four representative layers are given in Figure 2.

Figure 2: Testing the self-consistency of d* by varying the number of units subsampled. Shown for SSM (top) and SVCCA (bottom) is d* computed for several layers of VGG16 (1, 7, 10, 15 from left to right). The shaded areas denote the standard deviation computed from different random draws of subsamples.

This shows that the similarity scores are severely reduced by subsampling. As we increase the number of neurons, the similarity curves rise, reaching values with 2000 neurons that are close to those with no subsampling. Thus, at least for comparing the VGG model with a partially observed version of itself, a rule of thumb is that if the sampled population includes at least 2000 neurons, the similarity score is a good approximation to the one that would be found by observing the whole population.

The relative order of similarity values across layers is consistent over a wider range of numbers of sampled neurons. Even with fewer than 2000 neurons sampled, say 1000, we can already identify which layers are more similar to the population of interest. Thus, the corresponding rule of thumb for VGG16-pseudo-depth is that around 1000 neurons must be sampled for it to be consistently estimated.

3.3 Robustness of similarity score and pseudo-depth extends to a different network

To see whether the approaches above remain robust when comparing representations from a different network against representations generated by VGG16, we choose neurons from 4 layers of VGG19 and compare them with entire layers of VGG16. The results are given in Figure 3. We see that the curves with 2000 neurons are very close to the ones with 8000 neurons, suggesting that this remains an adequate level of sampling when comparing between these two networks. Moreover, our metrics show that early layers in VGG19 are more similar to early layers in VGG16, and later layers in VGG19 are more similar to later layers in VGG16, as we would expect intuitively, reflecting the functional hierarchy of the four VGG19 layers as estimated by VGG16-pseudo-depth from 2000 neurons.

Figure 3: d* computed on the layers of VGG19. d* is relatively consistent across large numbers of subsampled units. Shown for SSM (top) and SVCCA (bottom) is d* for the layers of VGG19 with different numbers of subsampled units (left to right: 100, 2000, 8000).
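The subsampling experiments of Sections 3.2 and 3.3 amount to the following loop, sketched here with illustrative names; either ssm_similarity or svcca_similarity from the sketches above could be passed in as the metric.

    import numpy as np

    def subsampled_similarity_curve(R_source, vgg_layer_reps, n_units,
                                    similarity, n_repeats=10, seed=0):
        # Draw random subsets of n_units columns (neurons) from the source
        # representation, compare each subset against every full VGG16
        # layer, and summarize the similarity curve across repeats.
        rng = np.random.default_rng(seed)
        curves = []
        for _ in range(n_repeats):
            idx = rng.choice(R_source.shape[1], size=n_units, replace=False)
            curves.append([similarity(R_source[:, idx], R_layer)
                           for R_layer in vgg_layer_reps])
        curves = np.asarray(curves)          # n_repeats x n_layers
        return curves.mean(axis=0), curves.std(axis=0)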
4 VGG16-pseudo-depth and similarity scores for mouse cortex and interpretations for the visual hierarchy

In this section, we compare mouse visual cortex representations against VGG16 and discuss the resulting insights for the mouse visual hierarchy. In the Allen Brain Observatory data set, each neuron belongs to a specific visual area (VISp, VISl, VISal, VISpm, VISam, VISrl) and cortical layer (layer23, layer4, layer5, layer6), and has a specific cell type (Cre line). By grouping neurons by area, cortical layer, or cell type, we can study the functional properties of specific neuron groups. In the following, we separately compare VGG16 with entire cortical areas (Figure 4), distinct cortical layers in the same area (Figure 5), and distinct cell types in the same area (Figure 6).

Figure 4: d* computed for representations from the Allen Brain Observatory shows a relatively broad, parallel structure rather than a strict hierarchy, although VISp is of lower d* than the other areas. Shown for SSM (top) and SVCCA (bottom) is d* for the Allen Brain Observatory. The dashed gray curve compares the whole population to VGG16 with responses shuffled. The shaded areas denote the standard deviation computed from different random draws of subsamples. The shaded circles denote the layers indistinguishable from d* (highlighted).

4.1 Whole brain area comparisons show functional properties of mouse visual cortex areas

To study visual representations within and across whole brain areas, we group all the neurons in the same visual area and compare all six areas in our data set to VGG16. Note that different areas have different total numbers of recorded neurons available. In order to make fair comparisons across areas, each time we compute a similarity curve we sample the same number of neurons with replacement from each area. As always, we compare representations in the subsampled brain area to representations of all neurons in the VGG16 layers that we are using as our yardstick. The results are shown in Figure 4. To give a baseline for these comparisons, we shuffled the rows of the representation matrices and calculated the corresponding similarity curves (dashed gray curves). A sketch of this procedure is given after the list below.

Similarity curves computed using both the SSM and SVCCA metrics show that:

1. The pseudo-depth for the mouse brain areas corresponds to the middle layers of VGG16. This shows that mouse visual cortical representations are higher-order, involving multiple stages of processing.

2. The pseudo-depth of VISp is lower than that of other brain areas, a fact that is partially but not completely attributable to its receptive field size (see Section 4.3 below). Meanwhile, the higher visual areas have no clear ordering. This suggests that, following initial stages of processing in VISp, mouse visual cortex is organized in a broadly parallel structure, as opposed to a hierarchical one.

3. VISam and VISrl have the lowest similarity scores among all brain areas, according to both metrics. Based on our studies in Section 3 (Figure 2), which suggest that subsamples of 2000 neurons are sufficient to approximate similarity scores, this indicates that VISam and VISrl are less similar overall to VGG16 than the other areas. A natural hypothesis is that VISam and VISrl perform a different type of processing, one that demands visual features more distinct from those required to classify the large set of categories used to train VGG16.
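For reference, the whole-area comparison just described (equal-size resampling with replacement, plus the shuffled baseline) might look as follows; this is a sketch with illustrative names, not the analysis code itself.

    import numpy as np

    def area_similarity_curve(R_area, vgg_layer_reps, n_units,
                              similarity, seed=0):
        # Sample the same number of neurons with replacement from the area
        # so that areas with different neuron counts are compared fairly.
        rng = np.random.default_rng(seed)
        idx = rng.choice(R_area.shape[1], size=n_units, replace=True)
        R = R_area[:, idx]
        curve = [similarity(R, R_layer) for R_layer in vgg_layer_reps]
        # Baseline: shuffle the rows (images) of the representation matrix,
        # destroying the image-to-image correspondence with the network.
        R_shuffled = R[rng.permutation(R.shape[0]), :]
        baseline = [similarity(R_shuffled, R_layer)
                    for R_layer in vgg_layer_reps]
        return curve, baseline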
In addition to these principal observations, which are common to both the SSM and SVCCA metrics, we note that the metrics do show some different properties when applied to brain areas VISl and VISpm. Specifically, SSM produces a relatively larger variance in similarity curves across subsamples of VISl and VISpm neurons, and as a consequence a broader range of possible pseudo-depths for these areas. We leave investigating the cause and possible interpretations of such differences to future work. We also performed the comparison with different input image resolutions (Figure 7 in the appendix); our main conclusions remain valid, but increasing the image resolution causes the pseudo-depth to shift deeper, which suggests that the pseudo-depth could be associated with receptive field size. To numerically quantify the effects of trial-to-trial variability, we repeated the calculation of the SSM value as in Figure 4 by bootstrapping across trials (Figure 8 in the appendix). The results show that our main conclusions are robust to trial-to-trial variability.

4.2 Cortical layer and cell-type subpopulations show similar trends but can have higher similarity scores than brain areas taken as a whole

How do the trends for pseudo-depth and similarity scores identified above depend on the fact that we have grouped together neurons across cell type and cortical depth (cortical layer) into "whole" areas? To answer this question, we separate neurons from the same brain area according to their cortical layer (Figure 5) and genetically encoded cell line, a coarse measure of cell type (Figure 6). In producing the resulting similarity scores, we sample 2000 neurons with replacement from each subpopulation of mouse neurons. Note that these subpopulations generally have fewer than 2000 neurons, so that resampling is significant; in Figure 6, we only show results for cell types with more than 900 neurons.

We find that SVCCA reveals the same basic trends in similarity curves when brain areas are divided into subpopulations as for the whole-area comparisons in Section 4.1. The SSM metric produces curves that are suggestive of some possible differences. For example, for the whole-area comparisons, the SSM curve values for VISl and VISpm have lower mean and larger variance compared to those for VISp and VISal. However, when their subpopulations are considered separately, there are some cortical layers (layer23 of VISl and layer5 of VISpm) and cell types (Slc17a7 of VISpm) that have higher SSM similarity scores than their areas as a whole.
This suggests that these cortical layers and cell types may, taken as components of a larger system, represent visual features that are in fact more similar to those extracted by VGG16.

Figure 5: Separate cortical layer comparisons. SSM (top) and SVCCA (bottom) results for comparing different cortical layers in the same area to VGG16.

Figure 6: Separate cell type comparisons. SSM (top) and SVCCA (bottom) results for comparing different cell types in the same area to VGG16. Only cell types with more than 900 neurons are shown.

4.3 Impact of image resolution on VGG pseudo-depth

A natural question about our conclusions regarding pseudo-depth is whether they are an automatic consequence of the image resolution (sometimes referred to as the receptive field size) that occurs at different stages through both the VGG network and the mouse brain; in other words, whether they simply follow from matching the resolution in a given VGG layer with that in a given mouse brain area, rather than matching their complexity.

To address this, we first note that we have chosen our input images to be downsampled to a very limited size (64 by 64) that roughly corresponds to the limited visual acuity of the mouse [19]. Thus we do not believe that our overall finding that the VGG-pseudo-depth of mouse visual brain areas corresponds to the middle layers of VGG is an automatic consequence of needing to look sufficiently deep into the VGG network for receptive field sizes as large as those in the mouse visual system. In the appendix, we further test this by recomputing similarity curves for the VGG network responding to images with both substantially lower (input images resized to 32 by 32) and higher (128 by 128) resolution. We find that input resolution has little effect on SSM pseudo-depth. Moreover, while SVCCA pseudo-depth is impacted by input resolution, pseudo-depths remain in the middle layers of VGG16 even when the input resolution is doubled or halved. Based on this we conclude that our result that the pseudo-depth of mouse visual cortex corresponds to the middle layers of VGG16 is robust to reasonable assumptions about visual resolution. However, conclusions about the relative depth of visual areas could still be impacted by the resolution issue. For example, area VISp is known [4] to have smaller receptive fields than other mouse visual cortex areas. Thus, the fact that SVCCA (but not SSM) pseudo-depths are earlier for VISp than for other areas could be due to resolution effects, rather than the level of complexity of its representations. We note a final possible limitation in interpreting our results: the VGG16 network that we use as a yardstick was pretrained on high resolution visual inputs. It is an interesting and open question whether our findings would be the same for a network retrained with the lower resolution inputs that we use and describe above.
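The resolution manipulation in this section reduces to re-running the DNN representation step at different input sizes. Below is a minimal PyTorch/torchvision sketch under several assumptions: the feature-module indices, the ImageNet normalization constants, and the weights argument (which requires a recent torchvision) are standard conventions rather than values specified by our procedure.

    import torch
    from torchvision import models, transforms

    def vgg16_layer_reps(images, size=64, layer_ids=(1, 7, 10, 15)):
        # images: list of 3-channel PIL images (grayscale stimuli would
        # need replicating across channels). Returns a dict mapping each
        # chosen layer index to an (n images) x (units) activation matrix.
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
        prep = transforms.Compose([
            transforms.ToTensor(),
            transforms.Resize((size, size)),    # e.g. 32, 64, or 128
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225]),
        ])
        x = torch.stack([prep(im) for im in images])
        reps = {}
        with torch.no_grad():
            for i, module in enumerate(vgg.features):
                x = module(x)
                if i in layer_ids:
                    reps[i] = x.flatten(start_dim=1).numpy()
        return reps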
5 Conclusion

Deep artificial neural networks can now produce task behavior that rivals the performance of biological brains in many settings. This opens the door to a fascinating question: what is similar, and what is different, in the way artificial and biological networks solve the underlying tasks [23, 9]? A natural place to start is in comparing the stimulus representations that each produces.

Our first goal was to assess the robustness of this comparison to unavoidable challenges: the set of stimuli, and the number of neurons, that can be probed in biological experiments is necessarily limited. Our empirical results show that pseudo-depth and similarity scores are indeed robust for stimulus sets on the order of hundreds of images and subsampling of neurons on the order of thousands.

Our second goal was to use this comparison to investigate visual representations in the mouse visual cortex, a system of explosively increasing interest in the neuroscience community and for which curated, massive public data sets on visual representations are now available [4]. Functionally, very little is known about the visual areas in mice compared with the primate visual cortex. This said, anatomical studies are developing the inter-area wiring diagram [8], and functional studies have provided evidence of some specialization across areas in terms of spatial and temporal frequency processing (e.g. [1, 15]). Our results with data from the Allen Brain Observatory show that, according to VGG pseudo-depth and similarity scores, mouse visual cortical areas are relatively high order representations in a broad, more parallel organization rather than a sequential hierarchy, with the primary area VISp being lower order relative to the other areas. This is consistent with the relatively flat hierarchy observed in [8]. This approach and finding invite future insights from other artificial network systems, e.g. recurrent networks, and help open doors for analyzing emerging large-scale datasets across species and tasks.

6 Acknowledgements

We thank Tianqi Chen, Saskia de Vries, and Michael Oliver for helpful discussions, and Rich Pang and Gabrielle Gutierrez for comments on the draft. We thank the Allen Institute for Brain Science founder, Paul G. Allen, for his vision, encouragement, and support. We acknowledge the NIH graduate training grant in neural computation and engineering (R90DA033461).

References

[1] Mark L. Andermann, Aaron M. Kerlin, Demetris K. Roumis, Lindsey L. Glickfeld, and R. Clay Reid. Functional specialization of mouse higher visual cortical areas. Neuron, 72(6):1025-1039, 2011.

[2] Charles F. Cadieu, Ha Hong, Daniel L. K. Yamins, Nicolas Pinto, Diego Ardila, Ethan A. Solomon, Najib J. Majaj, and James J. DiCarlo. Deep neural networks rival the representation of primate IT cortex for core visual object recognition. PLOS Computational Biology, 10(12):1-18, 12 2014.

[3] Radoslaw Martin Cichy, Aditya Khosla, Dimitrios Pantazis, Antonio Torralba, and Aude Oliva. Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence.
Scientific Reports, 6:27755, Jun 2016.

[4] Saskia E. J. de Vries, Jerome Lecoq, Michael A. Buice, Peter A. Groblewski, Gabriel K. Ocker, Michael Oliver, David Feng, Nicholas Cain, Peter Ledochowitsch, Daniel Millman, Kate Roll, Marina Garrett, Tom Keenan, Leonard Kuan, Stefan Mihalas, Shawn Olsen, Carol Thompson, Wayne Wakeman, Jack Waters, Derric Williams, Chris Barber, Nathan Berbesque, Brandon Blanchard, Nicholas Bowles, Shiella Caldejon, Linzy Casal, Andrew Cho, Sissy Cross, Chinh Dang, Tim Dolbeare, Melise Edwards, John Galbraith, Nathalie Gaudreault, Fiona Griffin, Perry Hargrave, Robert Howard, Lawrence Huang, Sean Jewell, Nika Keller, Ulf Knoblich, Josh Larkin, Rachael Larsen, Chris Lau, Eric Lee, Felix Lee, Arielle Leon, Lu Li, Fuhui Long, Jennifer Luviano, Kyla Mace, Thuyanh Nguyen, Jed Perkins, Miranda Robertson, Sam Seid, Eric Shea-Brown, Jianghong Shi, Nathan Sjoquist, Cliff Slaughterbeck, David Sullivan, Ryan Valenza, Casey White, Ali Williford, Daniela Witten, Jun Zhuang, Hongkui Zeng, Colin Farrell, Lydia Ng, Amy Bernard, John W. Phillips, R. Clay Reid, and Christof Koch. A large-scale standardized physiological survey reveals functional organization of the mouse visual cortex. bioRxiv, 2018. (To appear in Nature Neuroscience.)

[5] Jörn Diedrichsen and Nikolaus Kriegeskorte. Representational models: A common framework for understanding encoding, pattern-component, and representational-similarity analysis. PLOS Computational Biology, 13(4):1-33, 04 2017.

[6] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.

[7] Umut Güçlü and Marcel A. J. van Gerven. Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. Journal of Neuroscience, 35(27):10005-10014, 2015.

[8] Julie A. Harris, Stefan Mihalas, Karla E. Hirokawa, Jennifer D. Whitesell, Joseph Knox, Amy Bernard, Philip Bohn, Shiella Caldejon, Linzy Casal, Andrew Cho, David Feng, Nathalie Gaudreault, Nile Graddis, Peter A. Groblewski, Alex Henry, Anh Ho, Robert Howard, Leonard Kuan, Jerome Lecoq, Jennifer Luviano, Stephen McConoghy, Marty Mortrud, Maitham Naeemi, Lydia Ng, Seung W. Oh, Benjamin Ouellette, Staci Sorensen, Wayne Wakeman, Quanxin Wang, Ali Williford, John Phillips, Christof Koch, and Hongkui Zeng. The organization of intracortical connections by layer and cell class in the mouse brain. bioRxiv, 2018.

[9] Olivier J. Hénaff, Robbe L. T. Goris, and Eero P. Simoncelli. Perceptual straightening of natural videos. Nature Neuroscience, 2019.

[10] Sean Jewell and Daniela Witten. Exact spike train inference via ℓ0 optimization. Annals of Applied Statistics, 12(4):2457-2482, 12 2018.

[11] Sean W. Jewell, Toby Dylan Hocking, Paul Fearnhead, and Daniela M. Witten. Fast nonconvex deconvolution of calcium imaging data. Biostatistics, 02 2019.

[12] Seyed-Mahdi Khaligh-Razavi and Nikolaus Kriegeskorte. Deep supervised, but not unsupervised, models may explain IT cortical representation. PLOS Computational Biology, 10(11), 11 2014.

[13] Simon Kornblith, Mohammad Norouzi, Honglak Lee, and Geoffrey Hinton. Similarity of neural network representations revisited.
In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 3519-3529, Long Beach, California, USA, 2019.

[14] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1097-1105. Curran Associates, Inc., 2012.

[15] James H. Marshel, Marina E. Garrett, Ian Nauhaus, and Edward M. Callaway. Functional specialization of seven mouse visual cortical areas. Neuron, 72(6):1040-1054, 2011.

[16] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV 2001), volume 2, pages 416-423, 2001.

[17] Ari Morcos, Maithra Raghu, and Samy Bengio. Insights on representational similarity in neural networks with canonical correlation. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 5732-5741. Curran Associates, Inc., 2018.

[18] Adriana Olmos and Frederick A. A. Kingdom. A biologically inspired algorithm for the recovery of shading and reflectance images. Perception, 33(12):1463-1473, 2004.

[19] Glen T. Prusky, Paul W. R. West, and Robert M. Douglas. Behavioral assessment of visual acuity in mice and rats. Vision Research, 40(16):2201-2209, 2000.

[20] Maithra Raghu, Justin Gilmer, Jason Yosinski, and Jascha Sohl-Dickstein. SVCCA: Singular vector canonical correlation analysis for deep learning dynamics and interpretability. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 6076-6085. Curran Associates, Inc., 2017.

[21] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556, 2014.

[22] J. H. van Hateren and A. van der Schaaf. Independent component filters of natural images compared with simple cells in primary visual cortex. Proceedings of the Royal Society of London. Series B: Biological Sciences, 265(1394):359-366, 1998.

[23] Daniel L. K. Yamins and James J. DiCarlo. Using goal-driven deep learning models to understand sensory cortex. Nature Neuroscience, 19:356, Feb 2016.

[24] Daniel L. K. Yamins, Ha Hong, Charles F. Cadieu, Ethan A. Solomon, Darren Seibert, and James J. DiCarlo. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proceedings of the National Academy of Sciences, 111(23):8619-8624, 2014.

[25] Jun Zhuang, Lydia Ng, Derric Williams, Matthew Valley, Yang Li, Marina Garrett, and Jack Waters. An extended retinotopic map of mouse cortex. eLife, 6:e18372, Jan 2017.
", "award": [], "sourceid": 3090, "authors": [{"given_name": "Jianghong", "family_name": "Shi", "institution": "University of Washington"}, {"given_name": "Eric", "family_name": "Shea-Brown", "institution": "University of Washington"}, {"given_name": "Michael", "family_name": "Buice", "institution": "Allen Institute for Brain Science"}]}