{"title": "Interpreting the neural code with Formal Concept Analysis", "book": "Advances in Neural Information Processing Systems", "page_first": 425, "page_last": 432, "abstract": "We propose a novel application of Formal Concept Analysis (FCA) to neural decoding: instead of just trying to figure out which stimulus was presented, we demonstrate how to explore the semantic relationships between the neural representation of large sets of stimuli. FCA provides a way of displaying and interpreting such relationships via concept lattices. We explore the effects of neural code sparsity on the lattice. We then analyze neurophysiological data from high-level visual cortical area STSa, using an exact Bayesian approach to construct the formal context needed by FCA. Prominent features of the resulting concept lattices are discussed, including indications for a product-of-experts code in real neurons.", "full_text": "Interpreting the Neural Code with\n\nFormal Concept Analysis\n\nDominik Endres, Peter F\u00a8oldi\u00b4ak\n\nSchool of Psychology,University of St. Andrews\n\n{dme2,pf2}@st-andrews.ac.uk\n\nKY16 9JP, UK\n\nAbstract\n\nWe propose a novel application of Formal Concept Analysis (FCA) to neural de-\ncoding:\ninstead of just trying to \ufb01gure out which stimulus was presented, we\ndemonstrate how to explore the semantic relationships in the neural representation\nof large sets of stimuli. FCA provides a way of displaying and interpreting such\nrelationships via concept lattices. We explore the effects of neural code sparsity on\nthe lattice. We then analyze neurophysiological data from high-level visual corti-\ncal area STSa, using an exact Bayesian approach to construct the formal context\nneeded by FCA. 
Prominent features of the resulting concept lattices are discussed, including hierarchical face representation and indications for a product-of-experts code in real neurons.\n\n1 Introduction\n\nMammalian brains consist of billions of neurons, each capable of independent electrical activity. From an information-theoretic perspective, the patterns of activation of these neurons can be understood as the codewords comprising the neural code. The neural code describes which pattern of activity corresponds to what information item. We are interested in the (high-level) visual system, where such items may indicate the presence of a stimulus object or the value of some stimulus attribute, assuming that each time this item is represented the neural activity pattern will be the same or at least similar. Neural decoding is the attempt to reconstruct the stimulus from the observed pattern of activation in a given population of neurons [1, 2, 3, 4]. Popular decoding quality measures, such as Fisher\u2019s linear discriminant [5] or mutual information [6], capture how accurately a stimulus can be determined from a neural activity pattern (e.g., [4]). While these measures are certainly useful, they tell us little about the structure of the neural code, which is what we are concerned with here. Furthermore, we would also like to elucidate how this structure relates to the represented information items, i.e. we are interested in the semantic aspects of the neural code.\nTo explore the relationship between the representations of related items, F\u00f6ldi\u00e1k [7] demonstrated that a sparse neural code can be interpreted as a graph (a kind of \u201csemantic net\u201d). In this interpretation, the neural responses are assumed to be binary (active/inactive). Each codeword can then be represented as a set of active units (a subset of all units). 
The codewords can now be partially ordered under set inclusion: codeword A \u2264 codeword B iff the set of active neurons of A is a subset of the active neurons of B. This ordering relation is capable of capturing semantic relationships between the represented information items. There is a duality between the information items and the sets representing them: a more general class corresponds to a smaller subset of active neurons, and more specific items are represented by larger sets [7]. Additionally, storing codewords as sets is especially efficient for sparse codes. The resulting graphs (lattices) are an interesting representation of the relationships implicit in the code.\nWe would also like to be able to represent how the relationship between sets of active neurons translates into the corresponding relationship between the encoded stimuli. These observations can be formalized by the well-developed branch of mathematical order theory called Formal Concept Analysis (FCA) [8, 9]. In FCA, data from a binary relation (or formal context) is represented as a concept lattice. Each concept has a set of formal objects as an extent and a set of formal attributes as an intent. In our application, the stimuli are the formal objects, and the neurons are the formal attributes. The FCA approach exploits the duality of extensional and intensional descriptions and allows one to explore the data visually in lattice diagrams. FCA has been shown to be useful for data exploration and knowledge discovery in numerous applications in a variety of fields [10, 11].\nWe give a short introduction to FCA in section 2 and demonstrate how the sparseness (or denseness) of the neural code affects the structure of the concept lattice in section 3. Section 4 describes the generative classifier model which we use to build the formal context from the responses of neurons in the high-level visual cortex of monkeys. 
Finally, we discuss the concept lattices so obtained in section 5.\n\n2 Formal Concept Analysis\n\nCentral to FCA [9] is the notion of the formal context K := (G, M, I), which is comprised of a set of formal objects G, a set of formal attributes M and a binary relation I \u2286 G \u00d7 M between members of G and M. In our application, the members of G are visual stimuli, whereas the members of M are the neurons. If neuron m \u2208 M responds when stimulus g \u2208 G is presented, then we write (g, m) \u2208 I or gIm. It is customary to represent the context as a cross table, where the row (column) headings are the object (attribute) names. For each pair (g, m) \u2208 I, the corresponding cell in the cross table has an \u201cx\u201d. Table 1, left, shows a simple example context.\n\n            n1  n2  n3\nmonkeyFace   x   x\nmonkeyHand   x\nhumanFace        x\nspider               x\n\nconcept  extent (stimuli)        intent (neurons)\n0        ALL                     NONE\n1        spider                  n3\n2        humanFace monkeyFace    n2\n3        monkeyFace monkeyHand   n1\n4        monkeyFace              n1 n2\n5        NONE                    ALL\n\nTable 1: Left: a simple example context, represented as a cross-table. The objects (rows) are 4 visual stimuli, the attributes (columns) are 3 (hypothetical) neurons n1, n2, n3. An \u201cx\u201d in a cell indicates that a stimulus elicited a response from the corresponding neuron. Right: the concepts of this context. Concepts are lectically ordered [9]. Colors correspond to fig.1.\nDefine the prime operator for subsets A \u2286 G as A\u2032 = {m \u2208 M | \u2200g \u2208 A : gIm}, i.e. A\u2032 is the set of all attributes shared by the objects in A. Likewise, for B \u2286 M define B\u2032 = {g \u2208 G | \u2200m \u2208 B : gIm}, i.e. B\u2032 is the set of all objects having all attributes in B.\nDefinition 2.1 [9] A formal concept of the context K is a pair (A, B) with A \u2286 G, B \u2286 M such that A\u2032 = B and B\u2032 = A. 
A is called the extent and B is the intent of the concept (A, B). IB(K) denotes the set of all concepts of the context K.\n\nIn other words, given the relation I, (A, B) is a concept if A determines B and vice versa. A and B are sometimes called closed subsets of G and M with respect to I. Table 1, right, lists all concepts of the context in table 1, left. One can visualize the defining property of a concept as follows: if (A, B) is a concept, reorder the rows and columns of the cross table such that all objects in A are in adjacent rows, and all attributes in B are in adjacent columns. The cells corresponding to all g \u2208 A and m \u2208 B then form a rectangular block of \u201cx\u201ds with no empty spaces in between. In the example above, this can be seen (without reordering rows and columns) for concepts 1, 3 and 4. For a graphical representation of the relationships between concepts, one defines an order on IB(K):\n\nDefinition 2.2 [9] If (A1, B1) and (A2, B2) are concepts of a context, (A1, B1) is a subconcept of (A2, B2) if A1 \u2286 A2 (which is equivalent to B1 \u2287 B2). In this case, (A2, B2) is a superconcept of (A1, B1) and we write (A1, B1) \u2264 (A2, B2). The relation \u2264 is called the order of the concepts.\n\nIt can be shown [8, 9] that IB(K) and the concept order form a complete lattice. The concept lattice of the context in table 1, with full and reduced labeling, is shown in fig.1. Full labeling means that a concept node is depicted with its full extent and intent. A concept lattice with reduced labeling shows an object only in the smallest (w.r.t. the concept order of definition 2.2) concept of whose extent the object is a member. This concept is called the object concept, or the concept that introduces the object. Likewise, an attribute is shown only in the largest concept of whose intent the attribute is a member, the attribute concept, which introduces the attribute. 
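As a minimal illustration of Definition 2.1 (our own sketch, not code from the paper), the concepts of the example context in Table 1 can be enumerated directly with the two prime operators; all function and variable names here are our own:

```python
from itertools import chain, combinations

# Formal context of Table 1: objects (stimuli) -> attributes (responding neurons).
I = {
    "monkeyFace": {"n1", "n2"},
    "monkeyHand": {"n1"},
    "humanFace": {"n2"},
    "spider": {"n3"},
}
G = set(I)                      # formal objects
M = set().union(*I.values())    # formal attributes

def prime_objects(A):
    """A' = attributes shared by every object in A (all of M for empty A)."""
    return set.intersection(*(I[g] for g in A)) if A else set(M)

def prime_attrs(B):
    """B' = objects possessing every attribute in B."""
    return {g for g in G if B <= I[g]}

def concepts():
    """All pairs (A, B) with A' = B and B' = A, found by closing
    every subset of attributes (feasible for small M)."""
    subsets = chain.from_iterable(
        combinations(sorted(M), r) for r in range(len(M) + 1))
    found = set()
    for Bs in subsets:
        A = prime_attrs(set(Bs))    # candidate extent
        B = prime_objects(A)        # closure of the attribute set
        found.add((frozenset(A), frozenset(B)))
    return found

for A, B in sorted(concepts(), key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(A), "|", sorted(B))
```

Running this recovers the six concepts listed in Table 1, right; the subconcept order of Definition 2.2 is then simply extent inclusion between the printed pairs.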
The closedness of extents and intents has an important consequence for neuroscientific applications. Adding attributes to M (e.g. responses of additional neurons) will very probably enlarge IB(K). However, the original concepts will be embedded as a substructure in the larger lattice, with their ordering relationships preserved.\n\nFigure 1: Concept lattice computed from the context in table 1. Each node is a concept; arrows represent the superconcept relation, i.e. an arrow from X to Y reads: X is a superconcept of Y. Colors correspond to table 1, right. The number in the leftmost compartment is the concept number; the middle compartment contains the extent, the rightmost compartment the intent. Left: fully labeled concepts, i.e. all members of extents and intents are listed in each concept node. Right: reduced labeling. An object/attribute is only listed in the extent/intent of the smallest/largest concept that contains it. Reduced labeling is very useful for drawing large concept lattices.\n\nThe lattice diagrams make the ordering relationship between the concepts graphically explicit: concept 3 contains all \u201cmonkey-related\u201d stimuli, concept 2 encompasses all \u201cfaces\u201d. They have a common child, concept 4, which is the \u201cmonkeyFace\u201d concept. The \u201cspider\u201d concept (concept 1) is incomparable to any other concept except the top and the bottom of the lattice. Note that these relationships arise as a consequence of the (here hypothetical) response behavior of the neurons. We will show (section 5) that the response patterns of real neurons can lead to similarly interpretable structures.\nFrom a decoding perspective, a fully labeled concept shows those stimuli that have activated at least the set of neurons in the intent. In contrast, the stimuli associated with a concept in reduced labeling will activate the set of neurons in the intent, but no others. 
The fully labeled concepts thus show the stimuli that can be decoded from the activity of the neurons in the intent alone, without knowledge of the firing state of the other neurons. Reduced labels, on the other hand, show those stimuli that elicited a response only from the neurons in the intent.\n\n3 Concept lattices of local, sparse and dense codes\n\nOne feature of a neural code which has attracted considerable interest is its sparseness. In the case of a binary neural code, the sparseness of a codeword is inversely related to the fraction of active neurons. The average sparseness across all codewords is the sparseness of the code [12, 13]. Sparse codes, i.e. codes where this fraction is low, are considered interesting for a variety of reasons: they offer a good compromise between encoding capacity, ease of decoding and robustness [14], they seem to be employed in the mammalian visual processing system [15] and they are well suited to representing the visual environment we live in [15, 16]. It is also possible to define sparseness for graded or even continuous-valued responses (see e.g. [17, 4, 13]). To study what structural effects different levels of sparseness would have on a neural code, we generated random codes, i.e. each of 10 stimuli was associated with randomly drawn responses of 10 neurons, subject to the constraints that the code be perfectly decodable and that the sparseness of each codeword was equal to the sparseness of the code. 
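The random-code generation described above can be sketched in a few lines of Python. This is our own illustrative reconstruction, not the authors' code; requiring all codewords to be distinct is what makes the code perfectly decodable, and fixing the number of active neurons per codeword equates codeword sparseness with code sparseness:

```python
import random

def random_code(n_stimuli=10, n_neurons=10, activity_ratio=0.2, seed=0):
    """Random binary code: every codeword activates exactly
    round(activity_ratio * n_neurons) neurons, and all codewords are
    distinct, so each stimulus can be decoded unambiguously."""
    rng = random.Random(seed)
    k = round(activity_ratio * n_neurons)
    code = set()
    while len(code) < n_stimuli:  # redraw until enough distinct codewords
        code.add(frozenset(rng.sample(range(n_neurons), k)))
    return sorted(code, key=sorted)

for ratio in (0.1, 0.2, 0.5):  # local, sparse and dense codes
    code = random_code(activity_ratio=ratio)
    print(ratio, "->", len(code), "codewords,",
          len(code[0]), "active neurons each")
```

Feeding such a code into the concept computation of section 2 (stimuli as objects, active neurons as attributes) yields lattices like those in fig.2.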
Fig.2 shows the contexts (represented as cross-tables) and the concept lattices of a local code (activity ratio 0.1), a sparse code (activity ratio 0.2) and a dense code (activity ratio 0.5).\n\nFigure 2: Contexts (represented as cross-tables) and concept lattices for a local, sparse and dense random neural code. Each context was built out of the responses of 10 (hypothetical) neurons to 10 stimuli. Each node represents a concept; the left (right) compartment contains the number of introduced stimuli (neurons). In a local code, the response patterns to different stimuli have no overlapping activations, hence the lattice representing this code is an antichain with top and bottom element added. Each concept in the antichain introduces (at least) one stimulus and (at least) one neuron. In contrast, a dense code results in many concepts which introduce neither a stimulus nor a neuron. The lattice of the dense code is also substantially longer than that of the sparse and local codes.\n\nThe most obvious difference between the lattices is the total number of concepts. A dense code, even for a small number of stimuli, will give rise to a large number of concepts, because the neuron sets representing the stimuli are very probably going to have non-empty intersections. These intersections are potentially the intents of concepts which are larger than those concepts that introduce the stimuli. Hence, the latter are found towards the bottom of the lattice. This implies that they have large intents, which is of course a consequence of the density of the code. 
Determining these intents thus requires the observation of a large number of neurons, which is unappealing from a decoding perspective. The local code does not have this drawback, but is hampered by a small encoding capacity (maximal number of concepts with non-empty extents): the concept lattice in fig.2 is the largest one which can be constructed for a local code comprised of 10 binary neurons. Which of the above structures is most appropriate depends on the conceptual structure of the environment to be encoded.\n\n4 Building a formal context from responses of high-level visual neurons\n\nTo explore whether FCA is a suitable tool for interpreting real neural codes, we constructed formal contexts from the responses of high-level visual cortical cells in area STSa (part of the temporal lobe) of monkeys. Characterizing the responses of these cells is a difficult task. They exhibit complex nonlinearities and invariances which make it impossible to apply linear techniques, such as reverse correlation [18, 19, 20]. The concept lattice obtained by FCA might enable us to display and browse these invariances: if the response of a subset of cells indicates the presence of an invariant feature in a stimulus, then all stimuli having this feature should form the extent of a concept whose intent is given by the responding cells, much like the \u201cmonkey\u201d and \u201cface\u201d concepts in the example in section 2.\n\n4.1 Physiological data\n\nThe data were obtained through [21], where the experimental details can be found. Briefly, spike trains were obtained from neurons within the upper and lower banks of the superior temporal sulcus (STSa) via standard extracellular recording techniques [22] from an awake and behaving monkey (Macaca mulatta) performing a fixation task. 
This area contains cells which are responsive to faces. The recorded firing patterns were turned into distinct samples, each of which contained the spikes from \u2212300 ms before to 600 ms after the stimulus onset with a temporal resolution of 1 ms. The stimulus set consisted of 1704 images, containing color and black and white views of human and monkey head and body, animals, fruits, natural outdoor scenes, abstract drawings and cartoons. Stimuli were presented for 55 ms each without inter-stimulus gaps in random sequences. While this rapid serial visual presentation (RSVP) paradigm complicates the task of extracting stimulus-related information from the spiketrains, it has the advantage of allowing for the testing of a large number of stimuli. A given cell was tested on a subset of 600 or 1200 of these stimuli, and each stimulus was presented between 1 and 15 times.\n\n4.2 Bayesian thresholding\n\nBefore we can apply FCA, we need to extract a binary attribute from the raw spiketrains. While FCA can also deal with many-valued attributes, see [23, 9], we will employ binary thresholding as a starting point. Moreover, when time windows are limited (e.g. in the RSVP condition) it is usually impossible to extract more than 1 bit of stimulus identity-related information from a spiketrain per stimulus [24]. We do not suggest that real neurons have a binary activation function. We are merely concerned with finding a maximally informative response binarization, to allow for the construction of meaningful concepts. We do this by Bayesian thresholding, as detailed in appendix A. This procedure also avails us of a null hypothesis H0 = \u201cthe responses contain no information about the stimuli\u201d.\n\n4.3 Cell selection\n\nThe experimental data consisted of recordings from 26 cells. To minimize the risk that the computed neural responses were a result of random fluctuations, we excluded a cell if 1.) 
the posterior probability of H0 exceeded 10\u22126 or 2.) the posterior standard deviations of the counting window parameters were larger than 20 ms, indicating large uncertainties about the response timing. Cells which did not respond above the threshold included all cells excluded by the above criteria (except one). Furthermore, since not all cells were tested on all stimuli, we also had to select pairs of subsets of cells and stimuli such that all cells in a pair were tested on all stimuli. Incidentally, this selection can also be accomplished with FCA, by determining the concepts of a context with gJm = \u201cstimulus g was tested on cell m\u201d and selecting those with a large number of stimuli \u00d7 number of cells. Two of these cell and stimulus subset pairs (\u201cA\u201d, containing 364 stimuli and 13 cells, and \u201cB\u201d, containing 600 stimuli and 12 cells) were selected for further analysis.\n\n5 Results\n\nTo analyze the neural code, the thresholded neural responses were used to build stimulus-by-cell-response contexts. We performed FCA on these with COLIBRICONCEPTS1, created stimulus image montages and plotted the lattices2. The complete concept lattices were too large to display on a page. Graphs of lattices A and B with reduced labeling on the stimuli are included in the supplementary\n\n1see http://code.google.com/p/colibri-concepts/\n2with IMAGEMAGICK, http://www.imagemagick.org and GRAPHVIZ, http://www.graphviz.org\n\nFigure 3: A: a subgraph of lattice A with reduced labeling on the stimuli, i.e. stimuli are only shown in their object concepts. The \u2205 indicates that an extent is the intersection of its superconcepts\u2019 extents, i.e. no new stimuli were introduced by this concept. All cells forming this part of the concept lattice were responsive to faces. B: a subgraph of lattice B, fully labeled. 
The concepts on the right side are not exclusively \u201cface\u201d concepts, but most members of their extents have something \u201croundish\u201d about them.\n\nmaterial (files latticeA neuroFCA.pdf and latticeB neuroFCA.pdf). In these graphs, the top of the frame around each concept image contains the concept number and the list of cells in the intent.\nFig.3, A shows a subgraph from lattice A, which exclusively contained \u201cface\u201d concepts. This subgraph, with full labeling, is also a part of the supplementary material (file faceSubgraphLatticeA neuroFCA.pdf). The top concepts introduce human and cartoon faces, i.e. their extents consist of general \u201cface\u201d images, while their intents are small (3 cells). In contrast, the lower concepts introduce mostly single monkey faces, with the bottom concepts having an intent of 7 cells. We may interpret this as an indication that the neural code has a higher \u201cresolution\u201d for faces of conspecifics than for faces in general, i.e. other monkeys are represented in greater detail in a monkey\u2019s brain than humans or cartoons. This feature can be observed in most lattices we generated.\nFig.3, B shows a subgraph from lattice B with full labeling. The concepts in the left half of the graph are face concepts, whereas the extents of the concepts in the right half also contain a number of non-face stimuli. Most of the latter have something \u201croundish\u201d about them. The bottom concept, being subordinate to both the \u201cround\u201d and the \u201cface\u201d concepts, encompasses stimuli with both characteristics, which points towards a product-of-experts encoding [25]. This example also highlights another advantage of FCA over standard hierarchical analysis techniques, e.g. 
hierarchical clustering: it does not impose a tree structure when the data do not support it (a shortcoming of the analysis in [26]).\nFor preliminary validation, we experimented with stimulus shuffling (i.e. randomly assigning stimuli to the recorded responses) to determine whether the found concepts are indeed meaningful. This procedure leaves the lattice structure intact, but mixes up the extents. A \u2019naive\u2019 observer was then no longer able to label the concepts (as in fig.3, \u2019round\u2019, \u2019face\u2019 or \u2019conspecifics\u2019). Evidence of concept stability was obtained by trying different binarization thresholds: as stated in appendix A, we used a threshold probability of 0.5. This threshold can be raised up to 0.7 without losing any of the conceptual structures described in fig.3, although some of the stimuli migrate upwards in the lattice.\n\n6 Conclusion\n\nWe demonstrated the potential usefulness of FCA for the exploration and interpretation of neural codes. This technique is feasible even for high-level visual codes, where linear decoding methods [19, 20] fail, and it provides qualitative information about the structure of the code which goes beyond stimulus label decoding [4]. Clearly, this application of FCA is still in its infancy. It would be very interesting to repeat the analysis presented here on data obtained from simultaneous multi-cell recordings, to elucidate whether the conceptual structures derived by FCA are used for decoding by real brains. On a larger scale than single neurons, FCA could also be employed to study the relationships in fMRI data [27].\nAcknowledgment D. Endres was supported by MRC fellowship G0501319.\n\nReferences\n[1] A. P. Georgopoulos, A. B. Schwartz, and R. E. Kettner. Neuronal population coding of movement direction. Science, 233(4771):1416\u20131419, 1986.\n\n[2] P F\u00f6ldi\u00e1k. 
The \u2019Ideal Homunculus\u2019: Decoding neural population responses by Bayesian inference. Perception, 22 suppl:43, 1993.\n\n[3] MW Oram, P F\u00f6ldi\u00e1k, DI Perrett, and F Sengpiel. The \u2019Ideal Homunculus\u2019: decoding neural population signals. Trends in Neurosciences, 21:259\u2013265, June 1998.\n\n[4] R. Q. Quiroga, L. Reddy, C. Koch, and I. Fried. Decoding visual inputs from multiple neurons in the human temporal lobe. J Neurophysiol, 98(4):1997\u20132007, 2007.\n\n[5] RO Duda, PE Hart, and DG Stork. Pattern classification. John Wiley & Sons, New York, Chichester, 2001.\n\n[6] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley & Sons, New York, 1991.\n\n[7] P F\u00f6ldi\u00e1k. Sparse neural representation for semantic indexing. In XIII Conference of the European Society of Cognitive Psychology (ESCOP-2003), 2003. http://www.st-andrews.ac.uk/~pf2/escopill2.pdf.\n\n[8] R. Wille. Restructuring lattice theory: an approach based on hierarchies of concepts. In I. Rival, editor, Ordered sets, pages 445\u2013470. Reidel, Dordrecht-Boston, 1982.\n\n[9] Bernhard Ganter and Rudolf Wille. Formal Concept Analysis: Mathematical foundations. Springer, 1999.\n\n[10] B. Ganter, G. Stumme, and R. Wille, editors. Formal Concept Analysis, Foundations and Applications, volume 3626 of Lecture Notes in Computer Science. Springer, 2005.\n\n[11] U. Priss. Formal concept analysis in information science. Annual Review of Information Science and Technology, 40:521\u2013543, 2006.\n\n[12] P F\u00f6ldi\u00e1k. Sparse coding in the primate cortex. In Michael A Arbib, editor, The Handbook of Brain Theory and Neural Networks, pages 1064\u20131068. MIT Press, second edition, 2002.\n\n[13] P F\u00f6ldi\u00e1k and D Endres. Sparse coding. Scholarpedia, 3(1):2984, 2008. http://www.scholarpedia.org/article/Sparse coding.\n\n[14] P F\u00f6ldi\u00e1k. 
Forming sparse representations by local anti-Hebbian learning. Biological Cybernetics, 64:165\u2013170, 1990.\n\n[15] B. A Olshausen, D. J Field, and A Pelah. Sparse coding with an overcomplete basis set: a strategy employed by V1. Vision Res., 37(23):3311\u20133325, 1997.\n\n[16] Eero P Simoncelli and Bruno A Olshausen. Natural image statistics and neural representation. Annual Review of Neuroscience, 24:1193\u20131216, 2001.\n\n[17] ET Rolls and A Treves. The relative advantages of sparse versus distributed encoding for neuronal networks in the brain. Network, 1:407\u2013421, 1990.\n\n[18] P Dayan and LF Abbott. Theoretical Neuroscience. MIT Press, London, Cambridge, 2001.\n\n[19] J.P. Jones and L. A. Palmer. An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. Journal of Neurophysiology, 58(6):1233\u20131258, 1987.\n\n[20] D. L. Ringach. Spatial structure and symmetry of simple-cell receptive fields in macaque primary visual cortex. Journal of Neurophysiology, 88:455\u2013463, 2002.\n\n[21] P F\u00f6ldi\u00e1k, D Xiao, C Keysers, R Edwards, and DI Perrett. Rapid serial visual presentation for the determination of neural selectivity in area STSa. Progress in Brain Research, pages 107\u2013116, 2004.\n\n[22] M. W. Oram and D. I. Perrett. Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68(1):70\u201384, 1992.\n\n[23] R. Wille and F. Lehmann. A triadic approach to formal concept analysis. In G. Ellis, R. Levinson, W. Rich, and J. F. Sowa, editors, Conceptual structures: applications, implementation and theory, pages 32\u201343. Springer, Berlin-Heidelberg-New York, 1995.\n\n[24] D. Endres. Bayesian and Information-Theoretic Tools for Neuroscience. PhD thesis, School of Psychology, University of St. Andrews, U.K., 2006. http://hdl.handle.net/10023/162.\n\n[25] GE Hinton. Products of experts. 
In Ninth International Conference on Artificial Neural Networks ICANN 99, number 470 in ICANN, 1999.\n\n[26] R Kiani, H Esteky, K Mirpour, and K Tanaka. Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. Journal of Neurophysiology, 97(6):4296\u20134309, April 2007.\n\n[27] K. N. Kay, T. Naselaris, R. J. Prenger, and J. L. Gallant. Identifying natural images from human brain activity. Nature, 452:352\u2013355, 2008. http://dx.doi.org/10.1038/nature06713.\n\n[28] D. Endres and P. F\u00f6ldi\u00e1k. Exact Bayesian bin classification: a fast alternative to Bayesian classification and its application to neural response analysis. Journal of Computational Neuroscience, 24(1):24\u201335, 2008. DOI: 10.1007/s10827-007-0039-5.\n\nA Method of Bayesian thresholding\n\nA standard way of obtaining binary responses from neurons is thresholding the spike count within a certain time window. This is a relatively straightforward task if the stimuli are presented well separated in time and many trials per stimulus are available. Then latencies and response offsets are often clearly discernible and thus choosing the time window is not too difficult. However, under RSVP conditions with few trials per stimulus, response separation becomes trickier, as the responses to subsequent stimuli will tend to follow each other without an intermediate return to baseline activity. Moreover, neural responses tend to be rather noisy. We will therefore employ a simplified version of the generative Bayesian Bin classification algorithm (BBCa) [28], which was shown to perform well on RSVP data [24].\nBBCa was designed for the purpose of inferring stimulus labels g from a continuous-valued, scalar measure z of a neural response. The range of z is divided into a number of contiguous bins. 
Within each bin, the observation model for the stimulus labels g is a Bernoulli scheme with a Dirichlet prior over its parameters. It is shown in [28] that one can iterate/integrate over all possible bin boundary configurations efficiently, thus making exact Bayesian inference feasible. We make two simplifications to BBCa: 1) z is discrete, because we are counting spikes, and 2) we use models with only 1 bin boundary in the range of z. The bin membership of a given neural response can then serve as the binary attribute required for FCA, since BBCa weights bin configurations by their classification (i.e. stimulus label decoding) performance. We proceed in a straight Bayesian fashion: since the bin membership is the only variable we are interested in, all other parameters (counting window size and position, class membership probabilities, bin boundaries) are marginalized. This minimizes the risk of spurious results due to \u201ccontrived\u201d information (i.e. choices of parameters) made at some stage of the inference process. Afterwards, the probability that the response belongs to the upper bin is thresholded at a probability of 0.5. BBCa can also be used for model comparison. Running the algorithm with no bin boundaries in the range of z effectively yields the probability of the data given the \u201cnull hypothesis\u201d H0: z does not contain any information about g. We can then compare it against the alternative hypothesis described above (i.e. the information which bin z is in tells us something about g) to determine whether the cell has responded at all.\n", "award": [], "sourceid": 117, "authors": [{"given_name": "Dominik", "family_name": "Endres", "institution": null}, {"given_name": "Peter", "family_name": "Foldiak", "institution": null}]}