{"title": "Optimal Neural Spike Classification", "book": "Neural Information Processing Systems", "page_first": 95, "page_last": 102, "abstract": null, "full_text": "OPTIMAL NEURAL SPIKE CLASSIFICATION \n\nAmir F. Atiya(*) and James M. Bower(**) \n\n(*) Dept. of Electrical Engineering \n\n(**) Division of Biology \n\nCalifornia Institute of Technology \n\nCA 91125 \n\nAbstract \n\nBeing able to record the electrical activities of a number of neurons simultaneously is likely \nto be important in the study of the functional organization of networks of real neurons. Using \none extracellular microelectrode to record from several neurons is one approach to studying \nthe response properties of sets of adjacent and therefore likely related neurons. However, to \ndo this, it is necessary to correctly classify the signals generated by these different neurons. \nThis paper considers this problem of classifying the signals in such an extracellular recording, \nbased upon their shapes, and specifically considers the classification of signals in the case when \nspikes overlap temporally. \n\nIntroduction \n\nHow single neurons in a network of neurons interact when processing information is likely \nto be a fundamental question central to understanding how real neural networks compute. \nIn the mammalian nervous system we know that spatially adjacent neurons are, in general, \nmore likely to interact, as well as receive common inputs. Thus neurobiologists are interested \nin devising techniques that allow adjacent groups of neurons to be sampled simultaneously. \nUnfortunately, the small scale of real neural networks makes inserting one recording electrode \nper cell impractical. Therefore, one is forced to use single electrodes designed to sample neural \nsignals evoked by several cells at once. 
While this approach provides the multi-neuron \nrecordings being sought, it also presents a rather serious waveform classification problem \nbecause the actual temporal sequence of action potentials in each individual neuron must be \ndeciphered. This paper describes a method for classifying the activities of several individual \nneurons recorded simultaneously using a single electrode. \n\n\u00a9 American Institute of Physics 1988 \n\nDescription of the Problem \n\nOver the last two decades considerable attention [1-8] has been devoted to the problem of \nclassification of action potentials in multi-neuron recordings. These action potentials (also \nreferred to as \"spikes\") are the extracellularly recorded signal produced by a single neuron \nwhen it is passing information to other neurons (Fig. 1). Fortunately, spikes recorded from the \nsame cell are more or less similar in shape, while spikes coming from different neurons usually \nhave somewhat different shapes, depending on the neuron type, electrode characteristics, the \ndistance between the electrode and the neuron, and the intervening medium. Fig. 1 illustrates \nsome representative variations in spike shapes. It is our objective to detect and classify different \nspikes based on their shapes. However, relying entirely on the shape of the spikes presents \ndifficulties. For example spikes from different neurons can overlap temporally producing novel \nwaveforms (see Fig. 2 for an example of an overlap). To deal with these overlaps, one has first \nto detect the occurrence of an overlap, and then estimate the constituent spikes. Unfortunately, \nonly a few of the available spike separation algorithms consider these events, even though they \nare potentially very important in understanding neural networks. Those few tend to rely \non heuristic rules and subtractive methods to resolve overlap cases. 
No currently published \nmethod we are aware of attempts to use knowledge of the likelihood of overlap events for \ndetecting them, which is at the basis of the method we will describe. \n\nFig. 1: An example of a multi-neuron recording \n\nFig. 2: An example of a temporal overlap of action potentials (overlapping spikes labeled) \n\nGeneral Approach \n\nThe first step in classifying neural waveforms is obviously to identify the typical spike \nshapes occurring in a particular recording. To do this we have applied a learning algorithm \non the beginning portion of the recording, which in an unsupervised fashion (i.e. without the \nintervention of a human operator) estimates the shapes. After the learning stage we have \nthe classification stage, which is applied on the remaining portion of the recording. A new \nclassification method is proposed, which gives minimum probability of error, even in case of the \noccurrence of overlapping spikes. Both the learning and the classification algorithms require \na preprocessing step to detect the position of the spike candidate in the data record. \n\nDetection: For the first task of detection most researchers use a simple level detecting \nalgorithm, that signals a spike when recorded voltage levels cross a certain voltage threshold. \nHowever, variations in recording position due to natural brain movements during recording \n(e.g. respiration) can cause changes in relative height of the positive to the negative peak. \nThus, a level detector (using either a positive or a negative threshold) can miss some spikes. \nAlternatively, we have chosen to detect an event by sliding a window of fixed length until a \ntime when the peak to peak value within the window exceeds a certain threshold. \n\nLearning: Learning is performed on the beginning portion of the sampled data using \nthe Isodata clustering algorithm [9]. 
The task is to estimate the number of neurons $n$ whose \nspikes are represented in the waveform and learn the different shapes of the spikes of the \nvarious neurons. For that purpose we apply the clustering algorithm choosing only one feature \nfrom the spike, the peak-to-peak value, which we have found to be quite an effective feature. \nNote that using the peak-to-peak value in the learning stage does not necessitate using it for \nclassification (one might need additional or different features, especially for tackling the case \nof spike overlap). \n\nThe Optimal Classification Rule: Once we have identified the number of different events \npresent, the classification stage is concerned with estimating the identities of the spikes in the \nrecording, based on the typical spike shapes obtained in the learning stage. In our classification \nscheme we make use of the information given by the shape of the detected spike as well \nas the firing rates of the different neurons. Although the shape plays in general the most \nimportant role in the classification, the rates become a more significant factor when dealing \nwith overlapping events. This is because in general overlaps are considerably less frequent than \nsingle spikes. The shape information is given by a set of features extracted from the waveform. \nLet $x$ be the feature vector of the detected spike (e.g. the samples of the spike waveform). Let \n$N_1, \\ldots, N_n$ represent the different neurons. The detection algorithm tells us only that at least \none spike occurred in the narrow interval $(t - T_1, t + T_2)$ (call it $I$), where $t$ is the instant of \nthe peak of the detected spike, and $T_1$ and $T_2$ are constants chosen subjectively according to the \nsmallest possible time separation between two consecutive spikes, identifiable as two separate \n(nonoverlapping) spikes. By definition, if more than one spike occurs in the interval $I$, then \nwe have an overlap. 
As a matter of convention, the instant of the occurrence of a spike is \ntaken to be that of the spike peak. For simplicity, we will consider the case of two possibly \noverlapping spikes, though the method can be extended easily to more. The classification rule \nwhich results in minimum probability of error is the one which chooses the neuron (or pair of \nneurons in case of overlap) which has the maximum likelihood. We have therefore to compare \nthe $P_i$'s and the $P_{lj}$'s, defined as \n\n$$P_i = P(N_i \\text{ fired in } I \\mid x, A), \\qquad i = 1, \\ldots, n$$ \n\n$$P_{lj} = P(N_l \\text{ and } N_j \\text{ fired in } I \\mid x, A), \\qquad l, j = 1, \\ldots, n, \\quad j < l$$ \n\n[...] $\\lambda_1, \\ldots, \\lambda_n$ of the different neurons $N_1, \\ldots, N_n$ respectively. Then \nthe probability of a spike coming from neuron $N_i$ in an interval of duration $dt$ is simply $\\lambda_i \\, dt$. \nHence \n\n$$f(B_i(t)) = \\lambda_i \\qquad (3)$$ \n\nIn the second scheme we do not use any previous knowledge except for the total firing rate (of \nall neurons), say $\\alpha$. Then \n\n$$f(B_i(t)) = \\frac{\\alpha}{n} \\qquad (4)$$ \n\nAlthough the second scheme does not use as much of the information about the firing \npattern as the first scheme does, it has the advantage of obtaining and using a more reliable \nestimate of the firing rate, because in general the overall firing rate changes less with time than \nthe individual rates and because the estimate of $\\alpha$ does not depend on previous classification \nresults. However, it is useful mostly when the firing rates of the different neurons do not vary \nmuch; otherwise the first scheme is preferred. \n\nIn real recording situations, sometimes one encounters voltage signals which are very \ndifferent from any of the previously learned typical spike shapes or their pairwise overlaps. \nThis can happen for example due to a falsely detected noise event, a spike from a class not \nencountered in the learning stage, or the overlap of three or more spikes. To cope with \nthese cases we use the reject option. This means that we refuse to classify the detected spike \nbecause of the unlikeliness of the assumed event $A$. 
The reject option is therefore employed \nwhenever $P(A \\mid x)$ is smaller than a certain threshold. We know that \n\n$$P(A \\mid x) = \\frac{f(A, x)}{f(A, x) + f(A^c, x)}$$ \n\nwhere $A^c$ is the complement of the event $A$. The density $f(A^c, x)$ can be approximated as \nuniform (over the possible values of $x$) because a large variety of cases are covered by the event \n$A^c$. It follows that one can just compare $f(A, x)$ to a threshold. Hence the decision strategy \nbecomes finally: Reject if the sum of the likelihood functions is less than a threshold. Otherwise \nchoose the neuron (or pair of neurons) corresponding to the largest likelihood function. Note \nthat the sum of the likelihood functions equals $f(A, x)$ (refer to Appendix). \n\nNow, let us evaluate the integrals in (1). Overlapping spikes are assumed to add linearly. \nSince we intend to handle the overlap case, we have to use a set of features $x_m$ with \nthe following property: given the features of two waveforms, one can compute those of their \noverlap. A good candidate is the set of the samples of the spike (or possibly also just \npart of the samples). The added noise, partly thermal noise from the electrode and partly \ndue to firings from distant neurons, can usually be approximated as white Gaussian. Let the \nvariance be $\\sigma^2$. The integrals in the likelihood functions can be approximated as summations \n(note in fact that we have samples available, not a continuous waveform). Let $y^i$ represent the \ntypical feature vector (template) associated with neuron $N_i$, with the $m$th component being \n$y^i_m$. Then \n\n$$f(x \\mid B_l(k_1), B_j(k_2)) = \\frac{1}{(2\\pi)^{M/2} \\sigma^M} \\exp\\left[ -\\frac{1}{2\\sigma^2} \\sum_{m=1}^{M} \\left( x_m - y^l_{m-k_1} - y^j_{m-k_2} \\right)^2 \\right]$$ \n\nwhere $x_m$ is the $m$th component of $x$, and $M$ is the dimension of $x$. 
This leads to the following \nlikelihood functions \n\n$$L_i = f(B_i(k)) \\sum_{k_1=-M_1}^{M_2} \\exp\\left[ -\\frac{1}{2\\sigma^2} \\sum_{m=1}^{M} \\left( x_m - y^i_{m-k_1} \\right)^2 \\right]$$ \n\n$$L_{lj} = f(B_l(k)) f(B_j(k)) \\sum_{k_1=-M_1}^{M_2} \\sum_{k_2=-M_1}^{M_2} \\exp\\left[ -\\frac{1}{2\\sigma^2} \\sum_{m=1}^{M} \\left( x_m - y^l_{m-k_1} - y^j_{m-k_2} \\right)^2 \\right]$$ \n\nwhere $k$ is the spike instant, and the interval from $-M_1$ to $M_2$ corresponds to the interval $I$ \ndefined at the beginning of the Section. \n\nImplementation \n\nThe techniques we have just described were tested in the following way. For the first \nexperiment we identified two spike classes in a recording from the rat cerebellum. A signal \nis created, composed of a number of spikes from the two classes at random instants, plus \nnoise. To make the situation as realistic as possible, the added noise is taken from idle periods \n(i.e. non-spiking) of a real recording. The reason for using such an artificially generated \nsignal is to be able to know the class identities of the spikes, in order to test our approach \nquantitatively. We implement the detection and classification techniques on the obtained \nsignal, with various values of noise amplitude. In our case the ratio of the peak to peak values \nof the templates turns out to be 1.375. Also, the spike rate of one of the classes is twice that of \nthe other class. Fig. 3a shows the results of applying the first scheme (i.e. using Eq. 3). The \noverall percentage correct classification for all spikes (solid curve) and the percentage correct \nclassification for overlapping spikes (dashed curve) are plotted versus the standard deviation \nof the noise $\\sigma$, normalized with respect to the peak $h$ of the large template. Notice that the \noverall classification accuracy is near 100% for $\\sigma/h$ less than 0.15, which is actually the range \nof noise amplitudes we mostly encountered in our work with real recordings. Observe also \nthe good results for classifying overlapping events. We have applied also the second scheme \n(i.e. using Eq. 
4) and obtained similar results. We wish to mention that the thresholds for \ndetection and for the reject option are set up so as to obtain no more than 3% falsely detected \nspikes. \n\nA similar experiment is performed with three waveforms (three classes), where two of the \nwaveforms are the same as those used in the first experiment. The third is the average of \nthe first two. All the three neurons have the same spike rate (i.e. $\\lambda_1 = \\lambda_2 = \\lambda_3$). Hence \nboth classification schemes are equivalent in this case. Fig. 3b shows the overall as well as \nthe sub-category of overlap classification results. One observes that the results are worse than \nthose for the two-class case. This is because the spacings between the templates are in general \nsmaller. Notice also that the accuracy in resolving overlapping events is now tangibly less \nthan the overall accuracy. However, one can say that the results are acceptable in the range \nof $\\sigma/h$ less than 0.1. The following experiment is also performed using the same data. We \nwould like to investigate the importance of the information given by the (overall) firing rate on \nthe problem of classifying overlapping events. In our method the summation in the likelihood \nfunctions for single spikes is multiplied by $\\alpha/n$, while that for overlapping spikes is multiplied \nby $(\\alpha/n)^2$. Usually $\\alpha/n$ is considerably less than one. Hence we have a factor which gives less \nweight to overlapping events. Now, consider the case of ignoring completely the information \ngiven by the firing rate and relying solely on shape information. We assume that overlapping \nspikes from any two given classes represent a \"new\" class of waveforms and that each of these \noverlap classes has the same rate as that of a single-spike class. In that case we can obtain \nexpressions for the likelihood functions as consisting of just the summations, i.e. free of the rate \nfactor $\\alpha/n$ (refer to Appendix). An experiment is performed using that scheme (on the same \nthree class data). One observes that the method classifies a number of single spikes wrongly \nas overlaps, much more than our original scheme does (see Fig. 3c), especially for the large \nnoise case. On the other hand, the number of overlaps which are classified wrongly as single \nspikes is near zero for both schemes. \n\nFig. 3: a) Overall (solid curve) and overlap (dashed curve) classification accuracy for a two class case. \nb) Overall (solid curve) and overlap (dashed curve) classification accuracy for a three class case. \nc) Percent of incorrect classification of single spikes as overlap; solid curve: scheme utilizing the spike rate; \ndashed curve: scheme not utilizing the spike rate. \n\nFinally, in the last experiment the techniques are implemented on real recordings from the \nrat cerebellum. The recorded signal is band-pass-filtered in the frequency range 300 Hz - 10 \nkHz, then sampled with a rate of 20 kHz. For classification, we take 20 samples per spike as \nfeatures. Fig. 4 shows the results of the proposed method, using the first scheme (Eq. 3). The \nnumber of neurons whose spikes are represented in the waveform is estimated to be four. 
The detection threshold is set up so that spikes which are too small are disregarded, because they \ncome from several neurons far away from the electrode and are hard to distinguish. Notice \nthe overlap of classes 1 and 2, which was detected. We used the second scheme also on the \nsame portion and it gave results similar to those of the first scheme (only one of the spikes is \nclassified differently). Overall, the discrepancies between classifications done by the proposed \nmethod and an experienced human observer were found to be small. \n\nFig. 4: Classification results for a recording from the rat cerebellum \n\nConclusion \n\nMany researchers have considered the problem of spike classification in multi-neuron \nrecordings, but only a few have tackled the case of spike overlap, which could occur frequently, \nparticularly if the group of neurons under study is stimulated. In this work we propose a \nmethod for spike classification, which can also aid in detecting and classifying overlapping \nspikes. By taking into account the statistical properties of the discharges of the neurons sampled, \nthis method minimizes the probability of classification error. The application of the \nmethod to artificial as well as real recordings confirms its effectiveness. \n\nAppendix \n\nConsider first $P_{lj}$. We can write \n\n$$P_{lj} = \\int_{t-T_1}^{t+T_2} \\int_{t-T_1}^{t+T_2} f(B_l(t_1), B_j(t_2) \\mid x, A) \\, dt_1 \\, dt_2.$$ \n\nWe can also obtain \n\n$$P_{lj} = \\int_{t-T_1}^{t+T_2} \\int_{t-T_1}^{t+T_2} \\frac{f(x, A \\mid B_l(t_1), B_j(t_2))}{f(x, A)} f(B_l(t_1), B_j(t_2)) \\, dt_1 \\, dt_2.$$ \n\nNow, consider the two events $B_l(t_1)$ and $B_j(t_2)$. In the absence of any information about their \ndependence, we assume that they are independent. We get \n\n$$f(B_l(t_1), B_j(t_2)) = f(B_l(t_1)) f(B_j(t_2)).$$ \n\nWithin the interval $I$, both $f(B_l(t_1))$ and $f(B_j(t_2))$ hardly vary because the duration of \n$I$ is very small compared to a typical inter-spike interval. 
Therefore we get the following \napproximation: \n\n$$f(B_l(t_1)) \\approx f(B_l(t)), \\qquad f(B_j(t_2)) \\approx f(B_j(t)).$$ \n\nThe expression for $P_{lj}$ becomes \n\n$$P_{lj} \\approx \\frac{f(B_l(t)) f(B_j(t))}{f(x, A)} \\int_{t-T_1}^{t+T_2} \\int_{t-T_1}^{t+T_2} f(x \\mid B_l(t_1), B_j(t_2)) \\, dt_1 \\, dt_2.$$ \n\nNotice that the term $A$ was omitted from the argument of the density inside the integral, \nbecause the occurrence of two spikes at $t_1$ and $t_2 \\in I$ implies the occurrence of $A$. A similar \nderivation for $P_i$ results in \n\n$$P_i \\approx \\frac{f(B_i(t))}{f(x, A)} \\int_{t-T_1}^{t+T_2} f(x \\mid B_i(t_1)) \\, dt_1.$$ \n\nThe term $f(x, A)$ is common to all the $P_{lj}$'s and the $P_i$'s. Hence one can simply compare the \nfollowing likelihood functions: \n\n$$L_i = f(B_i(t)) \\int_{t-T_1}^{t+T_2} f(x \\mid B_i(t_1)) \\, dt_1$$ \n\n$$L_{lj} = f(B_l(t)) f(B_j(t)) \\int_{t-T_1}^{t+T_2} \\int_{t-T_1}^{t+T_2} f(x \\mid B_l(t_1), B_j(t_2)) \\, dt_1 \\, dt_2$$ \n\nAcknowledgement \n\nOur thanks to Dr. Yaser Abu-Mostafa for his assistance with this work. This project was \nsupported by the Caltech Program of Advanced Technology (sponsored by Aerojet, GM, GTE, \nand TRW), and the Joseph Drown Foundation. \n\nReferences \n\n[1] M. Abeles and M. Goldstein, Proc. IEEE, 65, pp. 762-773, 1977. \n[2] G. Dinning and A. Sanderson, IEEE Trans. Bio-Med. Eng., BME-28, pp. 804-812, 1981. \n[3] E. D'Hollander and G. Orban, IEEE Trans. Bio-Med. Eng., BME-26, pp. 279-284, 1979. \n[4] D. Mishelevich, IEEE Trans. Bio-Med. Eng., BME-17, pp. 147-150, 1970. \n[5] V. Prochazka and H. Kornhuber, Electroenceph. clin. Neurophysiol., 32, pp. 91-93, 1973. \n[6] W. Roberts, Biol. Cybernet., 35, pp. 73-80, 1979. \n[7] W. Roberts and D. Hartline, Brain Res., 94, pp. 141-149, 1975. \n[8] E. Schmidt, J. Neurosci. Methods, 12, pp. 95-111, 1984. \n[9] R. Duda and P. Hart, Pattern Classification and Scene Analysis, John Wiley, 1973. \n", "award": [], "sourceid": 24, "authors": [{"given_name": "James", "family_name": "Bower", "institution": null}, {"given_name": "Amir", "family_name": "Atiya", "institution": null}]}