{"title": "Neural Networks for Template Matching: Application to Real-Time Classification of the Action Potentials of Real Neurons", "book": "Neural Information Processing Systems", "page_first": 103, "page_last": 113, "abstract": null, "full_text": "NEURAL NETWORKS FOR TEMPLATE MATCHING: \n\nAPPLICATION TO REAL-TIME CLASSIFICATION \n\nOF THE ACTION POTENTIALS OF REAL NEURONS \n\n103 \n\nYiu-fai Wongt, Jashojiban Banikt and James M. Bower! \n\ntDivision of Engineering and Applied Science \n\n!Division of Biology \n\nCalifornia Institute of Technology \n\nPasadena, CA 91125 \n\nABSTRACT \n\nMuch experimental study of real neural networks relies on the proper classification of \n\nextracellulary sampled neural signals (i .e. action potentials) recorded from the brains of ex(cid:173)\nperimental animals. In most neurophysiology laboratories this classification task is simplified \nby limiting investigations to single, electrically well-isolated neurons recorded one at a time. \nHowever, for those interested in sampling the activities of many single neurons simultaneously, \nwaveform classification becomes a serious concern. In this paper we describe and constrast \nthree approaches to this problem each designed not only to recognize isolated neural events, \nbut also to separately classify temporally overlapping events in real time. First we present two \nformulations of waveform classification using a neural network template matching approach. \nThese two formulations are then compared to a simple template matching implementation. \nAnalysis with real neural signals reveals that simple template matching is a better solution to \nthis problem than either neural network approach. \n\nINTRODUCTION \n\nFor many years, neurobiologists have been studying the nervous system by \n\nusing single electrodes to serially sample the electrical activity of single neu(cid:173)\nrons in the brain. 
However, as physiologists and theorists have become more aware of the complex, nonlinear dynamics of these networks, it has become apparent that serial sampling strategies may not provide all the information necessary to understand functional organization. In addition, it will likely be necessary to develop new techniques which sample the activities of multiple neurons simultaneously [1]. Over the last several years, we have developed two different methods to acquire multineuron data. Our initial design involved the placement of many tiny microelectrodes individually in a tightly packed pseudo-floating configuration within the brain [2]. More recently we have been developing a more sophisticated approach which utilizes recent advances in silicon technology to fabricate multi-ported silicon-based electrodes (Fig. 1). Using these electrodes we expect to be able to readily record the activity patterns of larger numbers of neurons.

As research in multi-single neuron recording techniques continues, it has become very clear that, whatever technique is used to acquire neural signals from many brain locations, the technical difficulties associated with sampling, compressing, storing, analyzing and interpreting these signals largely dwarf those of developing the sampling device itself. In this report we specifically consider the need to assure that neural action potentials (also known as "spikes") on each of many parallel recording channels are correctly classified, which is just one aspect of the problem of post-processing multi-single neuron data.

© American Institute of Physics 1988

With more traditional single electrode/single neuron recordings, this task usually involves passing analog signals through a Schmitt trigger whose output indicates the occurrence of an event to a computer, at the same time as it triggers an oscilloscope sweep of the analog data.
The experimenter visually monitors the oscilloscope to verify the accuracy of the discrimination, since a well-discriminated signal from a single neuron will overlap on successive oscilloscope traces (Fig. 1c). Obviously this approach is impractical when large numbers of channels are recorded at the same time; instead, it is necessary to automate the classification procedure. In this paper we will describe and contrast three approaches we have developed to do this.

[Fig. 1 graphics: (a) probe layout with interconnect traces on upper and lower layers; (b) a recording site of 75 sq. μm; (c) spike amplitudes plotted over 0-4 msec.]

Fig. 1. Silicon probe being developed in our laboratory for multi-single unit recording in cerebellar cortex. a) a complete probe; b) surface view of one recording tip; c) several superimposed neuronal action potentials recorded from such a silicon electrode in cerebellar cortex.

While our principal design objective is the assurance that neural waveforms are adequately discriminated on multiple channels, the overall objective of this research project is to sample from as many single neurons as possible. Therefore, it is a natural extension of our effort to develop a neural waveform classification scheme robust enough to allow us to distinguish activities arising from more than one neuron per recording site. To do this, however, we not only have to determine that a particular signal is neural in origin, but also from which of several possible neurons it arose (see Fig. 2a). While in general signals from different neurons have different waveforms, aiding classification, neurons recorded on the same channel firing simultaneously or nearly simultaneously will produce novel combination waveforms (Fig. 2b) which also need to be classified.
It is this last complication which particularly bedevils previous efforts to classify neural signals (for review see [5]; also see [3], [4]). In summary, then, our objective was to design a circuit that would:

1. distinguish different waveforms even though neuronal discharges tend to be quite similar in shape (Fig. 2a);

2. recognize the same waveform even though unavoidable movements such as animal respiration often produce periodic changes in the amplitude of a recorded signal by moving the brain relative to the tip of the electrode;

3. be considerably robust to recording noise, which variably corrupts all neural recordings (Fig. 2);

4. resolve overlapping waveforms, which are likely to be particularly interesting events from a neurobiological point of view;

5. provide real-time performance, allowing the experimenter to detect problems with discrimination and monitor the progress of the experiment;

6. be implementable in hardware, given the need to classify neural signals on many channels simultaneously. Simply duplicating a software-based algorithm for each channel will not work; rather, multiple, small, independent, programmable hardware devices need to be constructed.

[Fig. 2 graphics: recorded voltage traces with a 50 μV scale bar.]

Fig. 2. a) Schematic diagram of an electrode recording from two neuronal cell bodies. b) An actual multi-neuron recording; note the similarities in the two waveforms and the overlapping event. c) and d) Synthesized data with different noise levels for testing classification algorithms (c: 0.3 NSR; d: 1.1 NSR).

METHODS

The problem of detecting and classifying multiple neural signals on single voltage records involves two steps.
First, the waveforms that are present in a particular signal must be identified and the templates generated; second, these waveforms must be detected and classified in ongoing data records. To accomplish the first step we have modified the principal component analysis procedure described by Abeles and Goldstein [3] to automatically extract templates of the distinct waveforms found in an initial sample of the digitized analog data. This will not be discussed further, as it is the means of accomplishing the second step which concerns us here. Specifically, in this paper we compare three new approaches to ongoing waveform classification which deal explicitly with overlapping spikes and variably meet the other design criteria outlined above. These approaches consist of a modified template matching scheme and two applied neural network implementations. We will first consider the neural network approaches. On a point of nomenclature, to avoid confusion in what follows, the real neurons whose signals we want to classify will be referred to as "neurons," while computing elements in the applied neural networks will be called "Hopons."

Neural Network Approach - Overall, the problem of classifying neural waveforms can best be seen as an optimization problem in the presence of noise. Much recent work on neural-type network algorithms has demonstrated that these networks work quite well on problems of this sort [6-8]. In particular, in a recent paper Hopfield and Tank describe an A/D converter network and suggest how to map the problem of template matching into a similar context [8]. The energy functional for the network they propose has the form:

E = -\frac{1}{2} \sum_i \sum_j T_{ij} V_i V_j - \sum_i V_i I_i    (1)

where T_{ij} = connectivity between Hopon i and Hopon j, V_i = voltage output of Hopon i, I_i = input current to Hopon i, and each Hopon has a sigmoid input-output characteristic V = g(u) = 1/(1 + \exp(-au)).

If the equation of motion is set to be:

du_i/dt = -\partial E/\partial V_i = \sum_j T_{ij} V_j + I_i    (1a)

then we see that dE/dt = -\sum_i (\sum_j T_{ij} V_j + I_i)\, dV_i/dt = -\sum_i (du_i/dt)(dV_i/dt) = -\sum_i g'(u_i)(du_i/dt)^2 \le 0. Hence E will go to a minimum which, in a network constructed as described below, will correspond to a proposed solution to a particular waveform classification problem.

Template Matching using a Hopfield-type Neural Net - We have taken the following approach to template matching using a neural network. For simplicity, we initially restricted the classification problem to one involving two waveforms and accordingly constructed a neural network made up of two groups of Hopons, each concerned with discriminating one or the other waveform. The classification procedure works as follows: first, a Schmitt trigger is used to detect the presence of a voltage on the signal channel above a set threshold. When this threshold is crossed, implying the presence of a possible neural signal, 2 msec of data around the crossing are stored in a buffer (40 samples at 20 kHz). Note that biophysical limitations assure that a single real neuron cannot discharge more than once in this time period, so only one waveform of a particular type can occur in this data sample. Also, action potentials are on the order of 1 msec in duration, so the 2 msec window will include the full signal for single or overlapped waveforms. In the next step (explained later) the data values are correlated and passed into a Hopfield network designed to minimize the mean-square error between the actual data and the linear combination of different delays of the templates.
Each Hopon in the set of Hopons concerned with one waveform represents a particular temporal delay in the occurrence of that waveform in the buffer. To express the network in terms of an energy function formulation: let x(t) = input waveform amplitude in the t-th time bin, S_j(t) = amplitude of the j-th template, and let V_{jk} denote whether S_j(t-k) (the j-th template delayed by k time bins) is present in the input waveform. Then the appropriate energy function is:

E = \frac{1}{2} \sum_t \Big[ x(t) - \sum_j \sum_k V_{jk} S_j(t-k) \Big]^2 - \frac{1}{2} \sum_j \sum_k \Big[ \sum_t S_j^2(t-k) \Big] V_{jk}(V_{jk}-1) + C \sum_j \sum_{k<k'} V_{jk} V_{jk'}    (2)

The first term is designed to minimize the mean-square error and specifies the best match. Since V \in [0,1], the second term is minimized only when each V_{jk} assumes the value 0 or 1; it also sets the diagonal elements of T to 0. The third term creates mutual inhibition among the processing nodes evaluating the same neuronal signal, which as described above can only occur once per sample.

Expanding and simplifying expression (2), the connection matrix is:

T_{jk,j'k'} = -\sum_t S_j(t-k) S_{j'}(t-k') - C\,\delta_{jj'}(1-\delta_{kk'}) \quad \text{for } (j,k) \ne (j',k'), \qquad T_{jk,jk} = 0    (3a)

and the input current is

I_{jk} = \sum_t x(t) S_j(t-k) - \frac{1}{2} \sum_t S_j^2(t-k)    (3b)

As can be seen, the inputs are the correlations between the actual data and the various delays of the templates, less a constant term.

Modified Hopfield Network - As documented in more detail in Figs. 3-4, the full Hopfield-type network above works well for temporally isolated spikes at moderate noise levels, but for overlapping spikes it has a local minima problem, which becomes more severe with more than two waveforms in the network. Further, we need to build our network in hardware, and the full Hopfield network is difficult to implement with current technology (see below). For these reasons, we developed a modified neural network approach which significantly reduces the necessary hardware complexity and also has improved performance. To understand how this works, let us look at the information contained in the quantities T and I (eqs. 3a and 3b) and make some use of them.
These quantities have to be calculated at a pre-processing stage before being loaded into the Hopfield network. If, after calculating them, we can quickly rule out a large number of possible template combinations, then we can significantly reduce the size of the problem and thus use a much smaller (and hence more efficient) neural network to find the optimal solution. To make the derivation simple, we define slightly modified versions of T_{ij} and I_{ij} (eqs. 4a and 4b) for the two-template case:

T'_{ij} = \sum_t S_1(t-i) S_2(t-j)    (4a)

I'_{ij} = \sum_t x(t)\Big[\frac{1}{2} S_1(t-i) + \frac{1}{2} S_2(t-j)\Big] - \frac{1}{2}\sum_t S_1^2(t-i) - \frac{1}{2}\sum_t S_2^2(t-j)    (4b)

In the case of overlapping spikes the T'_{ij} are the cross-correlations between S_1(t) and S_2(t) at different delays, and the I'_{ij} are the cross-correlations between the input x(t) and a weighted combination of S_1(t) and S_2(t). Now if x(t) = S_1(t-i) + S_2(t-j) (i.e. the overlap of the first template with i time bin delay and the second template with j time bin delay), then \Delta_{ij} = |T'_{ij} - I'_{ij}| = 0. In the presence of noise, however, \Delta_{ij} will not be identically zero but will equal the noise, and if \Delta_{ij} > \Delta T'_{ij} (where \Delta T'_{ij} = |T'_{ij} - T'_{i'j'}| for i \ne i' and j \ne j') this simple algorithm may make unacceptable errors. A solution to this problem for overlapping spikes will be described below, but now let us consider the problem of classifying non-overlapping spikes. In this case, we can compare the input cross-correlations with the auto-correlations (eqs. 4c and 4d):

T'_i = \sum_t S_1^2(t-i); \qquad T''_i = \sum_t S_2^2(t-i)    (4c)

I'_i = \sum_t x(t) S_1(t-i); \qquad I''_i = \sum_t x(t) S_2(t-i)    (4d)

So for non-overlapping cases, if x(t) = S_1(t-i), then \Delta'_i = |T'_i - I'_i| = 0; if x(t) = S_2(t-i), then \Delta''_i = |T''_i - I''_i| = 0. In the absence of noise, then, the minimum over \Delta_{ij}, \Delta'_i and \Delta''_i represents the correct classification.
However, in the presence of noise, none of these quantities will be identically zero; each will instead equal the noise in the input x(t), which can give rise to unacceptable errors. Our solution to this noise-related problem is to choose a few minima (three in our case) instead of one. Each minimum corresponds either to a known linear combination of templates, for overlapping cases, or to a single template, for non-overlapping cases. A three-neuron Hopfield-type network is then programmed so that each neuron corresponds to one of these cases. The input x(t) is fed to this tiny network to resolve whatever confusion remains after the first step of "cross-correlation" comparisons. (Note: simple template matching as described below can also be used in place of the tiny Hopfield-type network.)

Simple Template Matching - To evaluate the performance of these neural network approaches, we decided to implement a simple template matching scheme, which we will now describe. However, as documented below, this approach turned out to be the most accurate and to require the least complex hardware of any of the three approaches. The first step is, again, to fill a buffer with data based on the detection of a possible neural signal. Then we calculate the difference between the recorded waveform and all possible combinations of the two previously identified templates. Formally, this consists of calculating the distances between the input x(t) and all the cases generated by all combinations of the two templates:

d_{ij} = \sum_t |x(t) - [S_1(t-i) + S_2(t-j)]|

d'_i = \sum_t |x(t) - S_1(t-i)|; \qquad d''_i = \sum_t |x(t) - S_2(t-i)|

d_{min} = \min(d_{ij},\, d'_i,\, d''_i)

d_{min} gives the best fit of all possible combinations of templates to the actual voltage signal.
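A direct transcription of this scheme enumerates every candidate and keeps the smallest distance. This is our own sketch; the helper names, the zero-padded delay convention, and the toy templates are assumptions for illustration, not the paper's code.

```python
import numpy as np

def shifted(s, k, n):
    """Template s delayed by k time bins, zero-padded to window length n."""
    out = np.zeros(n)
    out[k:k + len(s)] = s[:n - k]
    return out

def simple_template_match(x, s1, s2, n_delays):
    """Compute d_ij, d'_i and d''_i over all delays and return the
    best-fitting hypothesis together with its distance d_min."""
    n = len(x)
    d = {}
    for i in range(n_delays):
        d[('s1', i)] = np.sum(np.abs(x - shifted(s1, i, n)))     # d'_i
        d[('s2', i)] = np.sum(np.abs(x - shifted(s2, i, n)))     # d''_i
        for j in range(n_delays):
            both = shifted(s1, i, n) + shifted(s2, j, n)
            d[('s1+s2', i, j)] = np.sum(np.abs(x - both))        # d_ij
    best = min(d, key=d.get)
    return best, d[best]

# Noise-free overlap: template 1 delayed by 2 bins plus template 2 at delay 0.
s1 = np.array([1.0, 2.0, 1.0])
s2 = np.array([3.0, 1.0])
x = shifted(s1, 2, 12) + shifted(s2, 0, 12)
best, dmin = simple_template_match(x, s1, s2, 4)
print(best, dmin)
```

For two templates and D candidate delays this is O(D^2) distance evaluations per buffered event, which is what makes the scheme attractive for small per-channel hardware; dmin itself doubles as the goodness-of-fit estimate mentioned below.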
TESTING PROCEDURES

To compare the performance of the three approaches, we devised a common set of test data using the following procedures. First, we used the principal component method of Abeles and Goldstein [3] to generate two templates from a digitized analog record of neural activity recorded in the cerebellum of the rat. The two actual spike waveform templates we decided to use had a peak-to-peak ratio of 1.375. From a second set of analog recordings, made at a site in the cerebellum at which no action potential events were evident, we determined the spectral characteristics of the recording noise. These two components derived from real neural recordings were then digitally combined, the objective being to construct realistic records while knowing absolutely the correct solution to the template matching problem for each occurring spike. As shown in Figs. 2c and 2d, data sets corresponding to different noise-to-signal ratios were constructed. We also carried out simulations in which the amplitudes of the templates themselves were varied in the synthesized records, to simulate the waveform changes due to brain movements often seen in real recordings. In addition to the two-waveform test sets, we also constructed three-waveform sets by generating a third template that was the average of the first two. To further quantify the comparisons of the three different approaches described above, we considered non-overlapping and overlapping spikes separately, and two standards for classification were devised. In the first and stricter scheme, to be judged correct, a classification had to reconstruct the precise order and timing of the two waveforms.
In the second and looser scheme, classification was judged correct if the order of the two waveforms was correct but their timing was allowed to vary by ±100 μsec (i.e. ±2 time bins), which for most neurobiological applications is probably sufficient resolution. Figs. 3-4 compare the performance results for the three approaches to waveform classification implemented as digital simulations.

PERFORMANCE COMPARISON

Two templates - non-overlapping waveforms: As shown in Fig. 3a, at low noise-to-signal ratios (NSRs below .2) the three approaches were comparable in performance, each reaching close to 100% accuracy under both criteria. As the ratio was increased, however, the neural network implementations did less and less well with respect to the simple template matching algorithm, with the full Hopfield-type network doing considerably worse than the modified network. In the range of NSR most often found in real data (.2-.4), simple template matching performed considerably better than either of the neural network approaches. It should also be noted that simple template matching gives an estimate of the goodness of fit between the waveform and the closest template, which could be used to identify events that should not be classified (e.g. signals due to noise).

[Fig. 3 graphics: percent correct plotted against noise level (3σ/peak amplitude) in panels (a) and (b), and against degree of overlap in panel (c); light lines show the absolute criterion, heavy lines the less stringent criterion, for simple template matching, the Hopfield network, and the modified Hopfield network.]

Fig. 3. Comparisons of the three approaches detecting two non-overlapping (a) and overlapping (b) waveforms; c) compares the performances of the neural network approaches for different degrees of waveform overlap.

Two templates - overlapping waveforms: Figs. 3b and 3c compare performances when waveforms overlapped. Fig. 3b demonstrates the serious local minima problem encountered in the full neural network, as well as the improved performance of the modified network. Again, overall performance in physiological ranges of noise is clearly best for simple template matching. When the noise level is low, the modified approach is the better of the two neural networks, owing to the reliability of the correlation number, which reflects the resemblance between the input data and the template. When the noise level is high, errors in the correlation numbers may exclude the right combination from the smaller network; in this case its performance is actually a little worse than the larger Hopfield network's. Fig. 3c documents in detail which degrees of overlap produce the most trouble for the neural network approaches at average NSR levels found in real neural data. For the neural networks, the most serious problems arise when the delay between the two waveforms is small enough that the resulting waveform looks like the larger waveform with some perturbation.

Three templates - overlapping and non-overlapping: Fig. 4 shows the comparisons between the full Hopfield network approach and the simple template matching approach. For non-overlapping waveforms, the performance of the two approaches is much more comparable than in the two-waveform case (Fig. 4a), although simple template matching is still the better method. In the overlapping waveform condition, however, the neural network approach fails badly (Fig. 4b and 4c).
For this particular application and implementation, the neural network approach does not scale well.

[Fig. 4 graphics: percent correct plotted against noise level (3σ/peak amplitude, where σ² is the variance of the noise) for the Hopfield network and simple template matching; light lines show the absolute criterion, heavy lines the less stringent criterion.]

Fig. 4. Comparisons of performance for three waveforms. a) nonoverlapping waveforms; b) two waveforms overlapping; c) three waveforms overlapping.

HARDWARE COMPARISONS

As described earlier, an important design requirement for this work was the ability to