{"title": "An Orientation Selective Neural Network for Pattern Identification in Particle Detectors", "book": "Advances in Neural Information Processing Systems", "page_first": 925, "page_last": 931, "abstract": null, "full_text": "An Orientation Selective Neural Network for Pattern Identification in Particle Detectors \n\nHalina Abramowicz, David Horn, Ury Naftaly, Carmit Sahar-Pikielny \n\nSchool of Physics and Astronomy, Tel Aviv University \nTel Aviv 69978, Israel \nhalina@post.tau.ac.il, horn@neuron.tau.ac.il, ury@post.tau.ac.il, carmit@post.tau.ac.il \n\nAbstract \n\nWe present an algorithm for identifying linear patterns on a two-dimensional lattice based on the concept of an orientation selective cell, a concept borrowed from the neurobiology of vision. Constructing a multi-layered neural network with a fixed architecture which implements orientation selectivity, we define output elements corresponding to different orientations, which allow us to make a selection decision. The algorithm takes into account the granularity of the lattice as well as the presence of noise and inefficiencies. The method is applied to a sample of data collected with the ZEUS detector at HERA in order to identify cosmic muons that leave a linear pattern of signals in the segmented calorimeter. A two-dimensional representation of the relevant part of the detector is used. The algorithm performs very well. Given its architecture, this system becomes a good candidate for fast pattern recognition in parallel processing devices. \n\nI Introduction \n\nA typical problem in experiments performed at high energy accelerators aimed at studying novel effects in the field of Elementary Particle Physics is that of preselecting interesting interactions at as early a stage as possible, in order to keep the data volume manageable. 
One class of events that has to be eliminated consists of cosmic muons that pass all trigger conditions. \n\nThe most characteristic feature of cosmic muons is that they leave in the detector a path of signals aligned along a straight line. The efficiency of pattern recognition algorithms depends strongly on the granularity with which such a line is probed, on the level of noise and on the response efficiency of a given detector. Yet the efficiency of a visual scan is fairly independent of those features [1]. This led us to look for a new approach through the application of ideas from the field of vision. \n\nThe main tool that we borrow from the neuronal circuitry of the visual cortex is the orientation selective simple cell [2]. It is incorporated in the hidden layers of a feed-forward neural network, possessing a predefined receptive field with excitatory and inhibitory connections. Using these elements we have developed [3] a method for identifying straight lines of varying slopes and lengths on a grid with limited resolution. This method is then applied to the problem of identifying cosmic muons in accelerator data, and compared with other tools. \n\nBy using a network with a fixed architecture we deviate from conventional approaches of neural networks in particle physics [4]. One advantage of this approach is that the number of free parameters is small, and it can therefore be determined using a small data set. The second advantage is that it opens up the possibility of a relatively simple implementation in hardware. This is an important feature for particle detectors, since high energy physics experiments are expected to produce in the next decade a flux of data that is higher than present analysis methods can cope with. 
\n\nII Description of the Task \n\nIn a two-dimensional representation, the granularity of the rear part of the ZEUS calorimeter [6] can be emulated roughly by a 23 x 23 lattice of 20 x 20 cm^2 squares. While such a representation does not use the full information available in the detector, it is sufficient for our study. In our language each cell of this lattice will be denoted as a pixel. A pixel is activated if the corresponding calorimeter cell is above a threshold level predetermined by the properties of the detector. \n\nFigure 1: Example of patterns corresponding to a cosmic muon (left), a typical accelerator event (middle), and an accelerator event that looks like a muon (right), as seen in a two-dimensional projection. \n\nA cosmic muon, depending on its angle of incidence, activates along its linear path typically from 3 to 25 neighboring pixels anywhere on the 23 x 23 grid. The pattern of signals generated by accelerator events consists on average of 3 to 8 clusters, of typically 4 adjacent activated pixels, separated by empty pixels. The clusters tend to populate the center of the 23 x 23 lattice. Due to the inherent dynamics of the interactions under study, the distribution of clusters is not isotropic. Examples of events, as seen in the two-dimensional projection in the rear part of the ZEUS calorimeter, are shown in figure 1. \n\nThe lattice discretizes the data and distorts it. With conventional noise levels added, the classification of the data into accelerator events and cosmic muon events is difficult to obtain through automatic means. Yet it is the feeling of experimentalists dealing with these problems that any expert can distinguish between the two cases with high efficiency (identifying a muon as such) and purity (not misidentifying an accelerator event). We define our task as developing automatic means of doing the same. 
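The pixelization just described is straightforward to emulate. A minimal sketch follows; the cell-energy distribution and the threshold value are illustrative assumptions, not numbers from the detector:

```python
import numpy as np

# Illustrative stand-in for rear-calorimeter cell energies on the 23 x 23
# grid; the energy distribution and the threshold are assumed values,
# not numbers taken from the detector.
rng = np.random.default_rng(0)
energies = rng.exponential(scale=0.05, size=(23, 23))

THRESHOLD = 0.1   # detector-dependent activation threshold (assumed)

# A pixel is activated (S_i = 1) iff its calorimeter cell is above threshold.
S = (energies > THRESHOLD).astype(int)
print(S.shape)   # (23, 23)
```

The binary map S is the only input the network sees; all analog information below threshold is discarded at this stage.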
\n\nIII The Orientation Selective Neural Network \n\nOur analysis is based on a network of orientation selective neurons (OSNN) that will be described in this chapter. We start out with an input layer of pixels on a two-dimensional grid with discrete labeling i = (x, y) of the neuron (pixel), which can take the values S_i = 1 or 0, depending on whether the pixel is activated or not. \n\nFigure 2: Connectivity patterns for orientation selective cells on the second layer of the OSNN. From left to right are examples of orientations of 0, π/4 and 5π/8. Non-zero weights are defined only within a 5 x 5 grid. The dark pixels have weights of +1, and the grey ones have weights of -1. White pixels have zero weights. \n\nThe input is fed into a second layer that is composed of orientation selective neurons V_i^a at location i with orientation θ_a, where a belongs to a discrete set of 16 labels, i.e. θ_a = aπ/16. The neuron V_i^a is the analog of a simple cell in the visual cortex. Its receptive field consists of an array of dimension 5 x 5 centered at pixel i. Examples of the connectivity, for three different choices of a, are shown in Fig. 2. The weights take the values of 1, 0 and -1. \n\nThe second layer thus consists of 23 x 23 x 16 neurons, each of which may be thought of as one of 16 orientation elements at some (x, y) location of the input layer. Next we employ a modified Winner Take All (WTA) algorithm, selecting the leading orientation a_max(i) for which the largest V_i^a is obtained at the given location i. If we find that several V_i^a at the same location i are close in value to the maximal one, we allow up to five different V_i^a neurons to remain active at this stage of the processing, provided they all lie within a sector of a_max ± 2, i.e. θ_max ± π/8. All other V_i^a are reset to zero. 
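The second-layer connectivity and the modified WTA just described can be sketched as follows. The exact receptive-field layout of Fig. 2 and the closeness criterion of the WTA are not fully specified above, so the kernel construction and the 0.8 closeness fraction below are assumptions:

```python
import numpy as np

N_ORIENT = 16   # orientations theta_a = a*pi/16
RF = 2          # receptive-field half-width (5 x 5)

def orientation_kernel(a):
    # One plausible 5 x 5 orientation-selective field (layout assumed):
    # +1 within half a pixel of the line at angle theta_a through the
    # centre, -1 on the flanking band, 0 elsewhere.
    theta = a * np.pi / N_ORIENT
    k = np.zeros((2 * RF + 1, 2 * RF + 1))
    for dy in range(-RF, RF + 1):
        for dx in range(-RF, RF + 1):
            d = abs(dx * np.sin(theta) - dy * np.cos(theta))
            if d < 0.5:
                k[dy + RF, dx + RF] = 1.0
            elif d < 1.5:
                k[dy + RF, dx + RF] = -1.0
    return k

def second_layer(S):
    # V[a, y, x] is the response of the orientation-a cell centred at (y, x).
    H, W = S.shape
    V = np.zeros((N_ORIENT, H, W))
    padded = np.pad(S.astype(float), RF)
    for a in range(N_ORIENT):
        k = orientation_kernel(a)
        for y in range(H):
            for x in range(W):
                V[a, y, x] = np.sum(padded[y:y + 5, x:x + 5] * k)
    return V

def modified_wta(V, close=0.8, max_keep=5):
    # Per location, keep up to max_keep responses close to the maximum,
    # provided they all lie within a_max +/- 2 (mod 16); if close values
    # occur at non-neighbouring orientations, discard them all. The
    # closeness fraction is an assumed tuning parameter.
    out = np.zeros_like(V)
    _, H, W = V.shape
    for y in range(H):
        for x in range(W):
            v = V[:, y, x]
            vmax = v.max()
            if vmax <= 0:
                continue
            cand = np.where(v >= close * vmax)[0]
            a_max = int(v.argmax())
            dist = np.minimum((cand - a_max) % N_ORIENT,
                              (a_max - cand) % N_ORIENT)
            if np.any(dist > 2):
                continue
            keep = cand[np.argsort(v[cand])[::-1][:max_keep]]
            out[keep, y, x] = v[keep]
    return out

# A horizontal 7-pixel track on the 23 x 23 input grid.
S = np.zeros((23, 23), dtype=int)
S[11, 8:15] = 1
V = modified_wta(second_layer(S))
print(int(V[:, 11, 11].argmax()))   # 0, the horizontal orientation wins
```

The circular distance mod 16 in the WTA mirrors the θ_max ± π/8 sector rule: the labels a = 0 and a = 15 describe nearly parallel lines and count as neighbours.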
If, however, at a given location i we obtain several large values of V_i^a that correspond to non-neighboring orientations, all are discarded. \n\nThe third layer also consists of orientation selective cells. They are constructed with a receptive field of size 7 x 7, and receive inputs from neurons with the same orientation on the second layer. The weights on this layer are defined in a similar fashion to the previous ones, but here negative weights are assigned the value of -3, not -1. For linear patterns, the purpose of this layer is to fill in the holes due to fluctuations in the pixel activation, i.e. to complete the lines of the same orientation on the second layer. As before, we keep up to the five highest values at each location, following the same WTA procedure as on the second layer. \n\nThe fourth layer of the OSNN consists of only 16 components, D^a, each corresponding to one of the discrete orientations θ_a. For each orientation we calculate the convolution of the first and third layers, D^a = Σ_i V_i^a S_i. The elements D^a carry the information about the number of input pixels that contribute to a given orientation θ_a. Cosmic muons are characterized by high values of D^a whereas accelerator events possess low values, as shown in figure 3 below. \n\nThe computational complexity of this algorithm is O(n), where n is the number of pixels, since a constant number of operations is performed on each pixel. There are basically four free parameters in the algorithm. These are the sizes of the receptive fields on the second and third layers and the corresponding activation thresholds. Their values can be tuned for the best performance; however, they are well constrained by the spatial resolution, the noise level in the system and the activation properties of the input pixels. 
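A toy single-orientation sketch of the third-layer gap filling and the fourth-layer readout D^a = Σ_i V_i^a S_i; the -3 inhibitory weight is from the text, while the exact 7 x 7 band layout is an assumption:

```python
import numpy as np

# Toy second-layer map for the horizontal orientation on a 23 x 23 grid:
# an active track with a one-pixel hole at x = 11 (a detector inefficiency).
V2 = np.zeros((23, 23))
V2[11, 8:15] = 1.0
V2[11, 11] = 0.0

# 7 x 7 third-layer field for the horizontal orientation: +1 on the central
# row, -3 on the flanking rows (the -3 value is from the text; the exact
# band layout is assumed).
k3 = np.zeros((7, 7))
k3[3, :] = 1.0
k3[2, :] = k3[4, :] = -3.0

padded = np.pad(V2, 3)
V3 = np.array([[np.sum(padded[y:y + 7, x:x + 7] * k3) for x in range(23)]
               for y in range(23)])

# The hole is filled in: neighbours along the line still drive the cell.
print(V3[11, 11])   # 6.0

# Fourth layer: D^a is the overlap of the third layer with the input,
# D^a = sum_i V_i^a * S_i, here for the single horizontal orientation.
S = (V2 > 0).astype(int)
D_horizontal = np.sum(V3 * S)
print(D_horizontal > 0)   # True
```

Each pixel enters a fixed number of 5 x 5 and 7 x 7 windows, which is what makes the overall cost O(n) in the number of pixels.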
The size of the receptive field determines to a large extent the number of orientations allowed to survive in the modified WTA algorithm. \n\nIV OSNN and a Selection Criterion on the Training Set \n\nThe details of the design of the OSNN and the tuning of its parameters were fixed while training it on a sample of 250 cosmic muons and a similar number of accelerator events. The sample was obtained by preselection with existing algorithms and a visual scan as a cross-check. \n\nFor cosmic muon events the highest value of D^a, D_max, determines the orientation of the straight line. In figure 3 we present the correlation between D_max and the number n_p of activated input pixels for cosmic muon and accelerator events. As expected, one observes a linear correlation between D_max and n_p for the muons while almost no correlation is observed for accelerator events. This allows us to set a selection criterion defined by the separator in this figure. We quantify the quality of our selection by quoting the efficiency of properly identifying a cosmic muon for 100% purity, corresponding to no accelerator event misidentified as a muon. With OSNN-D, which we define according to the separator shown in Fig. 3, we obtain 93.0% efficiency on the training set. \n\nOn the right hand side of Fig. 3 we present results of a conventional method for detecting lines on a grid, the Hough transform [7, 8, 9]. This is based on the analysis of a parameter space describing locations and slopes of straight lines. The cells of this space with the largest occupation number, N_max, are the analogs of our D_max. In the figure we show the correlation of N_max with n_p, which allows us to draw a separator between cosmic muons and accelerator events, leading to an efficiency of 88% for 100% purity. Although this number is not much lower than the 
efficiency of OSNN-D, we note that the difference between the two types of event distributions is not as significant as in OSNN-D. In the test set, to be discussed in the next chapter, we will consider 40,000 accelerator events contaminated by fewer than 100 cosmic muons. Clearly the expected generalization quality of OSNN-D will be higher than that of the Hough transform. It should of course be noted that the OSNN is a multi-layer network, whereas the Hough transform method that we have described is a single-layer operation, i.e. it calculates global characteristics. If one wishes to employ some quasi-local Hough transform one is naturally led back to a network that has to resemble our OSNN. \n\nFigure 3: Left: Correlation between the maximum value of D^a, D_max, and the number n_p of input pixels for cosmic muon (dots) and accelerator events (open circles). The dashed line defines a separator such that all events above it correspond to cosmic muons (100% purity). This selection criterion has 93% efficiency. Right: Using the Hough transform method, we compare the values of the largest accumulation cell N_max with n_p and find that the two types of events have different slopes, thus also allowing the definition of a separator. In this case, the efficiency is 88%. \n\nV Training and Testing of OSNN-S \n\nIf instead of applying a simple cut we employ an auxiliary neural network to search for the best classification of events using the OSNN outputs, we obtain still better results. The auxiliary network has 6 inputs, one hidden layer with 5 nodes and one output unit. The input consists of a set of five consecutive D^a centered around D_max and the total number of activated input pixels, n_p. The cosmic muons are assigned an output value s = 1 and the accelerator events s = 0. 
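A minimal sketch of such an auxiliary network (untrained, with random placeholder weights; the paper determines the weights by back-propagation, and the input values below are made up):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 6 inputs: five consecutive D^a centred on D_max, plus n_p.
# 5 hidden units, 1 output s. Weights are random placeholders here;
# the paper fits them with error back-propagation.
W1 = 0.1 * rng.normal(size=(5, 6))
b1 = np.zeros(5)
W2 = 0.1 * rng.normal(size=(1, 5))
b2 = np.zeros(1)

def forward(x):
    h = sigmoid(W1 @ x + b1)                # hidden layer
    return float(sigmoid(W2 @ h + b2)[0])   # output s in (0, 1)

# Illustrative input (made-up numbers): D^a values around D_max and n_p.
x = np.array([3.0, 8.0, 20.0, 7.0, 2.0, 18.0])
s = forward(x)
print(0.0 < s < 1.0)   # True; training drives muons toward s = 1
```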
The net is trained on our sample with error back-propagation. This results in an improved separation of cosmic muon events from the rest. Whereas in OSNN-D we find a continuum of cosmic muons throughout the range of D_max, here we obtain a clear bimodal distribution, as seen in Figure 4. For s ≥ 0.1 no accelerator events are found and the muons are selected with an efficiency of 94.7%. This selection procedure will be denoted as OSNN-S. \n\nAs a test of our method we apply OSNN-S to a sample of 38,606 data events that passed the standard physics cuts [5]. The distribution of the neural network output s is presented in Figure 4. It looks very different from the one obtained with the training sample. Whereas the former consisted of approximately 500 events distributed equally among accelerator events and cosmic muons, this one contains mostly accelerator events, with a fraction of a percent of muons. This proportion is characteristic of physics samples. The vast majority of accelerator events are found in the first bin, but a long tail extends throughout s. The last bin in s is indeed dominated by cosmic muons. \n\nWe performed a visual scan of all 181 events with s ≥ 0.1 using the full information from the detector. This allowed us to identify the cosmic-muon events represented by shaded areas in figure 4. For s ≥ 0.1 we find 55 cosmic-muon events and 123 accelerator events, 55 of which resemble muons on the rear segment of the calorimeter. The latter, together with the genuine cosmic muons, populate mainly the region of large s values. \n\nWe conclude that our method picked out the cosmic muons from the very large sample of data, in spite of the fact that it relied just on two-dimensional information from the rear part of the detector. 
This fact is, however, responsible for the contamination of the high s region by accelerator events that resemble cosmic muons. Even with all its limitations, our method reduces the problem of rejecting cosmic-muon events down to scanning less than one percent of all the events. We conclude that we have achieved the goal that we set for ourselves, that of replacing a laborious visual scan by a computer algorithm of similar reliability. \n\nFigure 4: Left: Number of events as a function of the output s of the auxiliary neural net, for the training sample. Choosing the separator to be s = 0.1 we obtain an efficiency of 94.7% on our training set. This bimodal distribution holds the promise of better generalization than the OSNN-D method depicted in Figure 3. Muons are represented by shaded areas. Right: Distribution of the auxiliary neural network output s obtained with the OSNN-S selector for the test sample of 38,606 events. The tail of the distribution of accelerator events leads to 123 accelerator events with s > 0.1, including 55 that resemble straight lines on the input layer. 55 genuine cosmic muons were identified in the high s region. \n\nVI Summary \n\nWe have presented an algorithm for identifying linear patterns on a two-dimensional lattice based on the concept of an orientation selective cell, a concept borrowed from the neurobiology of vision. Constructing a multi-layered neural network with a fixed architecture that implements orientation selectivity, we define output elements corresponding to different orientations that allow us to make a selection decision. The algorithm takes into account the granularity of the lattice as well as the presence of noise and inefficiencies. \n\nOur feed-forward network has a fixed set of synaptic weights. 
Hence, although the number of neurons is very high, the complexity of the system, as determined by the number of free parameters, is low. This allows us to train our system on a small data set. We are gratified to see that, nonetheless, it generalizes well and performs excellently on a test sample that is larger by two orders of magnitude. \n\nOne may regard our method as a refinement of the Hough transform, since each of our orientation selective cells acts as a filter of straight lines on a limited grid. The major difference from conventional Hough transforms is that we perform semi-local calculations, and proceed in several stages, reflected by the different layers of our network, before evaluating global parameters. \n\nThe task that we have set for ourselves in the application described here is only one example of the problems of pattern recognition that are encountered in the analysis of particle detectors. Given the large flux of data in these experiments, one is faced with two requirements: correct identification and fast performance. Using a structure like our OSNN for data classification, one can naturally meet the speed requirement through its realization in hardware, taking advantage of the basic features of distributed parallel computation. \n\nAcknowledgements \n\nWe are indebted to the ZEUS Collaboration for allowing us to use the sample of data for this analysis. This work was partly supported by a grant from the Israel Science Foundation. \n\nReferences \n\n[1] ZEUS Collab., The ZEUS Detector, Status Report 1993, DESY 1993; M. Derrick et al., Phys. Lett. B 293 (1992) 465. \n[2] D. H. Hubel and T. N. Wiesel, J. Physiol. 195 (1968) 215. \n[3] H. Abramowicz, D. Horn, U. Naftaly and C. Sahar-Pikielny, Nucl. Instrum. Methods Phys. Res. A 378 (1996) 305. \n[4] B. Denby, Neural Computation 5 (1993) 505. \n[5] ZEUS Collab., M. Derrick et al., Phys. Lett. B 316 (1993) 412; ZEUS Collab., M. Derrick et al., Z. Phys. C 69 (1996) 607. \n[6] ZEUS Calorimeter Group, A. Andresen et al., Nucl. Instrum. Methods Phys. Res. A 309 (1991) 101. \n[7] P. V. Hough, \"Methods and means to recognize complex patterns\", U.S. Patent 3,069,654. \n[8] D. H. Ballard, Pattern Recognition 13 (1981) 111. \n[9] R. O. Duda and P. E. Hart, Commun. ACM 15 (1972) 11. \n", "award": [], "sourceid": 1320, "authors": [{"given_name": "Halina", "family_name": "Abramowicz", "institution": null}, {"given_name": "David", "family_name": "Horn", "institution": null}, {"given_name": "Ury", "family_name": "Naftaly", "institution": null}, {"given_name": "Carmit", "family_name": "Sahar-Pikielny", "institution": null}]}