{"title": "An Analog VLSI Neural Network for Phase-based Machine Vision", "book": "Advances in Neural Information Processing Systems", "page_first": 726, "page_last": 732, "abstract": "", "full_text": "An Analog VLSI Neural Network for Phase-based Machine Vision \n\nBertram E. Shi \n\nDepartment of Electrical and Electronic Engineering \n\nHong Kong University of Science and Technology \n\nClear Water Bay, Kowloon, Hong Kong \n\nKwok Fai Hui \n\nFujitsu Microelectronics Pacific Asia Ltd. \n\nSuite 1015-20, Tower 1 \nGrand Century Place \n\n193 Prince Edward Road West \n\nMongkok, Kowloon, Hong Kong. \n\nAbstract \n\nWe describe the design, fabrication and test results of an analog CMOS \nVLSI neural network prototype chip intended for phase-based machine \nvision algorithms. The chip implements an image filtering operation \nsimilar to Gabor filtering. Because a Gabor filter's output is complex \nvalued, it can be used to define a phase at every pixel in an image. This \nphase can be used in robust algorithms for disparity estimation and binocular \nstereo vergence control in stereo vision and for image motion \nanalysis. The chip reported here takes an input image and generates two \noutputs at every pixel corresponding to the real and imaginary parts of \nthe output. \n\n1 INTRODUCTION \n\nGabor filters are used as preprocessing stages for different tasks in machine vision and \nimage processing. Their use has been partially motivated by findings that two dimensional \nGabor filters can be used to model receptive fields of orientation selective neurons in the \nvisual cortex (Daugman, 1980) and three dimensional spatio-temporal Gabor filters can be \nused to model biological image motion analysis (Adelson, 1985). \nA Gabor filter has a complex valued impulse response which is a complex exponential \nmodulated by a Gaussian function. 
In one dimension, \n\n$$g(x) = \\frac{1}{\\sqrt{2\\pi}\\,\\sigma} e^{-x^2/(2\\sigma^2)} e^{j\\omega_{xo} x} = \\frac{1}{\\sqrt{2\\pi}\\,\\sigma} e^{-x^2/(2\\sigma^2)} \\left( \\cos(\\omega_{xo} x) + j\\sin(\\omega_{xo} x) \\right)$$ \n\nwhere $\\omega_{xo}$ and $\\sigma$ are real constants corresponding to the angular frequency of the complex \nexponential and the standard deviation of the Gaussian. \n\nThe phase of the complex valued filter output at a given pixel is related to the location of \nedges and other features in the input image near that pixel. Because translating the image \ninput results in a phase shift in the Gabor output, several authors have developed \"phase-based\" \napproaches to disparity estimation (Westelius, 1995) and binocular vergence control \n(Theimer, 1994) in stereo vision and image motion analysis (Fleet, 1992). Barron et al.'s \ncomparison (Barron, 1992) of algorithms for optical flow estimation indicates that \nFleet's algorithm is the most accurate among those tested. \nThe remainder of this paper describes the design, fabrication and test results of a prototype \nanalog VLSI continuous time neural network which implements a complex valued filter \nsimilar to the Gabor filter. \n\n2 NETWORK AND CIRCUIT ARCHITECTURE \n\nThe prototype implements a Cellular Neural Network (CNN) architecture for Gabor-type \nimage filtering (Shi, 1996). It consists of an array of neurons, called \"cells,\" each corresponding \nto one pixel in the image to be processed. Each cell has two outputs $v_r(n)$ and \n$v_i(n)$ which evolve over time according to the equation \n\n$$\\frac{d}{dt} \\begin{bmatrix} v_r(n) \\\\ v_i(n) \\end{bmatrix} = \\begin{bmatrix} \\cos\\omega_{xo} & -\\sin\\omega_{xo} \\\\ \\sin\\omega_{xo} & \\cos\\omega_{xo} \\end{bmatrix} \\begin{bmatrix} v_r(n-1) \\\\ v_i(n-1) \\end{bmatrix} - \\begin{bmatrix} 2+\\lambda^2 & 0 \\\\ 0 & 2+\\lambda^2 \\end{bmatrix} \\begin{bmatrix} v_r(n) \\\\ v_i(n) \\end{bmatrix} + \\begin{bmatrix} \\cos\\omega_{xo} & \\sin\\omega_{xo} \\\\ -\\sin\\omega_{xo} & \\cos\\omega_{xo} \\end{bmatrix} \\begin{bmatrix} v_r(n+1) \\\\ v_i(n+1) \\end{bmatrix} + \\begin{bmatrix} \\lambda^2 u(n) \\\\ 0 \\end{bmatrix}$$ \n\nwhere $\\lambda > 0$ and $\\omega_{xo} \\in [0, 2\\pi]$ are real constants and $u(n)$ is the input image. 
The feedback from neighbouring cells' outputs enables information to be spread globally throughout \nthe array. This network has a unique equilibrium point where the outputs correspond to \nthe real and imaginary parts of the result of filtering the image with a complex valued discrete \nspace convolution kernel which can be approximated by \n\n$$g(n) = \\frac{\\lambda}{2} e^{-\\lambda |n|} e^{j\\omega_{xo} n}.$$ \n\nThe Gaussian function of the Gabor filter has been replaced by $(\\lambda/2) e^{-\\lambda |x|}$. The larger $\\lambda$ \nis, the narrower the impulse response and the larger the bandwidth. Figure 1 shows the \nreal (a) and imaginary (b) parts of $g(n)$ for $\\lambda = 0.3$ and $\\omega_{xo} = 0.93$. The dotted lines \nshow the function which modulates the complex exponential. \n\nFigure 1: The Real and Imaginary Parts of the Impulse Response. \n\nIn the circuit implementation of this CNN, each output corresponds to the voltage across a \ncapacitor. We selected the circuit architecture in Figure 2 because it was the least sensitive \nto the effects of random parameter variations among those we considered (Hui, 1996). In \nthe figure, resistor labels denote conductances and trapezoidal blocks represent transconductance \namplifiers labelled by their gains. \n\nFigure 2: Circuit Implementation of One Neuron. \n\nThe circuit implementation also gives a good intuitive understanding of the CNN's operation. \nAssume that the input image is an impulse at pixel $n$. In the circuit, this corresponds \nto setting the current source $\\lambda^2 u(n)$ to $\\lambda^2$ amps and setting the remaining current sources \nto zero. If the gains and conductances were chosen so that $\\lambda = 0.3$ and $\\omega_{xo} = 0.93$, 
then \nthe steady state voltages across the lower capacitors would follow the spatial distribution \nshown in Figure 1(a) where the center peak occurs at cell $n$ and the voltages across the \nupper capacitors would follow the distribution shown in Figure 1(b). To see how this \nwould arise in the circuit, consider the current supplied by the source $\\lambda^2 u(n)$. Part of the \ncurrent flows through the conductance $G_0$ pushing the voltage $v_r(n)$ positive. As this \nvoltage increases, the two resistors with conductance $G_1$ cause a smoothing effect which \npulls the voltages $v_r(n-1)$ and $v_r(n+1)$ up towards $v_r(n)$. Current also flows through \nthe diagonal resistor with conductance $G_2$ pulling $v_i(n+1)$ positive as well. At the same \ntime, the transconductance amplifier with input $v_r(n)$ draws current from node $v_i(n-1)$ \npushing $v_i(n-1)$ negative. The larger $G_2$, the more the voltages at nodes $v_i(n-1)$ and \n$v_i(n+1)$ are pushed negative and positive. On the other hand, the larger $G_1$, the greater \nthe smoothing between nodes. Thus, the larger the ratio \n\n$$\\frac{G_2}{G_1} = \\frac{\\sin\\omega_{xo}}{\\cos\\omega_{xo}} = \\tan\\omega_{xo},$$ \n\nthe higher the spatial frequency $\\omega_{xo}$ at which the impulse response oscillates. \n\n3 DESIGN OF CMOS BUILDING BLOCKS \n\nThis section describes CMOS transistor circuits which implement the transconductance \namplifiers and resistors in Figure 2. It is not necessary to implement the capacitors explicitly. \nSince the equilibrium point of the CNN is unique, the parasitic capacitances of the circuit \nare sufficient to ensure the circuit operates correctly. \n\n3.1 TRANSCONDUCTANCE AMPLIFIER \nThe transconductance amplifiers can be implemented using the circuit shown in \nFigure 3(a). For $V_{in} \\approx V_{GND}$, the output current is approximately $I_{out} = \\sqrt{\\beta_n I_{SS}}\\, V_{in}$ where \n$\\beta_n = \\mu_n C_{ox} (W/L)$ and $(W/L)$ is the width/length ratio of the differential pair. The \ntransistors in the current mirrors are assumed to be matched. 
Using cascoded current mirrors decreases static errors such as offsets caused by the finite output impedance of the \nMOS transistors in saturation. \n\nFigure 3: The CMOS Circuits Implementing OTAs and Resistors \n\n3.2 RESISTORS \nSince the convolution kernels implemented are modulated sine and cosine functions, the \nnodal voltages $v_e(n)$ and $v_o(n)$ can be both positive and negative with respect to the \nground potential. The resistors in the circuit must be floating and exhibit good linearity \nand invariance to common mode offsets for voltages around the ground potential. Many \nMOS resistor circuits require bias circuitry implemented at every resistor. Since for image \nprocessing tasks, we are interested in maximizing the number of pixels processed, eliminating \nthe need for bias circuitry at each cell will decrease its area and in turn increase the \nnumber of cells implementable within a given area. \n\nFigure 3(b) shows a resistor circuit which satisfies the requirements above. This circuit is \nessentially a CMOS transmission gate with adjustable gate voltages. The global bias circuit \nwhich generates the gate voltages in the CMOS resistor is shown on the left. The gate \nbias voltages $V_{G1}$ and $V_{G2}$ are distributed to each resistor designed with the same value. \nBoth transistors $M_n$ and $M_p$ operate in the conduction region where (Enz, 1995) \n\n$$I_{Dn} = n_n \\beta_n \\left( V_{Pn} - \\frac{V_D + V_S}{2} \\right) (V_D - V_S) \\quad \\text{and} \\quad I_{Dp} = -n_p \\beta_p \\left( V_{Pp} - \\frac{V_D + V_S}{2} \\right) (V_D - V_S)$$ \n\nand $V_{Pn}$ and $V_{Pp}$ are nonlinear functions of the gate and threshold voltages. The sizing of \nthe NMOS and PMOS transistors can be chosen to decrease the effect of the nonlinearity \ndue to the $(V_D + V_S)/2$ terms. The conductance of the resistors can be adjusted using \n$I_{bias}$. \n\n3.3 LIMITATIONS \nDue to the physical constraints of the circuit realizations, not all values of $\\lambda$ 
and $\\omega_{xo}$ can \nbe realized. Because the conductance values are non-negative and the OTA gains are non-positive, \nboth $G_1$ and $G_2$ must be non-negative. This implies that $\\omega_{xo}$ must lie between 0 \nand $\\pi/2$. Because the conductance $G_0$ is non-negative, $\\lambda^2 \\geq -2 + 2\\cos\\omega_{xo} + \\sin\\omega_{xo}$. \nFigure 4 shows the range of center frequencies $\\omega_{xo}$ (normalized by $\\pi$) and relative bandwidths \n($2\\lambda/\\omega_{xo}$) achievable by this realization. Not all bandwidths are achievable for \n$\\omega_{xo} < 2 \\arctan 0.5 \\approx 0.3\\pi$. \n\nFigure 4: The filter parameters implementable by the circuit realization. \n\n4 TEST RESULTS \n\nThe circuit architecture and CMOS building blocks described above were fabricated using \nthe Orbit 2 \u00b5m n-well process available through MOSIS. In this prototype, a 13 cell one \ndimensional array was fabricated on a 2.2 mm square die. The value of $\\omega_{xo}$ is fixed at \n$2 \\arctan 0.5 \\approx 0.927$ by transistor sizing. This is the smallest spatial frequency for which all \nbandwidths can be obtained. In addition, $G_0 = \\lambda^2$ for this value of $\\omega_{xo}$. The width of the \nimpulse response is adjustable by changing the externally supplied bias current shown in \nFigure 3(b) controlling $G_0$. \nThe transconductance amplifiers and resistors are designed to operate between \u00b1300 mV. \nThe currents representing the input image are provided by transconductance amplifiers \ninternal to the chip which are controlled by externally applied voltages. Outputs are read \noff the chip in analog form through two common read-out amplifiers: one for the real part \nof the impulse response and one for the imaginary part. The outputs of the cells are connected \nin turn to the inputs of the read-out amplifier through transmission gates controlled \nby a shift register. The chip requires \u00b14 V supplies and dissipates 35 mW. 
\nTo measure the impulse response of the filters, we applied 150 mV to the input corresponding \nto the middle cell of the array and 0 V to the remaining inputs. The output voltages \nfrom one chip as a function of cell number are shown as solid lines in Figure 5(a, b). To \ncorrect for DC offsets, we also measured the output voltages when all of the inputs were \ngrounded, as shown by the dashed lines in the figure. The DC offsets can be separated into \ntwo components: a constant offset common to all cells in the array and a small offset \nwhich varies from cell to cell. For the chip shown, the constant offset is approximately \n100 mV and the small variations have a standard deviation of 20 mV. These results are consistent \nwith the other chips. The constant offset is primarily due to the offset voltage in the \nread-out amplifier. The small variations from cell to cell are the result of both parameter \nvariations from cell to cell and offsets in the transconductance amplifiers of each cell. \n\nFigure 5: DC Measurements from the Prototype \n\nBy subtracting the DC zero-input offsets at each cell from the outputs, we can observe that \nthe impulse response closely matches that predicted by the theory. The dotted lines in \nFigure 5(c, d) show the offset corrected outputs for the same chip as shown in Figure 5(a, \nb). The solid lines show the theoretical output of the chip using parameters $\\lambda$ and $\\omega_{xo}$ \nchosen to minimize the mean squared error between the theory and the data. The chip was \ndesigned for $\\lambda = 0.210$ and $\\omega_{xo} = 0.927$. The parameters for the best fit are $\\lambda = 0.175$ \nand $\\omega_{xo} = 0.941$. The signal to noise ratio, as defined by the energy in the theoretical \noutput divided by the energy in the error between theory and data, is 19.3 dB. Similar \nmeasurements from two other chips gave signal to noise ratios of 29.0 dB ($\\lambda$ 
= 0.265, \n$\\omega_{xo} = 0.928$) and 30.6 dB ($\\lambda = 0.200$, $\\omega_{xo} = 0.938$). \nTo measure the speed of the chips, we grounded all of the inputs except that of the middle \ncell to which we attached a function generator generating a square wave switching \nbetween \u00b1200 mV. The rise times (10% to 90%) at the output of the chip for each cell \nwere measured and ranged between 340 and 528 nanoseconds. The settling times will not \nincrease if the number of cells increases since the outputs are computed in parallel. The \nsettling time is primarily determined by the width of the impulse response. The wider the \nimpulse response, the farther information must propagate through the array and the longer \nthe settling time. \n\n5 CONCLUSION \n\nWe have described the architecture, design and test results from an analog VLSI prototype \nof a neural network which filters images with convolution kernels similar to those of the \nGabor filter. Our future work on chip design includes fabricating chips with larger numbers \nof cells, two dimensional arrays and chips with integrated photosensors which \nacquire and process images simultaneously. We are also investigating the use of these neural \nnetwork chips in binocular vergence control of an active stereo vision system. \n\nAcknowledgements \nThis work was supported by the Hong Kong Research Grants Council (RGC) under grant \nnumber HKUST675/95E. \n\nReferences \nE. H. Adelson and J. R. Bergen, \"Spatiotemporal energy models for the perception of \nmotion,\" J. Optical Society of America A, vol. 2, pp. 284-299, Feb. 1985. \nJ. Barron, D. S. Fleet, S. S. Beauchemin, and T. A. Burkitt, \"Performance of optical flow \ntechniques,\" in Proc. of CVPR, (Champaign, IL), pp. 236-242, IEEE, 1992. \nJ. G. Daugman, \"Two-dimensional spectral analysis of cortical receptive field profiles,\" \nVision Research, vol. 20, pp. 847-856, 1980. \nC. C. Enz, F. Krummenacher, and E. A. 
Vittoz, \"An analytical MOS transistor model valid \nin all regions of operation and dedicated to low-voltage and low-current applications,\" \nAnalog Integrated Circuits and Signal Processing, vol. 8, no. 1, pp. 83-114, Jul. 1995. \nD. J. Fleet, Measurement of Image Velocity, Boston, MA: Kluwer Academic Publishers, \n1992. \nK. F. Hui and B. E. Shi, \"Robustness of CNN Implementations for Gabor-type Filtering,\" \nProc. of Asia Pacific Conference on Circuits and Systems, pp. 105-108, Nov. 1996. \nB. E. Shi, \"Gabor-type image filtering with cellular neural networks,\" Proceedings of the \n1996 IEEE International Symposium on Circuits and Systems, vol. 3, pp. 558-561, May \n1996. \nW. M. Theimer and H. A. Mallot, \"Phase-based binocular vergence control and depth \nreconstruction using active vision,\" CVGIP: Image Understanding, vol. 60, no. 3, pp. \n343-358, Nov. 1994. \nC.-J. Westelius, H. Knutsson, J. Wiklund and C.-F. Westin, \"Phase-based disparity estimation,\" \nin J. L. Crowley and H. I. Christensen, eds., Vision as Process, chap. 11, pp. 157-178, \nSpringer-Verlag, Berlin, 1995. \n", "award": [], "sourceid": 1364, "authors": [{"given_name": "Bertram", "family_name": "Shi", "institution": null}, {"given_name": "Kwok", "family_name": "Hui", "institution": null}]}