{"title": "A General Purpose Image Processing Chip: Orientation Detection", "book": "Advances in Neural Information Processing Systems", "page_first": 873, "page_last": 879, "abstract": "", "full_text": "Asymptotic Theory for Regularization: One-Dimensional Linear Case

Petri Koistinen
Rolf Nevanlinna Institute, P.O. Box 4, FIN-00014 University of Helsinki, Finland.
Email: Petri.Koistinen@rni.helsinki.fi

Abstract

The generalization ability of a neural network can sometimes be improved dramatically by regularization. To analyze the improvement one needs more refined results than the asymptotic distribution of the weight vector. Here we study the simple case of one-dimensional linear regression under quadratic regularization, i.e., ridge regression. We study the random design, misspecified case, where we derive expansions for the optimal regularization parameter and the ensuing improvement. It is possible to construct examples where it is best to use no regularization.

1 INTRODUCTION

Suppose that we have available training data (X_1, Y_1), ..., (X_n, Y_n) consisting of pairs of vectors, and we try to predict Y_i on the basis of X_i with a neural network with weight vector w. One popular way of selecting w is by the criterion

(1)    (1/n) Σ_{i=1}^n ℓ(X_i, Y_i, w) + λ Q(w) = min!,

where the loss ℓ(x, y, w) is, e.g., the squared error ||y − g(x, w)||², the function g(·, w) is the input/output function of the neural network, the penalty Q(w) is a real function which takes on small values when the mapping g(·, w) is smooth and high values when it changes rapidly, and the regularization parameter λ is a nonnegative scalar (which might depend on the training sample). We refer to the setup (1) as (training with) regularization, and to the same setup with the choice λ = 0 as training without regularization.
Regularization has been found to be very effective for improving the generalization ability of a neural network, especially when the sample size n is of the same order of magnitude as the dimensionality of the parameter vector w; see, e.g., the textbooks (Bishop, 1995; Ripley, 1996).

In this paper we deal with asymptotics in the case where the architecture of the network is fixed but the sample size grows. To fix ideas, let us assume that the training data is part of an i.i.d. (independent, identically distributed) sequence (X, Y), (X_1, Y_1), (X_2, Y_2), ... of pairs of random vectors, i.e., for each i the pair (X_i, Y_i) has the same distribution as the pair (X, Y) and the collection of pairs is independent (X and Y can be dependent). Then we can define the (prediction) risk of a network with weights w as the expected value

(2)    r(w) := E ℓ(X, Y, w).

Let us denote the minimizer of (1) by w_n(λ), and a minimizer of the risk r by w*. The quantity r(w_n(λ)) is the average prediction error for data independent of the training sample. This quantity r(w_n(λ)) is a random variable which describes the generalization performance of the network: it is bounded below by r(w*) and the more concentrated it is about r(w*), the better the performance. We will quantify this concentration by a single number, the expected value E r(w_n(λ)). We are interested in quantifying the gain (if any) in generalization for training with versus training without regularization, defined by

(3)    E r(w_n(0)) − E r(w_n(λ)).

When regularization helps, this is positive.

However, relatively little can be said about the quantity (3) without specifying in detail how the regularization parameter is determined. We show in the next section that provided λ converges to zero sufficiently quickly (at the rate o_p(n^{−1/2})), then E r(w_n(0)) and E r(w_n(λ)) are equal to leading order.
It turns out that the optimal regularization parameter resides in this asymptotic regime. For this reason, delicate analysis is required in order to get an asymptotic approximation for (3). In this article we derive the needed asymptotic expansions only for the simplest possible case: one-dimensional linear regression where the regularization parameter is chosen independently of the training sample.

2 REGULARIZATION IN LINEAR REGRESSION

We now specialize the setup (1) to the case of linear regression and a quadratic smoothness penalty, i.e., we take ℓ(x, y, w) = [y − x^T w]² and Q(w) = w^T R w, where now y is scalar, x and w are vectors, and R is a symmetric, positive definite matrix. It is well known (and easy to show) that then the minimizer of (1) is

(4)    w_n(λ) = [ (1/n) Σ_{i=1}^n X_i X_i^T + λ R ]^{−1} (1/n) Σ_{i=1}^n X_i Y_i.

This is called the generalized ridge regression estimator; see, e.g., (Titterington, 1985). Ridge regression corresponds to the choice R = I; see (Hoerl and Kennard, 1988) for a survey. Notice that (generalized) ridge regression is usually studied in the fixed design case, where the X_i's are nonrandom. Further, it is usually assumed that the model is correctly specified, i.e., that there exists a parameter such that Y_i = X_i^T w* + ε_i, and such that the distribution of the noise term ε_i does not depend on X_i. In contrast, we study the random design, misspecified case.

Assuming that E ||X||² < ∞ and that E [XX^T] is invertible, the minimizer of the risk (2) and the risk itself can be written as

(5)    w* = A^{−1} E [XY], with A := E [XX^T],
(6)    r(w) = r(w*) + (w − w*)^T A (w − w*).

If Z_n is a sequence of random variables, then the notation Z_n = o_p(n^{−a}) means that n^a Z_n converges to zero in probability as n → ∞. For this notation and the mathematical tools needed for the following proposition see, e.g., (Serfling, 1980, Ch.
1) or (Brockwell and Davis, 1987, Ch. 6).

Proposition 1  Suppose that E Y⁴ < ∞, E ||X||⁴ < ∞ and that A = E [XX^T] is invertible. If λ = o_p(n^{−1/2}), then both √n (w_n(0) − w*) and √n (w_n(λ) − w*) converge in distribution to N(0, C), a normal distribution with mean zero and covariance matrix C.

The previous proposition also generalizes to the nonlinear case (under more complicated conditions). Given this proposition, it follows (under certain additional conditions) by Taylor expansion that both E r(w_n(λ)) − r(w*) and E r(w_n(0)) − r(w*) admit the expansion β₁ n^{−1} + o(n^{−1}) with the same constant β₁. Hence, in the regime λ = o_p(n^{−1/2}) we need to consider higher order expansions in order to compare the performance of w_n(λ) and w_n(0).

3 ONE-DIMENSIONAL LINEAR REGRESSION

We now specialize the setting of the previous section to the case where x is scalar. Also, from now on, we only consider the case where the regularization parameter for given sample size n is deterministic; in particular, λ is not allowed to depend on the training sample. This is necessary, since coefficients in the following type of asymptotic expansions depend on the details of how the regularization parameter is determined. The deterministic case is the easiest one to analyze. We develop asymptotic expansions for the criterion

(7)    J_n(k) := E r(w_n(k)) − r(w*),

where now the regularization parameter k is deterministic and nonnegative. The expansions we get turn out to be valid uniformly for k ≥ 0. We then develop asymptotic formulas for the minimizer of J_n, and also for J_n(0) − inf J_n. The last quantity can be interpreted as the average improvement in generalization performance gained by an optimal level of regularization, when the regularization constant is allowed to depend on n but not on the training sample.
From now on we take Q(w) = w² and assume that A = E X² = 1 (which could be arranged by a linear change of variables). Referring back to formulas in the previous section, we see that

(8)    r(w_n(k)) − r(w*) = (V̄_n − k w*)² / (Ū_n + 1 + k)² =: h(Ū_n, V̄_n, k),

whence J_n(k) = E h(Ū_n, V̄_n, k), where we have introduced the function h (used heavily in what follows) as well as the arithmetic means Ū_n and V̄_n

(9)    Ū_n := (1/n) Σ_{i=1}^n U_i, with U_i := X_i² − 1,
(10)   V̄_n := (1/n) Σ_{i=1}^n V_i, with V_i := X_i Y_i − w* X_i².

For convenience, also define U := X² − 1 and V := XY − w* X². Notice that U, U_1, U_2, ... are zero mean i.i.d. random variables, and that V, V_1, V_2, ... satisfy the same conditions. Hence Ū_n and V̄_n converge to zero, and this leads to the idea of using the Taylor expansion of h(u, v, k) about the point (u, v) = (0, 0) in order to get an expansion for J_n(k).

To outline the ideas, let T_j(u, v, k) be the degree j Taylor polynomial of (u, v) ↦ h(u, v, k) about (0, 0), i.e., T_j(u, v, k) is a polynomial in u and v whose coefficients are functions of k and whose degree with respect to u and v is j. Then E T_j(Ū_n, V̄_n, k) depends on n and moments of U and V. By deriving an upper bound for the quantity E |h(Ū_n, V̄_n, k) − T_j(Ū_n, V̄_n, k)| we get an upper bound for the error committed in approximating J_n(k) by E T_j(Ū_n, V̄_n, k). It turns out that for odd degrees j the error is of the same order of magnitude in n as for degree j − 1. Therefore we only consider even degrees j. It also turns out that the error bounds are uniform in k ≥ 0 whenever j ≥ 2. To proceed, we need to introduce assumptions.

Assumption 1  E |X|^r < ∞ and E |Y|^s < ∞ for high enough r and s.

Assumption 2  Either (a) for some constant β > 0 almost surely |X| ≥ β, or (b) X has a density which is bounded in some neighborhood of zero.
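The identity (8) is purely algebraic and can be checked directly: with Q(w) = w², the scalar ridge estimate is w_n(k) = (mean of X_iY_i) / (mean of X_i² + k), and then w_n(k) − w* = (V̄_n − k w*)/(Ū_n + 1 + k) holds for any data set, any k ≥ 0 and any value of w*; under A = 1 the excess risk is the square of this quantity. A minimal pure-Python sketch (the data values and w* below are arbitrary illustrative numbers, not taken from the paper):

```python
# Check of identity (8): with Q(w) = w^2 the scalar ridge estimate
#   w_n(k) = (mean of X_i Y_i) / (mean of X_i^2 + k)
# satisfies  w_n(k) - w* = (Vbar_n - k w*) / (Ubar_n + 1 + k),
# where Ubar_n = mean(X_i^2) - 1 and Vbar_n = mean(X_i Y_i) - w* mean(X_i^2).
# The numbers below are arbitrary illustrative data, not from the paper.

def ridge_1d(xs, ys, k):
    """Scalar specialization of the generalized ridge estimator (4), R = 1."""
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys)) / n
    sxx = sum(x * x for x in xs) / n
    return sxy / (sxx + k)

xs = [0.3, -1.2, 0.8, 1.5, -0.4]
ys = [0.5, -1.0, 1.1, 1.7, -0.2]
w_star = 0.9      # hypothetical risk minimizer; a free parameter of the identity
k = 0.25

n = len(xs)
sxx = sum(x * x for x in xs) / n
ubar = sxx - 1.0
vbar = sum(x * y for x, y in zip(xs, ys)) / n - w_star * sxx

lhs = ridge_1d(xs, ys, k) - w_star
rhs = (vbar - k * w_star) / (ubar + 1.0 + k)
assert abs(lhs - rhs) < 1e-12
```

Since both sides are exactly equal after clearing denominators, the assertion holds up to floating-point rounding for any choice of data.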
\n\nAssumption 1 guarantees the existence of high enough moments; the values r = 20 \nand s = 8 are sufficient for the following proofs. E.g., if the pair (X, Y) has a \nnormal distribution or a distribution with compact support, then moments of all \norders exist and hence in this case assumption 1 would be satisfied. Without some \ncondition such as assumption 2, In(O) might fail to be meaningful or finite. The \nfollowing technical result is stated without proof. \n\nProposition 2 Let p > 0 and let 0 < IE X 2 < 00. If assumption 2 holds, then \n\nwhere the expectation on the left is finite (a) for n ~ 1 (b) for n > 2p provided that \nassumption 2 (a), respectively 2 (b) holds. \n\nProposition 3 Let assumptions 1 and 2 hold. Then there exist constants no and \nM such that \n\nIn(k) = JET2(Un, Vn, k) + R(n, k) where \n\n_ _ \n\n(w*)2k2 \n\n-1 [IEV2 \n\n(w*)2k2JEU2 W*kIEUV] \n\nIET2(Un, Vn, k) = (1+k)2 +n \n\n(1+k)2 +3 \n\n(1+k)4 +4 (1+k)3 \n\nIR(n, k) I :s; Mn- 3/2(k + 1)-1, \n\n\"In;::: no, k ;::: o. \n\nPROOF SKETCH The formula for IE T2(Un , Vn. k) follows easily by integrating the \ndegree two Taylor polynomial term by term. To get the upper bound for R(n, k), \nconsider the residual \n\nwhere we have omitted four similar terms. Using the bound \n\n\f298 \n\nP. Koistinen \n\nthe Ll triangle inequality, and the Cauchy-Schwartz inequality, we get \n\nIR(n, k)1 = IJE [h(Un, Vn, k) - T2(Un, Vn, k)]1 \n\n., (k+ W' {Ii: [(~ ~Xl)-'] r \n\n{2(k + 1)3[JE (lUnI2IVnI 4 )]l/2 + 4(w*)2k2(k + 1)[18 IUnI6]l/2 ... } \n\nBy proposition 2, here 18 [(~ 2:~ X[)-4] = 0(1). Next we use the following fact, cf. \n(Serfiing, 1980, Lemma B, p. 68). \nFact 1 Let {Zd be i.i.d. with 18 [Zd = 0 and with 18 IZI/v < 00 for some v ~ 2. \nThen \n\nv \n\nApplying the Cauchy-Schwartz inequality and this fact, we get, e.g., that \n[18 (IUnI2 IVnI 4 )]l/2 ~ [(18 IUnI4 )1/2(E IVnI8)1/2p/2 = 0(n- 3/ 2). \n\nGoing through all the terms carefully, we see that the bound holds. 
□

Proposition 4  Let assumptions 1 and 2 hold, assume that w* ≠ 0, and set

a₁ := (E V² − 2 w* E [UV]) / (w*)².

If a₁ > 0, then there exists a constant n₁ such that for all n ≥ n₁ the function k ↦ E T₂(Ū_n, V̄_n, k) has a unique minimum on [0, ∞) at a point k_n* admitting the expansion

k_n* = a₁ n^{−1} + O(n^{−2});

further,

J_n(0) − inf{J_n(k) : k ≥ 0} = J_n(0) − J_n(a₁ n^{−1}) = a₁² (w*)² n^{−2} + O(n^{−5/2}).

If a₁ ≤ 0, then regularization with a positive regularization parameter brings no improvement of this order.

PROOF SKETCH  The proof is based on perturbation expansion, considering 1/n a small parameter. By the previous proposition, S_n(k) := E T₂(Ū_n, V̄_n, k) is the sum of (w*)² k²/(1+k)² and a term whose supremum over k ≥ k₀ > −1 goes to zero as n → ∞. Here the first term has a unique minimum on (−1, ∞) at k = 0. Differentiating S_n we get

S_n′(k) = [2 (w*)² k (k+1)² + n^{−1} p₂(k)] / (k+1)⁵,

where p₂(k) is a second degree polynomial in k. The numerator polynomial has three roots, one of which converges to zero as n → ∞. A regular perturbation expansion for this root, k_n* = a₁ n^{−1} + a₂ n^{−2} + ..., yields the stated formula for a₁. This point is a minimum for all sufficiently large n; further, it is greater than zero for all sufficiently large n if and only if a₁ > 0.

The estimate for J_n(0) − inf{J_n(k) : k ≥ 0} in the case a₁ > 0 follows by noticing that

J_n(0) − J_n(k) = E [h(Ū_n, V̄_n, 0) − h(Ū_n, V̄_n, k)],

where we now use a third degree Taylor expansion about (u, v, k) = (0, 0, 0):

h(u, v, 0) − h(u, v, k) = 2w*kv − (w*)²k² − 4w*kuv + 2(w*)²k²u + 2kv² − 4w*k²v + 2(w*)²k³ + r(u, v, k).

Figure 1: Illustration of the asymptotic approximations in the situation of equation (11).
Horizontal axis k; vertical axis J_n(k) and its asymptotic approximations. Legend: markers J_n(k); solid line E T₂(Ū_n, V̄_n, k); dashed line E T₄(Ū_n, V̄_n, k).

Using the techniques of the previous proposition, it can be shown that E |r(Ū_n, V̄_n, k_n*)| = O(n^{−5/2}). Integrating the Taylor polynomial and using this estimate gives

J_n(0) − J_n(a₁/n) = a₁² (w*)² n^{−2} + O(n^{−5/2}).

Finally, by the mean value theorem,

J_n(0) − inf{J_n(k) : k ≥ 0} = J_n(0) − J_n(a₁/n) + (d/dk)[J_n(0) − J_n(k)]|_{k=θ} (k_n* − a₁/n)
  = J_n(0) − J_n(a₁/n) + O(n^{−1}) O(n^{−2}),

where θ lies between k_n* and a₁/n, and where we have used the fact that the indicated derivative evaluated at θ is of order O(n^{−1}), as can be shown with moderate effort. □

Remark  In the preceding we assumed that A = E X² equals 1. If this is not the case, then the formula for a₁ has to be divided by A; again, if a₁ > 0, then k_n* = a₁ n^{−1} + O(n^{−2}).

If the model is correctly specified in the sense that Y = w* X + ε, where ε is independent of X and E ε = 0, then V = Xε and E [UV] = 0. Hence we have a₁ = E [ε²]/(w*)², and this is strictly positive except in the degenerate case where ε = 0 with probability one. This means that here regularization helps provided the regularization parameter is chosen around the value a₁/n and n is large enough. See Figure 1 for an illustration in the case

(11)    X ~ N(0, 1),  Y = w* X + ε,  ε ~ N(0, 1),  w* = 1,

where ε and X are independent. J_n(k) is estimated on the basis of 1000 repetitions of the task for n = 8. In addition to E T₂(Ū_n, V̄_n, k), the function E T₄(Ū_n, V̄_n, k) is also plotted. The latter can be shown to give J_n(k) correctly up to order O(n^{−5/2}(k+1)^{−3}). Notice that although E T₂(Ū_n, V̄_n, k) does not give that good an approximation for J_n(k), its minimizer is near the minimizer of J_n(k), and both of these minimizers lie near the point a₁/n = 0.125 as predicted by the theory.
In the situation (11) it can actually be shown by lengthy calculations that the minimizer of J_n(k) is exactly a₁/n for each sample size n ≥ 1.

It is possible to construct cases where a₁ < 0. For instance, take

X ~ Uniform(a, b),  a = 1/2,  b = (3√5 − 1)/4,
Y = c/X + d + Z,  c = −5,  d = 8,

and Z ~ N(0, σ²) with Z and X independent and 0 ≤ σ < 1.1. In such a case regularization using a positive regularization parameter only makes matters worse; using a properly chosen negative regularization parameter would, however, help in this particular case. This would, however, amount to rewarding rapidly changing functions. In the case (11), regularization using a negative value for the regularization parameter would be catastrophic.

4 DISCUSSION

We have obtained asymptotic approximations for the optimal regularization parameter in (1) and the amount of improvement (3) in the simple case of one-dimensional linear regression when the regularization parameter is chosen independently of the training sample. It turned out that the optimal regularization parameter is, to leading order, given by a₁ n^{−1} and the resulting improvement is of order O(n^{−2}). We have also seen that if a₁ < 0 then regularization only makes matters worse.

Also (Larsen and Hansen, 1994) have obtained asymptotic results for the optimal regularization parameter in (1). They consider the case of a nonlinear network; however, they assume that the neural network model is correctly specified. The generalization of the present results to the nonlinear, misspecified case might be possible using, e.g., techniques from (Bhattacharya and Ghosh, 1978). Generalization to the case where the regularization parameter is chosen on the basis of the sample (say, by cross validation) would be desirable.
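The leading-order prediction k* ≈ a₁/n can be checked numerically. The sketch below redoes, under stated assumptions, a small version of the experiment around equation (11) (w* = 1, X and ε standard normal, n = 8, so a₁/n = 0.125): it estimates J_n(k) on a grid by Monte Carlo, reusing the same simulated training sets for every k so the comparison across k is paired, and checks that the empirical minimizer lands near 0.125. The repetition count, seed, and grid are choices made here for speed, not taken from the paper.

```python
# Monte Carlo check that the optimal ridge parameter for model (11) sits
# near a1/n = 1/8 = 0.125 when n = 8.  Rep count, grid, and seed are
# illustrative choices, not values from the paper.
import random

random.seed(0)
n, reps = 8, 20000
ks = [i * 0.025 for i in range(21)]          # grid over [0, 0.5]
risk_excess = [0.0] * len(ks)                # accumulates r(w_n(k)) - r(w*)

for _ in range(reps):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    ys = [x + random.gauss(0.0, 1.0) for x in xs]   # Y = w* X + eps, w* = 1
    sxx = sum(x * x for x in xs) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) / n
    for j, k in enumerate(ks):
        w = sxy / (sxx + k)                  # ridge estimate w_n(k)
        risk_excess[j] += (w - 1.0) ** 2     # A = 1, so excess risk = (w - w*)^2

j_best = min(range(len(ks)), key=lambda j: risk_excess[j])
k_best = ks[j_best]
# Theory predicts the minimizer near a1/n = 0.125; allow Monte Carlo slack.
assert 0.05 <= k_best <= 0.25, k_best
```

Because the same training sets are used for every k, the estimated curve J_n(k) is smooth in k and its argmin is far more stable than the raw Monte Carlo error would suggest.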
\n\nAcknowledgements \n\nThis paper was prepared while the author was visiting the Department for Statis(cid:173)\ntics and Probability Theory at the Vienna University of Technology with financial \nsupport from the Academy of Finland. I thank F. Leisch for useful discussions. \n\nReferences \nBhattacharya, R. N. and Ghosh, J. K. (1978). On the validity of the formal Edge(cid:173)\n\nworth expansion. The Annals of Statistics, 6(2):434-45l. \n\nBishop, C. M. (1995). Neural Networks for Pattern Recognition. Oxford University \n\nPress. \n\nBrockwell, P. J. and Davis, R. A. (1987). Time Series: Theory and Methods. \n\nSpringer series in statistics. Springer-Verlag. \n\nHoerl, A. E. and Kennard, R. W. (1988). Ridge regression. In Kotz, S., Johnson, \nN. L., and Read, C. B., editors, Encyclopedia of Statistical Sciences. John Wiley \n& Sons, Inc. \n\nLarsen, J. and Hansen, L. K. (1994). Generalization performance of regularized \nneural network models. In Vlontos, J., Whang, J.-N., and Wilson, E., editors, \nProc. of the 4th IEEE Workshop on Neural Networks for Signal Processing, \npages 42-51. IEEE Press. \n\nRipley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge Uni(cid:173)\n\nversity Press. \n\nSerfiing, R. J. (1980). Approximation Theorems of Mathematical Statistics. John \n\nWiley & Sons, Inc. \n\nTitterington, D. M. (1985). Common structure of smoothing techniques in statistics. \n\nInternational Statistical Review, 53:141-170. \n\n\fA General Purpose Image Processing Chip: \n\nOrientation Detection \n\nRalph Etienne-Cummings and Donghui Cai \n\nDepartment of Electrical Engineering \n\nSouthern Illinois University \nCarbondale, IL 6290 1-6603 \n\nAbstract \n\nA 80 x 78 pixel general purpose vision chip for spatial focal plane \nprocessing is presented. The size and configuration of the processing \nreceptive field are programmable. 
The chip's architecture allows the photoreceptor cells to be small and densely packed by performing all computation on the read-out, away from the array. In addition to the raw intensity image, the chip outputs four processed images in parallel. Also presented is an application of the chip to line segment orientation detection, as found in the retinal receptive fields of toads.

1 INTRODUCTION

The front-end of the biological vision system is the retina, which is a layered structure responsible for image acquisition and pre-processing. The early processing is used to extract spatiotemporal information which helps perception and survival. This is accomplished with cells having feature detecting receptive fields, such as the edge detecting center-surround spatial receptive fields of the primate and cat bipolar cells [Spillmann, 1990]. In toads, the receptive fields of the retinal cells are even more specialized for survival by detecting "prey" and "predator" (from size and orientation filters) at this very early stage [Spillmann, 1990].

The receptive fields of the retinal cells perform a convolution with the incident image in parallel and continuous time. This has inspired many engineers to develop retinomorphic vision systems which also imitate these parallel processing capabilities [Mead, 1989; Camp, 1994]. While this approach is ideal for fast early processing, it is not space efficient. That is, in realizing the receptive field within each pixel, considerable die area is required to implement the convolution kernel. In addition, should programmability be required, the complexity of each pixel increases drastically. The space constraints are eliminated if the processing is performed serially during read-out.
The benefits of this approach are 1) each pixel can be as small as possible to allow high resolution imaging, 2) a single processor unit is used for the entire retina, thus reducing mis-match problems, 3) programmability can be obtained with no impact on the density of the imaging array, and 4) compact general purpose focal plane visual processing is realizable. The space constraints are then transformed into temporal restrictions, since the scanning clock speed and response time of the processing circuits must scale with the size of the array. Dividing the array into sub-arrays which are scanned in parallel can help this problem. Clearly this approach departs from the architecture of its biological counterpart; however, this method capitalizes on the main advantage of silicon, which is its speed. This is an example of mixed signal neuromorphic engineering, where biological ideas are mapped onto silicon not using direct imitation (which has been the preferred approach in the past) but rather by realizing their essence with the best silicon architecture and computational circuits.

This paper presents a general purpose vision chip for spatial focal plane processing. Its architecture allows the photoreceptor cells to be small and densely packed by performing all computation on the read-out, away from the array. Performing computation during read-out is ideal for silicon implementation since no additional temporal over-head is required, provided that the processing circuits are fast enough. The chip uses a single convolution kernel, per parallel sub-array, and the scanning bit pattern to realize various receptive fields. This is different from other focal plane image processors, which are usually restricted to hardwired convolution kernels, such as oriented 2D Gabor filters [Camp, 1994]. In addition to the raw intensity image, the chip outputs four processed versions per sub-array.
Also presented is an application of the chip to line segment orientation detection, as found in the retinal receptive fields of toads [Spillmann, 1990].

2 THE GENERAL PURPOSE IMAGE PROCESSING CHIP

2.1 System Overview

This chip has an 80 row by 78 column photocell array partitioned into four independent sub-arrays, which are scanned and output in parallel (see Figure 1). Each block is 40 rows by 39 columns, and has its own convolution kernel and output circuit. The scanning circuit includes three parts: virtual ground, control signal generator (CSG), and scanning output transformer. Each block has its own virtual ground and scanning output transformer in both the x direction (horizontal) and the y direction (vertical). The control signal generator is shared among blocks.

2.2 Hardware Implementation

The photocell is composed of a phototransistor, a photocurrent amplifier, and output control. The phototransistor performs light transduction, while the amplifier magnifies the photocurrent by three orders of magnitude. The output control provides multiple copies of the amplified photocurrent, which are subsequently used for focal plane image processing.

The phototransistor is a parasitic PNP transistor in an Nwell CMOS process. The current amplifier uses a pair of diode connected pmosfets to obtain a logarithmic relationship between light intensity and output current. This circuit also amplifies the photocurrent from nanoamperes to microamperes. The photocell sends three copies of the output currents into three independent buses. The connections from the photocell to the buses are controlled by pass transistors, as shown in Fig. 2. The three current outputs allow the image to be processed using multiple receptive field organizations (convolution kernels), while the raw image is also output.
The row (column) buses provide currents for extracting horizontally (vertically) oriented image features, while the original bus provides the logarithmically compressed intensity image.

The scanning circuit addresses the photocell array by selecting groups of cells at one time. Since the outputs of the cells are currents, virtual ground circuits are used on each bus to mask the >1 pF capacitance of the buses.

Figure 1: Block diagram of the chip.

The CSG, implemented with shift registers, produces signals which select photocells and control the scanning output transformer. The scanning output transformer converts currents from all row buses into I_perx and I_cenx, and currents from all column buses into I_pery and I_ceny. This transformation is required to implement the various convolution kernels discussed later.

The output transformer circuits are controlled by a central CSG and a peripheral CSG. These two generators have identical structures but different initial values. Each consists of an n-bit shift register in the x direction (horizontal) and an m-bit shift register in the y direction (vertical).
A feedback circuit is used to restore the scanning pattern into the x shift register after each row scan is completed. This is repeated until all the rows in each block are scanned.

The control signals from the peripheral and central CSGs select all the cells covered by a 2D convolution mask (receptive field). The selected cells send I_xy to the original bus, I_xp to the row bus, and I_yp to the column bus. The function of the scanning output transformer is to identify which rows (columns) are considered as the center (I_cenx or I_ceny) or periphery (I_perx or I_pery) of the convolution kernel, respectively. Figure 3 shows how a 3x3 convolution kernel can be constructed.

Figure 4 shows how the output transformer works for a 3x3 mask. Only the row bus transformation is shown in this example, but the same mechanism applies to the column bus as well. The photocell array is m rows by n columns, and the mask size is 3x3. The XC (x center) and YC (y center) signals come from the central CSG, while XP (x peripheral) and YP (y peripheral) come from the peripheral CSG. After loading the CSG, the initial values of XP and YP are both 00011...1. The initial values of XC and YC are both 10111...1. This identifies the central cell as location (2, 2). The currents from the central row (column) are summed to form I_cenx and I_ceny, while all the peripheral cells are summed to form I_perx and I_pery. This is achieved by activating the switches labeled XC, YC, XP and YP in Figure 2. XP_i (YP_i), i = 1, 2, ..., n, controls whether the output current of one cell is passed to the corresponding bus.

The kernel (receptive field) size is programmable among 3x3, 5x5, 7x7, 9x9 and 11x11. Fig. 5 shows the 3x3 masks for this processing. Repeating the above steps for the 5x5, 7x7, 9x9, and 11x11 masks, we get similar results.
Figure 4: Scanning output transformer for an m row by n column photocell array.

(a) smooth:
 1  1  1
 1  1  1
 1  1  1

(b) edge_x:
-1 -1 -1
 2  2  2
-1 -1 -1

(c) edge_y:
-1  2 -1
-1  2 -1
-1  2 -1

(d) edge_2D:
 0 -1  0
-1  4 -1
 0 -1  0

Figure 5: 3x3 convolution masks for various image processing.

In general, the convolution results under different mask sizes can be expressed as follows:

I_smooth = I_cenx + I_perx,
I_edgex = K_1d * I_cenx - I_perx,
I_edgey = K_1d * I_ceny - I_pery,
I_edge2D = K_2d * I_ori - I_cenx - I_ceny,

where K_1d and K_2d are the programmable coefficients (from 2-6 and 6-14, respectively) for 1D edge extraction and 2D edge extraction, respectively. By varying the locations of the 0's in the scanning circuits, different types of receptive fields (convolution kernels) can be realized.

2.3 Results

The chip contains 65K transistors in a footprint of 4.6 mm x 4.7 mm. There are 80 x 78 photocells in the chip, each of which is 45.6 μm x 45 μm with a fill factor of 15%. The convolution kernel occupies 690.6 μm x 102.6 μm. The power consumption of the chip for a 3x3 (11x11) receptive field, indoor light, and 5V power supply is < 2 mW (8 mW).

To capitalize on the programmability of this chip, an A/D card in a Pentium 133MHz PC is used to load the scanning circuit and to collect data. The card, which has a maximum analog throughput of 100 kHz, limits the frame rate of the chip to 12 frames per second. At this rate, five processed versions of the image are collected and displayed.
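The read-out arithmetic can be checked against the Fig. 5 masks in software: the chip never forms a mask explicitly, it only sums the center row/column, the peripheral cells, and the center pixel. A pure-Python sketch for one 3x3 receptive field (the patch values are arbitrary illustrative numbers, and K_1d = 2, K_2d = 6 are the low ends of the programmable ranges quoted in the text):

```python
# Read-out convolution for one 3x3 receptive field: only the center row sum
# (I_cenx), center column sum (I_ceny), peripheral sums (I_perx, I_pery) and
# the center pixel (I_ori) are formed, then combined per the text's formulas.
# Patch values are illustrative, not chip data; K_1d = 2, K_2d = 6 assumed.

patch = [[3.0, 1.0, 4.0],
         [1.0, 5.0, 9.0],
         [2.0, 6.0, 5.0]]

total = sum(sum(row) for row in patch)
i_cenx = sum(patch[1])                          # center row sum
i_ceny = sum(patch[r][1] for r in range(3))     # center column sum
i_perx = total - i_cenx                         # peripheral rows
i_pery = total - i_ceny                         # peripheral columns
i_ori = patch[1][1]                             # center pixel

K1D, K2D = 2, 6
i_smooth = i_cenx + i_perx
i_edgex = K1D * i_cenx - i_perx
i_edgey = K1D * i_ceny - i_pery
i_edge2d = K2D * i_ori - i_cenx - i_ceny

def conv3(mask):
    """Direct 3x3 correlation of a mask with the patch at its center."""
    return sum(mask[r][c] * patch[r][c] for r in range(3) for c in range(3))

assert i_smooth == conv3([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
assert i_edgex == conv3([[-1, -1, -1], [2, 2, 2], [-1, -1, -1]])
assert i_edgey == conv3([[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]])
assert i_edge2d == conv3([[0, -1, 0], [-1, 4, -1], [0, -1, 0]])
```

The same identities hold for the larger kernel sizes with the appropriate K_1d and K_2d values, which is why a single pair of coefficients plus the scanning bit pattern suffices to realize the whole mask family.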
The scanning and processing circuits can operate at 10 MHz (6250 fps); however, the phototransistors have much slower dynamics. Temporal smoothing (smear) can be observed on the scope when the frame rate exceeds 100 fps.

The chip displays a logarithmic relationship between light intensity and output current (unprocessed image) from 0.1 lux (100 nA) to 6000 lux (10 μA). The fixed pattern noise, defined as standard-deviation/mean, decreases abruptly from 25% in the dark to 2% at room light (800 lux). This behavior is expected since the variation of the individual pixel currents is large compared to the mean output when the mean is small. The logarithmic response of the photocell results in high sensitivity at low light, thus increasing the mean value sharply. Little variation is observed between chips.

The contrast sensitivity of the edge detection masks is also measured for the 3x3 and 5x5 receptive fields. Here contrast is defined as (I_max - I_min)/(I_max + I_min) and sensitivity is given as a percentage of the maximum output. The measurements are performed for normal room and bright lighting conditions. Since the two conditions correspond to the saturated part of the logarithmic transfer function of the photocells, a linear relationship between output response and contrast is expected. Figure 6 shows the contrast sensitivity plot. Figure 7 shows examples of the chip's outputs. The top two images are the raw and smoothed (5x5) images. The bottom two are the 1D edge_x (left) and 2D edge (right) images. The pixels with positive values have been thresholded to white. The vertical black line in the image is not visible in the edge_x image, but can be clearly seen in the edge_2D image.
\u2022 \n\u00b7\u00b7 3,3 Bri!,hl \n_ _ 3,3 Nonnal \n\nContrast [%] \n\nFigure 6: Contrast sensitivity function of \nthe x edge detection mask. \n3 APPLICATION: ORIENTATION DETECTION \n3.1 Algorithm Overview \n\nFigure 7: (Clockwise) Raw image. 5x5 \nsmoothed image. edge_2D and edge_x. \n\nThis vision chip can be elegantly used to measure the orientation of line segments which \nfall across the receptive field of each pixel. The output of the 10 Laplacian operators, \nedge_x and edge_y, shown in figure 5, can be used to detennine the orientation of edge \nsegments. Consider a continuous line through the origin, represented by a delta function \nin 20 space by IX y-xtan()). If the origin is the center of the receptive field. the response \nofthe edge_x kernel can be computed by evaluating the convolution equation (1). where \nW(x) = u(x+m)-u(x-m) is the x window over which smoothing is performed, 2m+ J is the \nwidth of the window and 2n+ J is the number of coefficients realizing the discrete \nLaplacian operator. In our case, n = m. Evaluating this equation and substituting the \norigin for the pixel location yields equation (2), which indicates that the output of the 10 \nedge_x (edge-y) detectors have a discretized linear relationship to orientation from on to \n45\" (45\u00b0 to 90\u00b0). At 0\", the second term in equation (2) is zero. As e increase, more \nterms are subtracted until all tenns are subtracted at 45\u00b0. Above 45 0 (below 45\u00b0), the \nedge_x (edge-y) detectors output zero since equal numbers of positive and negative \ncoefficients are summed. Provided that contrast can be nonnalized. the output of the \ndetectors can be used to extract the orientation of the line. Clearly these responses are \neven about the x- and y-axis. respectively. Hence, a second pair of edge detectors. oriented \nat 45\", is required to uniquely extract the angle of the line segment. \n\n10 \n\n8 \n\n~ 6 \n..=. \n:; \ns-\n:::I \n0 \n\n4 \n\n0 \n\n, \n\n0..,. 
Figure 8: Measured orientation transfer function of edge_x detectors (output current [µA] versus angle [°] at several illumination levels, including 370 lux and 260 lux).

O_edgex(x, y) = [2n W(x±m) δ(y) - Σ_{i=1}^{n} W(x±m) δ(y±i)] * δ(y - x tanθ)   (1)

O_edgex(0, 0) = 2n - Σ_{i=1}^{n} [W(i/tanθ) + W(-i/tanθ)]   (2)

3.2 Results

Figure 8 shows the measured output of the edge_x detectors for various lighting conditions as a line is rotated. The average positive outputs are plotted. As expected, the output is maximum for bright ambients when the line is horizontal. As the line is rotated, the output current decreases linearly and levels off at approximately 45°. On the other hand, the edge_y output (not shown) begins its linear increase at 45° and maximizes at 90°. After normalizing for brightness, the four curves are very similar (not shown).

To further demonstrate orientation detection with this chip, a character consisting of a circle and some straight lines is presented. The intensity image of the character is shown in figure 9(a). Figures 9(b) and 9(c) show the outputs of the edge_x and edge_y detectors, respectively. Since a 7x7 receptive field is used in this experiment, some outer pixels of each block are lost. The orientation selectivity of the 1D edge detectors is clearly visible in the figures, where edge_x highlights horizontal edges and edge_y vertical edges. Figure 9(d) shows the reported angles.
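The piecewise-linear falloff predicted by equation (2) can be checked with a short numerical sketch. This assumes an ideal unit-contrast delta line, the boxcar window W(x) = 1 for |x| ≤ m and 0 otherwise, and n = m = 3; the values are illustrative model outputs, not chip measurements:

```python
import math

def W(x, m):
    """Boxcar smoothing window: 1 inside [-m, m], 0 outside."""
    return 1.0 if abs(x) <= m else 0.0

def edge_x_response(theta_deg, n):
    """Predicted edge_x output at the receptive-field center, equation (2),
    with m = n as in the paper."""
    if theta_deg == 0:
        return 2.0 * n  # second term vanishes: the line lies on the positive lobe
    t = math.tan(math.radians(theta_deg))
    # Each negative coefficient row i intersects the line at x = ±i/tan(theta);
    # it is subtracted only if that point falls inside the smoothing window.
    return 2.0 * n - sum(W(i / t, n) + W(-i / t, n) for i in range(1, n + 1))
```

For n = 3 the response steps down from 2n = 6 at 0° to 0 at 45° and stays at zero beyond 45°, where equal numbers of positive and negative coefficients are summed, in agreement with the discussion of equation (2).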
A program is written which takes the two 1D edge images, finds the location of the edges from the edge_2D image and the intensity at the edges (positive lobe), and then computes the angle of the edge segment. In figure 9(d), a black background is chosen for locations where no edges are detected, white is used for 0° and gray for 90°.

Figure 9: Orientation detection using 1D Laplacian operators ((a) intensity image, (b) edge_x output, (c) edge_y output, (d) reported angles).

4 CONCLUSION

An 80x78 pixel general purpose vision chip for spatial focal plane processing has been presented. The size and configuration of the processing receptive field are programmable. In addition to the raw intensity image, the chip outputs four processed images in parallel. The chip has been successfully used for compact line segment orientation detection, which can be used in character recognition. The programmability and relatively low power consumption make it ideal for many visual processing tasks.