{"title": "Real-Time Computer Vision and Robotics Using Analog VLSI Circuits", "book": "Advances in Neural Information Processing Systems", "page_first": 750, "page_last": 757, "abstract": null, "full_text": "750 \n\nKoch, Bair, Harris, Horiuchi, Hsu and Luo \n\nReal-Time Computer Vision and Robotics \n\nUsing Analog VLSI Circuits \n\nChristof Koch Wyeth Bair \n\nJohn G. Harris Timothy Horiuchi \n\nAndrew Hsu \n\nJin Luo \n\nComputation and Neural Systems Program \n\nCaltech 216-76 \n\nPasadena, CA 91125 \n\nABSTRACT \n\nThe long-term goal of our laboratory is the development of analog \nresistive network-based VLSI implementations of early and intermediate \nvision algorithms. We demonstrate an experimental circuit \nfor smoothing and segmenting noisy and sparse depth data \nusing the resistive fuse and a 1-D edge-detection circuit for computing \nzero-crossings using two resistive grids with different space-constants. \nTo demonstrate the robustness of our algorithms and \nof the fabricated analog CMOS VLSI chips, we are mounting these \ncircuits onto small mobile vehicles operating in a real-time, laboratory \nenvironment. \n\n1 INTRODUCTION \nA large number of computer vision algorithms for finding intensity edges, computing \nmotion, depth, and color, and recovering the 3-D shapes of objects have been \ndeveloped within the framework of minimizing an associated \"energy\" functional. \nSuch a variational formalism is attractive because it allows a priori constraints \nto be explicitly stated. The single most important constraint is that the physical \nprocesses underlying image formation, such as depth, orientation and surface reflectance, \nchange slowly in space. For instance, the depths of neighboring points on \na surface are usually very similar. 
Standard regularization algorithms embody this \nsmoothness constraint and lead to quadratic variational functionals with a unique \nglobal minimum (Poggio, Torre, and Koch, 1985). These quadratic functionals \ncan be mapped onto linear resistive networks, such that the stationary voltage distribution, \ncorresponding to the state of least power dissipation, is equivalent to \nthe solution of the variational functional (Horn, 1974; Poggio and Koch, 1985). \nSmoothness breaks down, however, at discontinuities caused by occlusions or differences \nin the physical processes underlying image formation (e.g., different surface \nreflectance properties). Detecting these discontinuities becomes crucial, not only \nbecause otherwise smoothness is incorrectly applied but also because the locations \nof discontinuities are often required for further image analysis and understanding. \nWe describe two different approaches for finding discontinuities in early vision: (1) \na 1-D edge-detection circuit for computing zero-crossings using two resistive grids \nwith different space-constants, and (2) a 20 by 20 pixel circuit for smoothing and \nsegmenting noisy and sparse depth data using the resistive fuse. \n\n[Figure 1: (a) circuit schematic; (b) node voltage (3.0 V to 3.1 V) and binary edge output (0 or 1) versus photoreceptor position (1 to 7).] \n\nFigure 1: (a) shows the schematic of the zero-crossing chip. The phototransistors \nlogarithmically map light intensity to voltages that are applied via a conductance G \nonto the nodes of two linear resistive networks. The network resistances R1 and R2 \ncan be arbitrarily adjusted to achieve different space-constants. Transconductance \namplifiers compute the difference of the smoothed network node voltages and report \na current proportional to that difference. The sign of the current then drives exclusive-or \ncircuitry (not shown) between each pair of neighboring pixels. The final output is \na binary signal indicating the positions of the zero-crossings. The linear network \nresistances have been implemented using Mead's saturating resistor circuit (Mead, \n1989), and the vertical resistors are implemented with transconductance followers. \n(b) shows the measured response of a seven-pixel version of the chip to a bright \nbackground with a shadow cast across the middle three photoreceptors. The circles \nand triangles show the node voltages on the resistive networks with the smaller and \nlarger space-constants, respectively. Edges are indicated by the binary output (bar \nchart at bottom) corresponding to the locations of zero-crossings. \n\nFinally, while demonstrating a highly integrated circuit on a stationary \nlaboratory bench under controlled conditions is already a considerable achievement, this \nis not the environment in which we ultimately intend such circuits to be used. The jump \nfrom a sterile, well-controlled, and predictable environment such as that of the \nlaboratory bench to the noisy and physically demanding environment of a mobile \nrobot often reveals the true limits of a circuit's robustness. \nIn order to \ndemonstrate the robustness and real-time performance of these circuits, we have \nmounted two such chips onto small toy vehicles. \n\n2 AN EDGE DETECTION CIRCUIT \nThe zero-crossings of the Laplacian of the Gaussian, $\\nabla^2 G$, are often used for detecting \nedges. 
Marr and Hildreth (1980) discovered that the Mexican-hat shape \nof the $\\nabla^2 G$ operator can be approximated by the difference of two Gaussians \n(DOG). In this spirit, we have built a chip that takes the difference of two resistive-network \nsmoothings of photoreceptor input and finds the resulting zero-crossings. \nThe Green's function of the resistive network, a decaying exponential, differs from \nthe Gaussian, but simulations with digitized camera images have shown that the \ndifference of exponentials (DOE) gives results nearly as good as the DOG. Furthermore, \nresistive nets have a natural implementation in silicon, while implementing \nthe Gaussian is cumbersome. \n\nThe circuit, Figure 1a, uses two independent resistive networks to smooth the voltages \nsupplied by logarithmic photoreceptors. The voltages on the two networks are \nsubtracted and exclusive-or circuitry (not shown) is used to detect zero-crossings. In \norder to facilitate thresholding of edges, an additional current is computed at each \nnode indicating the strength of the zero-crossing. This is particularly important \nfor robust real-world performance where there will be many small (in magnitude \nof slope) zero-crossings due to noise. Figure 1b shows the measured response of a \nseven-pixel version of the chip to a bright background with a shadow cast across the \nmiddle three photoreceptors. Subtracting the two network voltage traces shown at \nthe top, we find two zero-crossings, which the chip correctly identifies in the binary \noutput shown at the bottom. \n\n[Figure 2: (a) circuit schematic, including an HRES element; (b) current I (nA) versus voltage $\\Delta V$ (Volts), with $V_T$ marked.] \n\nFigure 2: (a) Schematic diagram for the 20 by 20 pixel surface interpolation and \nsmoothing chip. A rectangular mesh of resistive fuse elements (shown as rectangles) \nprovides the smoothing and segmentation ability of the network. The data are given \nas battery values $d_{ij}$ with the conductance G connecting the battery to the grid set \nto $G = 1/(2\\sigma^2)$, where $\\sigma^2$ is the variance of the additive Gaussian noise assumed to \ncorrupt the data. (b) Measured current-voltage relationship for different settings \nof the resistive fuse. For a voltage of less than $V_T$ across this two-terminal device, \nthe circuit acts as a resistor with conductance $\\lambda$. Above $V_T$, the current is either \nabruptly set to zero (binary fuse) or smoothly goes to zero (analog fuse). We can \ncontinuously vary the I-V curve from the hyperbolic tangent of Mead's saturating \nresistor (HRES) to that of an analog fuse (Fig. 2b), effectively implementing a \ncontinuation method for minimizing the non-convex functional. The I-V curve of a \nbinary fuse is also illustrated. \n\n3 A CIRCUIT FOR SMOOTHING AND SEGMENTING \nMany researchers have extended regularization theory to include discontinuities. \nLet us consider the problem of interpolating noisy and sparse 1-D data (the 2-D \ngeneralization is straightforward), where the depth data $d_i$ are given on a discrete \ngrid. Associated with each lattice point is the value of the recovered surface $f_i$ \nand a binary line discontinuity $l_i$. 
When the surface is expected to be smooth \n(with a first-order, membrane-type stabilizer) except at isolated discontinuities, the \nfunctional to be minimized is given by: \n\n$$J(f, l) = \\lambda \\sum_i (f_{i+1} - f_i)^2 (1 - l_i) + \\frac{1}{2\\sigma^2} \\sum_i (d_i - f_i)^2 + \\alpha \\sum_i l_i \\qquad (1)$$ \n\nwhere $\\sigma^2$ is the variance of the additive Gaussian noise process assumed to corrupt \nthe data $d_i$, and $\\lambda$ and $\\alpha$ are free parameters. The first term implements the \npiecewise smooth constraint: if all variables, with the exception of $f_i$, $f_{i+1}$, and $l_i$, \nare held fixed and $\\lambda (f_{i+1} - f_i)^2 < \\alpha$, it is \"cheaper\" to pay the price $\\lambda (f_{i+1} - f_i)^2$ \nand set $l_i = 0$ than to pay the larger price $\\alpha$; if the gradient becomes too steep, \n$l_i = 1$, and the surface is segmented at that location. The second term, with the \nsum only including those locations i where data exist, forces the surface f to be \nclose to the measured data d. How close depends on the estimated magnitude of \nthe noise, in this case on $\\sigma^2$. The final surface f is the one that best satisfies the \nconflicting demands of piecewise smoothness and fidelity to the measured data. \nTo minimize the 2-D generalization of eq. (1), we map the functional J onto the \ncircuit shown in Fig. 2a such that the stationary voltage at every gridpoint then \ncorresponds to $f_{ij}$. The cost functional J is interpreted as electrical co-content, \nthe generalization of power for nonlinear networks. We designed a two-terminal \nnonlinear device, which we call a resistive fuse, to implement piecewise smoothness \n(Fig. 2b). If the magnitude of the voltage drop across the device is less than \n$V_T = (\\alpha / \\lambda)^{1/2}$, the fuse acts as a linear resistor with conductance $\\lambda$. \nIf $V_T$ is \nexceeded, however, the fuse breaks and the current goes to zero. The operation of \nthe fuse is fully reversible. We built a 20 by 20 pixel fuse network chip and show \nits segmentation and smoothing performance in Figure 3. 
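
The behavior of the fuse network in Section 3 can be illustrated in software for the 1-D case. The following is only a minimal sketch under stated assumptions, not the chip's analog dynamics: it uses a binary fuse (no continuation from HRES to analog fuse), a plain Gauss-Seidel relaxation of Kirchhoff's current law at each node, and a forward-fill starting point for missing samples. The function name segment_and_smooth and the parameter names lam, alpha, sigma2 and sweeps are hypothetical, chosen to mirror the symbols of eq. (1).

```python
def segment_and_smooth(d, lam=1.0, alpha=0.5, sigma2=0.5, sweeps=500):
    # d: list of depth samples; None marks a location without data (sparse input).
    g = 1.0 / (2.0 * sigma2)        # data conductance G = 1/(2 sigma^2)
    v_t = (alpha / lam) ** 0.5      # fuse threshold V_T = (alpha/lambda)^(1/2)
    n = len(d)

    # Assumed starting point: copy the data, forward-filling missing samples.
    last = next(x for x in d if x is not None)
    f = []
    for x in d:
        if x is not None:
            last = x
        f.append(last)

    for _ in range(sweeps):
        for i in range(n):
            num, den = 0.0, 0.0
            if d[i] is not None:            # battery d_i through conductance G
                num += g * d[i]
                den += g
            for j in (i - 1, i + 1):        # neighbors through intact fuses only
                if 0 <= j < n and abs(f[j] - f[i]) < v_t:
                    num += lam * f[j]
                    den += lam
            if den > 0.0:                   # zero net current at node i
                f[i] = num / den

    # A fuse is reported as broken (l_i = 1) where the final drop reaches V_T.
    breaks = [1 if abs(f[i + 1] - f[i]) >= v_t else 0 for i in range(n - 1)]
    return f, breaks


# Noisy step with one missing sample; the step lies between samples 4 and 5.
d = [0.05, -0.05, None, -0.05, 0.05, 1.05, 0.95, 1.05, 0.95, 1.05]
f, breaks = segment_and_smooth(d)
print(breaks)   # [0, 0, 0, 0, 1, 0, 0, 0, 0]
```

Because the functional is non-convex, a relaxation of this kind only reaches a local minimum that depends on the starting point. The continuation method described in the Figure 2 caption addresses this, and the sketch does not attempt to reproduce it.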
\n\n4 AUTONOMOUS VEHICLES \nOur goal, beyond the design and fabrication of analog resistive-network chips, is \nto build mobile testbeds for the evaluation of chips as well as to provide a systems \nperspective on the usefulness of certain vision algorithms. Due to the small size \nand power requirements of these chips, it is possible to utilize the vast resource of \ncommercially available toy vehicles. The advantages of toy cars over robotic vehicles \nbuilt for research are their low cost, ease of modification, high power-to-weight ratio, \navailability, and inherent robustness to the real world. Accordingly, we integrated \ntwo analog resistive-network chips designed and built in Mead's laboratory onto \nsmall toy cars controlled by a digital microprocessor (see Figure 4). \n\n(b) \n\n(c) \n\n