{"title": "An Analog Neural Network Inspired by Fractal Block Coding", "book": "Advances in Neural Information Processing Systems", "page_first": 795, "page_last": 802, "abstract": null, "full_text": "An Analog Neural Network Inspired by \n\nFractal Block Coding \n\nFernando J. Pineda \nThe Applied Physics Laboratory \nThe Johns Hopkins University \nJohns Hokins Road \nLaurel, MD 20723-6099 \n\nAndreas G. Andreou \nDept. of Electrical & Computer \nEngineering \nThe Johns Hopkins University \n34th & Charles St. \nBaltimore, MD 21218 \n\nAbstract \n\nWe consider the problem of decoding block coded data, using a physical \ndynamical system. We sketch out a decompression algorithm for fractal \nblock codes and then show how to implement a recurrent neural \nnetwork using physically simple but highly-nonlinear, analog circuit \nmodels of neurons and synapses. The nonlinear system has many fixed \npoints, but we have at our disposal a procedure to choose the parameters \nin such a way that only one solution, the desired solution, is stable. As \na partial proof of the concept, we present experimental data from a \nsmall system a 16-neuron analog CMOS chip fabricated in a 2m analog \np-well process. This chip operates in the subthreshold regime and, for \neach choice of parameters, converges to a unique stable state. Each state \nexhibits a qualitatively fractal shape. \n\n1. INTRODUCTION \n\nSometimes, a nonlinear approach is the simplest way to solve a linear problem. This is \ntrue when computing with physical dynamical systems whose natural operations are \nnonlinear. In such cases it may be expensive, in terms of physical complexity, to \nlinearize the dynamics. For example in neural computation active ion channels have \nhighly non linear input-output behaviour (see Hille 1984). Another example is \n\n\f796 \n\nFernando Pineda. Andreas G. Andreou \n\nsubthreshold CMOS VLSI technology 1. 
In both examples the physics that governs the operation of the active devices gives rise to gain elements that have exponential transfer characteristics. These exponentials result in computing structures with nonlinear dynamics. It is therefore worthwhile, from both scientific and engineering perspectives, to investigate the idea of analog computation with highly nonlinear components. \n\nThis paper explores an approach to solving a specific linear problem with analog circuits that have nonlinear transfer functions. The computational task considered here is that of fractal block code decompression (see e.g. Jacquin, 1989). \n\nThe conventional approach to decompressing fractal codes is essentially an exercise in solving a high-dimensional sparse linear system of equations by using a relaxation algorithm. The relaxation algorithm is performed by iteratively applying an affine transformation to a state vector. The iteration yields a sequence of state vectors that converges to a vector of decoded data. The approach taken in this paper is based on the observation that one can construct a physically simple nonlinear dynamical system whose unique stable fixed point coincides with the solution of the sparse linear system of equations. \n\nIn the next section we briefly summarize the basic ideas behind fractal block coding. This is followed by a description of an analog circuit with physically simple nonlinear neurons. We show how to set the input voltages of the network so that we can program the position of the stable fixed point. Finally, we present experimental results obtained from a test chip fabricated in a 2 µm CMOS process. \n\n2. FRACTAL BLOCK CODING IN A NUTSHELL \n\nLet the N-dimensional state vector I represent a one-dimensional curve sampled at N points. An affine transformation of this vector is simply a transformation of the form I' = WI + B, where W is an NxN matrix and B is an N-component vector. 
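\n\nFor concreteness, such an affine map can be iterated numerically. The sketch below uses an arbitrary illustrative 4x4 matrix W (scaled so all its eigenvalues have magnitude 0.5) and an arbitrary offset B; none of the values come from the paper:

```python
import numpy as np

# Toy affine iteration I <- W I + B.  W is scaled so its spectral radius is
# 0.5 (< 1), which guarantees convergence to the unique fixed point
# I* = (1 - W)^(-1) B from any starting vector.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
W = 0.5 * A / np.max(np.abs(np.linalg.eigvals(A)))
B = rng.standard_normal(4)

I = rng.standard_normal(4)          # arbitrary initial state
for _ in range(200):
    I = W @ I + B

I_star = np.linalg.solve(np.eye(4) - W, B)
print(np.allclose(I, I_star))       # prints True
```

Because the spectral radius is below one, the same fixed point is reached from any initial state, which is the property fractal block coding exploits. \n\n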
This transformation can be iterated to produce a sequence of vectors I(0), ..., I(n). The sequence converges to a unique final state I* that is independent of the initial state I(0) if the maximum eigenvalue magnitude λmax of the matrix W satisfies λmax < 1. The uniqueness of the final state implies that to transmit the state I* to a receiver, we can either transmit I* directly, or we can transmit W and B and let the receiver perform the iteration to generate I*. In the latter case we say that W and B constitute an encoding of the state I*. For this encoding to be useful, the amount of data needed to transmit W and B must be less than the amount of data needed to transmit I*. This is the case when W and B are sparse and parameterized, and when the total number of bits needed to transmit these parameters is less than the total number of bits needed to transmit the uncompressed state I*. \n\n¹ We consider subthreshold analog VLSI (Mead, 1989; Andreou and Boahen, 1994). A simple subthreshold model for NFETs is Ids = I0(nfet) exp(κ vgb)(exp(-vsb) - exp(-vdb)), where κ ≈ 0.67 and I0(nfet) = 9.7 × 10^-18 A. The voltage differences vgb, vsb, and vdb are in units of the thermal voltage, Vth = 0.025 V. We use a corresponding expression for PFETs of the form Ids = I0(pfet) exp(-κ vgb)(exp(vsb) - exp(vdb)), where I0(pfet) = 3.8 × 10^-18 A. \n\nFractal block coding is a special case of the above approach. It amounts to choosing a blocked structure for the matrix W. This structure forces large-scale features to be mapped into small-scale features. The result is a steady state I* that represents a curve with self-similar (actually self-affine) features. As a concrete example of such a structure, consider the following transformation of the state I: \n\nI'_i = wL I_{2i+1} + bL for 0 ≤ i ≤ N/2 - 1 \nI'_i = wR I_{2i-N} + bR for N/2 ≤ i ≤ N - 1 (1) \n\nThis transformation has two blocks. 
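\n\nA minimal numeric sketch of iterating this two-block map for N = 16, assuming the odd/even subsampling described in the text; the weights wL, wR and biases bL, bR are illustrative values with |wL|, |wR| < 1:

```python
import numpy as np

N = 16
wL, bL = 0.45, 0.2     # illustrative parameters, |w| < 1 for convergence
wR, bR = 0.45, 1.0

def decode_step(I):
    """One application of the two-block affine map of equation (1)."""
    Ip = np.empty_like(I)
    Ip[: N // 2] = wL * I[1::2] + bL     # top half from odd samples of I
    Ip[N // 2 :] = wR * I[0::2] + bR     # bottom half from even samples
    return Ip

I = np.zeros(N)
for _ in range(100):
    I = decode_step(I)

# At the fixed point the state no longer changes; each half of the curve is
# an affine copy of the subsampled whole -- the self-affine property.
print(np.allclose(I, decode_step(I)))   # prints True
```

After convergence, each half of the decoded curve differs from the subsampled whole only by an affine transformation, which is what makes the steady state look fractal. \n\n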
The transformation of the first N/2 components of I depends on the parameters wL and bL, while the transformation of the second N/2 components depends on the parameters wR and bR. Consequently, just four parameters completely specify this transformation. It can be expressed as a single affine transformation as follows: \n\n[I'_0, ..., I'_{N/2-1}, I'_{N/2}, ..., I'_{N-1}]^T = W [I_0, ..., I_{N-1}]^T + [bL, ..., bL, bR, ..., bR]^T (2) \n\nwhere each row of the sparse matrix W contains a single nonzero entry (wL in the top N/2 rows, wR in the bottom N/2 rows) in the column selected by equation (1). The top and bottom halves of I' depend on the odd and even components of I, respectively. This subsampling causes features of size l to be mapped into features of size l/2. A subsampled copy of the state I with transformed intensities is copied into the top half of I', and similarly a subsampled copy of the state I with transformed intensities is copied into the bottom half of I'. If this transformation is iterated, the sequence of transformed vectors will converge provided the eigenvalues determined by wL and wR all have magnitude less than one (i.e., |wL| < 1 and |wR| < 1). \n\nAlthough this toy example has just four free parameters and is thus too trivial to be useful for actual compression applications, it does suffice to generate state vectors with fractal properties, since at steady state the top and bottom halves of I* differ from the entire curve by an affine transformation. \n\nIn this paper we will not describe how to solve the inverse problem, which consists of finding a parameterized affine transformation that produces a given final state I*. We note, however, that it is a special (and simpler) case of the recurrent network training problem, since the problem is linear, has no hidden units, and has only one fixed point. The reader is referred to (Pineda, 1988) for a least-squares algorithm in the context of neural nets or to (Monro and Dudbridge, 1992) for a least-squares algorithm in the context of coding. \n\n3. 
A CMOS NEURAL NETWORK MODEL \n\nNow that we have described the salient aspects of the fractal decompression problem, we turn to the problem of implementing an analog neural network whose nonlinear dynamics converges to the same fixed point as the linear system. Nonlinearity arises because we make no special effort to linearize the gain elements (controlled conductances and transconductances) of the implementation medium. In this section we first describe a simple neuron. Then we analyze the dynamics of a network composed of such neurons. Finally, we describe how to program the fixed point in the actual physical network. \n\n3.1 The Analog Neuron \n\nWe would like to create a neuron model that calculates the transformation I(out) = a I(in) + b. Consider the circuit shown in figure 1. It has three functional sections which compute by adding and subtracting currents, with voltages \"log\" coded; this is the essence of the \"current-mode\" approach in circuit design (Andreou et al., 1994). The first section receives an input voltage from a presynaptic neuron, converts it into a current I(in), and multiplies it by a weight a. The second section adds and subtracts the bias current b. The last section converts the output current into an output voltage and transmits it to the next neuron in the network. Since the transistors have exponential transfer characteristics, this voltage is logarithmically coded. \n\nThe parameters a and b are set by external voltages. The parameter a is set by a single external voltage va, while the bias parameter b = b(-) - b(+) is set by two external voltages vb(+) and vb(-). Two voltages are used for b to account for both positive and negative bias values, since b(-) > 0 and b(+) > 0. 
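\n\nThe exponential transfer characteristics that make these sections nonlinear come from the subthreshold device model quoted in footnote 1; a toy evaluation (the bias voltages in the example are arbitrary, in units of the thermal voltage):

```python
import math

# Footnote-1 subthreshold model; voltages are in units of Vth = 0.025 V.
KAPPA = 0.67
I0_NFET = 9.7e-18     # A
I0_PFET = 3.8e-18     # A

def ids_nfet(vgb, vsb, vdb):
    return I0_NFET * math.exp(KAPPA * vgb) * (math.exp(-vsb) - math.exp(-vdb))

def ids_pfet(vgb, vsb, vdb):
    return I0_PFET * math.exp(-KAPPA * vgb) * (math.exp(vsb) - math.exp(vdb))

# Exponential gain: one thermal voltage more on the gate multiplies the
# drain current by exp(kappa) ~ 1.95.
gain = ids_nfet(21.0, 0.0, 40.0) / ids_nfet(20.0, 0.0, 40.0)
print(round(gain, 2))   # prints 1.95
```

Each additional thermal voltage on the gate multiplies the drain current by exp(κ) ≈ 1.95, so current-mode addition and subtraction are cheap, but the voltage-to-current gain is strongly nonlinear. \n\n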
\n\nFigure 1. The analog neuron has three sections. \n\nTo derive the dynamical equations of the neuron, it is necessary to add up all the currents and invoke Kirchhoff's current law, which requires that \n\nI(out) - a I(in) + b(+) - b(-) = Ic. (3) \n\nIf we now assume a simple subthreshold model for the behavior of the NFETs and PFETs in the neuron, we can obtain the following expression for the current across the capacitor: \n\n-(Q / I(out)) dI(out)/dt = Ic (4) \n\nwhere Q = CκVth determines the characteristic time scale of the neuron². It immediately follows from the last two expressions that the dynamics of a single neuron is determined by the equation \n\nQ dI(out)/dt = -I(out) (I(out) - a I(in) - b), (5) \n\nwhere b = b(-) - b(+). This equation appears to have a quadratic nonlinearity on the r.h.s. In fact, the nonlinearity is even more complicated since the coefficients a, b(+), and b(-) are not constants, but depend on I(out) (through v(out)). Application of the simple subthreshold model results in a multiplier gain that is a function of v(out) (and hence I(out)) as well as va; it is given by the expression a(va, v(out)) of equation (6). ... some of the eigenvalues will necessarily be positive and the spurious solutions will be unstable. 
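\n\nFor intuition, equation (5) can be integrated numerically for a 16-neuron network, holding each a_i and b_i constant (an idealization; in the circuit they depend on I(out)). The connection map j(i) follows the odd/even subsampling of section 2, and the parameter values are illustrative:

```python
import numpy as np

# Euler integration of equation (5); time is measured in units of Q.
N = 16
j = np.array([2*i + 1 if i < N // 2 else 2*i - N for i in range(N)])
a = np.full(N, 0.45)                           # |a_i| < 1
b = np.where(np.arange(N) < N // 2, 0.2, 1.0)  # b_i > 0

I = np.full(N, 0.5)                            # positive initial currents
dt = 0.01
for _ in range(50_000):
    I = I + dt * (-I * (I - a * I[j] - b))

# The stable state solves the LINEAR system I_i = a_i I_j(i) + b_i,
# i.e. the same fixed point the affine decoding iteration reaches.
print(np.allclose(I, a * I[j] + b))            # prints True
```

With b_i > 0 and |a_i| < 1 the trajectory avoids the spurious fixed points (including I = 0) and settles on the solution of the linear decoding equations. \n\n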
To summarize the above discussion, we have shown that by choosing b_i > 0 and |a_i| < 1 for all i, we can make the desired fixed point stable and the spurious fixed points unstable. Note that a sufficient condition for b_i > 0 is b_i(+) = 0. \n\nIt remains to show that the system must converge to the desired fixed point, i.e., that the system cannot oscillate or wander chaotically. To do this we consider the connectivity of the network we implemented in our test chip, shown schematically in figure 2. The first eight neurons receive input from the odd-numbered neurons, while the second eight neurons receive input from the even-numbered neurons. The neurons on the left-hand side all share the weight wL, while the neurons on the right share the weight wR. By tracing the connections, we find that there are two independent loops of neurons: loop #1 = {0, 8, 12, 14, 15, 7, 3, 1} and loop #2 = {2, 9, 4, 10, 13, 6, 11, 5}. \n\nFigure 2. The connection topology for the test chip is determined by the matrix of equation (1). The neurons are labeled 0-15. \n\nBy inspecting each loop, we see that it passes through either the left- or right-hand range an even number of times. Hence, if there are any inhibitory weights in a loop, there must be an even number of them. This is the \"even loop criterion\", and it suffices to prove that the network is globally asymptotically stable (Hirsch, 1987). \n\n3.3 Programming the Fixed Point \n\nThe nonlinear circuit of the previous section converges to a fixed point which is the solution of the following system of transcendental equations: \n\nI*_i - a_i(I*_i, va) I*_{j(i)} - b_i(-)(I*_i, vb(-)) = 0 (12) \n\nwhere the coefficients a_i and b_i are given by equations (6) and (7b), respectively. 
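\n\nThe two loops can be recovered mechanically from the connection map j(i), i.e. neuron i receives input from neuron j(i), with j(i) = 2i + 1 in the first block and 2i - N in the second, per the odd/even subsampling of section 2. A small consistency check (not code from the paper):

```python
# Recover the loops of figure 2 from the connection map j(i).
N = 16
j = [2*i + 1 if i < N // 2 else 2*i - N for i in range(N)]  # i listens to j(i)

def loops(j):
    """Cycles of the graph i <- j(i); here j happens to be a permutation."""
    seen, cycles = set(), []
    for start in range(len(j)):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = j[i]
        cycles.append(cycle)
    return cycles

found = [set(c) for c in loops(j)]
print(found == [{0, 8, 12, 14, 15, 7, 3, 1}, {2, 9, 4, 10, 13, 6, 11, 5}])
# prints True
```

Since j is a permutation here, every neuron lies on exactly one loop, and each loop visits the left and right halves an even number of times, satisfying the even loop criterion. \n\n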
\n\nSimilarly, the iterated affine transformation converges to the solution of the following linear equations: \n\nI*_i - A_i I*_{j(i)} - B_i = 0 (13) \n\nwhere the coefficients {A_i, B_i} and the connections j(i) are obtained by solving the approximate inverse problem with the additional constraints that b_i > 0 and |a_i| < 1 for all i. The requirement that the fixed points of the two systems be identical results in the conditions \n\nA_i = a_i(I*_i, va),   B_i = b_i(-)(I*_i, vb(-)) (14) \n\nThese equations can be solved for the required input voltages va and vb(-). Thus we are able to construct a nonlinear dynamical system that converges to the same fixed point as a linear system. For this programming method to work, of course, the subthreshold model we have used to characterize the network must accurately model the physical properties of the neural network. \n\n4. PRELIMINARY RESULTS \n\nAs a first step towards realizing a working system, we fabricated a Tiny chip containing 16 neurons arranged in two groups of eight. The topology is the same as shown in figure 2. The neurons are similar to those in figure 1, except that the bias term in each block of 8 neurons has the form b = k b1(-) + (7 - k) b2(-), where 0 ≤ k ≤ 7 is the label of a particular neuron within a block. This form increases the complexity of the neurons, but also allows us to represent ramps more easily (see figure 3). \n\nWe fabricated the chip through MOSIS in a 2 µm p-well CMOS process. A switching layer allows us to change the connection topology at run time. One of the four possible configurations corresponds to the topology of figure 2. Six external voltages (three for each block of eight neurons) parameterize the fixed points of the network; these are controlled by potentiometers. Multiplexing circuitry included on the chip selects which neuron output is to be amplified by a sense-amp and routed off-chip. 
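\n\nThe modified bias form can be sketched directly; b1 and b2 below are hypothetical bias-current values standing in for the two per-block bias inputs:

```python
# Hypothetical bias currents for one block of eight neurons.  With
# b = k*b1 + (7 - k)*b2 the bias sweeps linearly with the neuron label k,
# so a block can hold a ramp at steady state.
b1, b2 = 1.0, 0.25
biases = [k * b1 + (7 - k) * b2 for k in range(8)]

steps = [biases[k + 1] - biases[k] for k in range(7)]
print(biases[0], biases[7])   # prints 1.75 7.0 -- endpoints 7*b2 and 7*b1
```

The constant step b1 - b2 between neighboring neurons is what makes the ramps of figure 3 easy to represent. \n\n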
The neurons can be addressed individually by a 4-bit neuron address. The addressing and analog-to-digital conversion are performed by a Motorola 68HC11A1 microprocessor. \n\nWe have operated the chip at 5 volts and at 2.5 volts. Figure 3 shows the scanned steady-state output of one of the test chips for a particular choice of input parameters with Vdd = 5 volts. The curve in figure 3 exhibits the qualitatively self-similar features of a recursively generated object: we are able to see three generations of a ramp. At 2.5 volts we see a very similar curve. We find that the chip draws 16.3 µA at 2.5 volts. This corresponds to a steady-state power dissipation of 41 µW. Simulations indicate that the chip is operating in the subthreshold regime when Vdd = 2.5 volts. Simulations also indicate that the chip settles in less than one millisecond. We were unable to perform quantitative measurements with the first chip because of several layout errors. On the other hand, we have experimentally verified that the network is indeed stable and that the network produces qualitative fractals. We explored the parameter space informally. At no time did we encounter anything but the desired solutions. \n\nFigure 3. D/A output for chip #3 for a particular set of input voltages. \n\nWe have already fabricated a larger design without the layout problems of the prototype. This second design has 32 pixels and a richer set of permitted topologies. We expect to make quantitative measurements with this second design. In particular, we hope to use it to decompress an actual block code. \n\nAcknowledgements \n\nThe work described here is funded by APL IR&D as well as a grant from the National Science Foundation, ECS-9313934; Paul Werbos is the monitor. 
The authors would like to thank Robert Jenkins, Kim Strohbehn, and Paul Furth for many useful conversations and suggestions. \n\nReferences \n\nAndreou, A.G. and Boahen, K.A., Neural Information Processing I: The Current-Mode Approach, in Analog VLSI: Signal and Information Processing (eds. M. Ismail and T. Fiez), McGraw-Hill, New York, Chapter 6 (1994). \n\nHille, B., Ionic Channels of Excitable Membranes, Sinauer Associates, Sunderland, MA (1984). \n\nHirsch, M., Convergence in Neural Nets, Proceedings of the IEEE ICNN, San Diego, CA (1987). \n\nJacquin, A.E., A Fractal Theory of Iterated Markov Operators with Applications to Digital Image Coding, Ph.D. Dissertation, Georgia Institute of Technology (1989). \n\nMead, C., Analog VLSI and Neural Systems, Addison-Wesley (1989). \n\nMonro, D.M. and Dudbridge, F., Fractal block coding of images, Electronics Letters, 28, pp. 1053-1055 (1992). \n\nPineda, F.J., Dynamics and Architecture for Neural Computation, Journal of Complexity, 4, 216-245 (1988). \n", "award": [], "sourceid": 957, "authors": [{"given_name": "Fernando", "family_name": "Pineda", "institution": null}, {"given_name": "Andreas", "family_name": "Andreou", "institution": null}]}