{"title": "Phasor Neural Networks", "book": "Neural Information Processing Systems", "page_first": 584, "page_last": 591, "abstract": null, "full_text": "PHASOR NEURAL NETWORKS \n\nAndr\u00e9 J. Noest, N.I.B.R., NL-1105 AZ Amsterdam, The Netherlands. \n\nABSTRACT \n\nA novel network type is introduced which uses unit-length 2-vectors for local variables. As an example of its applications, associative memory nets are defined and their performance analyzed. Real systems corresponding to such 'phasor' models can be, e.g., (neuro)biological networks of limit-cycle oscillators or optical resonators that have a hologram in their feedback path. \n\nINTRODUCTION \n\nMost neural network models use either binary local variables or scalars combined with sigmoidal nonlinearities. Rather awkward coding schemes have to be invoked if one wants to maintain linear relations between the local signals being processed in, e.g., associative memory networks, since the nonlinearities necessary for any nontrivial computation act directly on the range of values assumed by the local variables. In addition, there is the problem of representing signals that take values from a space with a different topology, e.g. that of the circle, sphere, torus, etc. Practical examples of such a signal are the orientations of edges or the directions of local optic flow in images, or the phase of a set of (sound or EM) waves as they arrive on an array of detectors. Apart from the fact that 'circular' signals occur in technical as well as biological systems, there are indications that some parts of the brain (e.g. olfactory bulb, cf. Dr. B. Baird's contribution to these proceedings) can use limit-cycle oscillators formed by local feedback circuits as functional building blocks, even for signals without circular symmetry. 
With respect to technical implementations, I had speculated before the conference whether it could be useful to code information in the phase of the beams of optical neurocomputers, avoiding slow optical switching elements and using only (saturating) optical amplification and a hologram encoding the (complex) 'synaptic' weight factors. At the conference, I learnt that Prof. Dana Anderson had independently developed an optical device (cf. these proceedings) that basically works this way, at least in the slow-evolution limit of the dynamic hologram. Hopefully, some of the theory that I present here can be applied to his experiment. In turn, such implementations call for interesting extensions of the present models. \n\n\u00a9 American Institute of Physics 1988 \n\nBASIC ELEMENTS OF GENERAL PHASOR NETWORKS \n\nHere I study perhaps the simplest non-scalar network by using unit-length 2-vectors (phasors) as continuous local variables. The signals processed by the network are represented in the relative phase angles. Thus, the nonlinearities (unit-length 'clipping') act orthogonally to the range of the variables coding the information. The behavior of the network is invariant under any rigid rotation of the complete set of phasors, representing an arbitrary choice of a global reference phase. Statistical physicists will recognize the phasor model as a generalization of O(2)-spin models to include vector-valued couplings. \n\nAll 2-vectors are treated algebraically as complex numbers, writing $|x|$ for the length, $/x/$ for the phase-angle, and $\\bar{x}$ for the complex conjugate of a 2-vector $x$. A phasor network then consists of $N \\gg 1$ phasors $s_i$, with $|s_i| = 1$, interacting via couplings $c_{ij}$, with $c_{ii} = 0$. The $c_{ij}$ are allowed to be complex-valued quantities. For optical implementations this is clearly a natural choice, but it may seem less so for biological systems. 
However, if the coupling between two limit-cycle oscillators with frequency $f$ is mediated via a path having propagation delay $d$, then that coupling in fact acquires a phase shift of $2 \\pi f d$ radians. Thus, complex couplings can represent such systems more faithfully than the usual models, which neglect propagation delays altogether. Only 2-point couplings are treated here, but multi-point couplings $c_{ijk}$, etc., can be treated similarly. \n\nThe dynamics of each phasor depends only on its local field \n\n$$h_i = \\frac{1}{z} \\sum_j c_{ij} s_j + n_i ,$$ \n\nwhere $z$ is the number of inputs $c_{ij} \\neq 0$ per cell and $n_i$ is a local noise term (complex and Gaussian). Various dynamics are possible, and yield largely similar results: \n\nContinuous-time, parallel evolution (\"type A\"): \n\n$$\\frac{d}{dt} \\left( /s_i/ \\right) = |h_i| \\sin \\left( /h_i/ - /s_i/ \\right)$$ \n\nDiscrete-time updating: \n\n$$s_i(t+dt) = h_i / |h_i| ,$$ \n\neither serially in random $i$-sequence (\"type B\"), or in parallel for all $i$ (\"type C\"). The natural time scale for type-B dynamics is obtained by scaling the discrete time-interval $dt$ as $\\sim 1/N$; type-C dynamics has $dt = 1$. \n\nLYAPUNOV FUNCTION (alias \"ENERGY\", or \"HAMILTONIAN\") \n\nIf one limits the attention temporarily to purely deterministic ($n_i = 0$) models, then the question suggests itself whether a class of couplings exists for which one can easily find a Lyapunov function, i.e. a function of the network variables that is monotonic under the dynamics. A well-known example is the 'energy' of the binary and scalar Hopfield models [1] with symmetric interactions. It turns out that a very similar function exists for phasor networks with type-A or B dynamics and a Hermitian matrix of couplings: \n\n$$-H = \\sum_i \\bar{s}_i h_i = \\frac{1}{z} \\sum_{i,j} \\bar{s}_i c_{ij} s_j$$ \n\nHermiticity ($c_{ij} = \\bar{c}_{ji}$) makes $H$ real-valued and non-increasing in time. 
This can be shown as follows, e.g. for the serial dynamics (type B). Suppose, without loss of generality, that phasor $i=1$ is updated. Then \n\n$$-z H = \\sum_{i>1} \\bar{s}_i c_{i1} s_1 + \\bar{s}_1 \\sum_{i>1} c_{1i} s_i + \\sum_{i,j>1} \\bar{s}_i c_{ij} s_j = \\sum_{i>1} \\bar{s}_i c_{i1} s_1 + z \\bar{s}_1 h_1 + \\mathrm{constant} .$$ \n\nWith Hermitian couplings, $H$ becomes real-valued, and one also has \n\n$$\\sum_{i>1} \\bar{s}_i c_{i1} s_1 = s_1 \\sum_{i>1} \\bar{c}_{1i} \\bar{s}_i = z s_1 \\bar{h}_1 .$$ \n\nThus, $-H - \\mathrm{constant} = \\bar{s}_1 h_1 + s_1 \\bar{h}_1 = 2 \\mathrm{Re}(\\bar{s}_1 h_1)$. Clearly, $H$ is minimized with respect to $s_1$ by $s_1(t+1) = h_1 / |h_1|$. Type-A dynamics has the same Lyapunov function, but type C is more complex. The existence of Hermitian interactions and the corresponding energy function greatly simplifies the understanding and design of phasor networks, although non-Hermitian networks can still have a Lyapunov function, and even networks for which such a function is not readily found can be useful, as will be illustrated later. \n\nAN APPLICATION: ASSOCIATIVE MEMORY \n\nA large class of collective computations, such as optimisations and content-addressable memory, can be realised with networks having an energy function. The basic idea is to define the relevant penalty function over the solution-space in the form of the generic 'energy' of the net, and simply let the network relax to minima of this energy. As a simple example, consider an associative memory built within the framework of Hermitian phasor networks. \n\nIn order to store a set of patterns in the network, i.e. to make a set of special states (at least approximately) into attractive fixed points of the dynamics, one needs to choose an appropriate set of couplings. One particularly simple way of doing this is via the phasor-analog of \"Hebb's rule\" (note the Hermiticity): \n\n$$c_{ij} = \\sum_{k=1}^{p} s_i^{(k)} \\bar{s}_j^{(k)}$$ \n\nThe rule is understood to apply only to the input-set of each $i$. 
Here $s_i^{(k)}$ is phasor $i$ in learned pattern $k$. \n\nSuch couplings should be realisable as holograms in optical networks, but they may seem unrealistic in the context of biological networks of oscillators since the phase-shift (e.g. corresponding to a delay) of a connection may not be changeable at will. However, the required coupling can still be implemented naturally if, e.g., a few paths with different fixed delays exist between pairs of cells. The synapse in each path then simply becomes the projection of the complex coupling on the direction given by the phase of its path, i.e. it is just a classical Hebb-synapse that computes the correlation of its pre- and post-synaptic (imposed) signals, which now are phase-shifted versions of the phasors $s_i^{(k)}$. The required complex $c_{ij}$ are then realised as the vector sum over at least two signals arriving via distinct paths with corresponding phase-shifts and real-valued synapses. Two paths suffice if they have orthogonal phase-shifts, but random phases will do as well if there are a reasonable number of paths. \n\nWe need to have a concise way of expressing how 'near' any state of the net is to one or more of the stored patterns. A natural way of doing this is via a set of $p$ order parameters called \"overlaps\": \n\n$$M_k = \\left| \\frac{1}{N} \\sum_{i}^{N} s_i \\bar{s}_i^{(k)} \\right| ; \\quad 1 \\le k \\le p .$$ \n\nNote the constraint on the $p$ overlaps, $\\sum_k^p M_k^2 \\le 1$, if all the patterns are orthogonal, or merely random in the limit $N \\to \\infty$. This will be assumed from now on. Also, one sees at once that the whole behaviour of the network does not depend on any rigid rotation of all phasors over some angle, since $H$, $M_k$, $c_{ij}$ and the dynamics are invariant under multiplication of all $s_i$ by a fixed phasor: $s_i' = S s_i$ with $|S| = 1$. 
\n\nLet us find the performance at low loading: $N, p, z \\to \\infty$, with $p/z \\to 0$ and zero local noise. Also assume an initial overlap $m > 0$ with only one pattern, say with $k=1$ (written $m_1$ below). Then the local field is \n\n$$h_i = \\frac{1}{z} \\sum_{j \\neq i} s_j \\sum_{k=1}^{p} s_i^{(k)} \\bar{s}_j^{(k)} = h_i^{(1)} + h_i^{*} ,$$ \n\nwhere \n\n$$h_i^{(1)} = \\frac{1}{z} s_i^{(1)} \\sum_{j \\neq i} \\bar{s}_j^{(1)} s_j = m_1 s_i^{(1)} S + O(1/\\sqrt{z})$$ \n\nwith $S$ independent of $i$ and $|S| = 1$, and \n\n$$h_i^{*} = \\frac{1}{z} \\sum_{k=2}^{p} s_i^{(k)} \\sum_{j \\neq i} \\bar{s}_j^{(k)} s_j = O\\left( \\sqrt{(p-1)/z} \\right) .$$ \n\nBoth correction terms vanish in this limit, so each updated phasor equals $s_i^{(1)} S$. Thus, perfect recall ($M_1 = 1$) occurs in one 'pass' at loadings $p/z \\to 0$. \n\nEXACTLY SOLVABLE CASE: SPARSE and ASYMMETRIC couplings \n\nAlthough it would be interesting to develop the full thermodynamics of Hermitian phasor networks with $p$ and $z$ of order $N$ (analogous to the analysis of the finite-T Hopfield model by the teams of Amit [2] and van Hemmen [3]), I will analyse here instead a model with sparse, asymmetric connectivity, which has the great advantages of being exactly solvable with relative ease, and of being arguably more realistic biologically and more easily scalable technologically. In neurobiological networks a cell has up to $z \\approx 10^4$ asymmetric connections, whereas $N \\approx 10^{10}$ to $10^{11}$. This probably has the same reason as applies to most VLSI chips, namely to alleviate wiring problems. For my present purposes, the theoretical advantage of getting some exact results is of primary interest [4]. \n\nSuppose each cell has $z$ incoming connections from randomly selected other cells. The state of each cell at time $t$ depends on at most $z^t$ cells at time $t=0$. Thus, if $z^t \\ll N^{1/2}$ and $N$ is large, then the respective trees of 'ancestors' of any pair of cells have no cells in common [4]. In particular, if $z \\sim (\\log N)^x$, for any finite $x$, then there are no common ancestors for any finite time $t$ in the limit $N \\to \\infty$. 
For fundamental information-theoretic reasons, one can hope to be able to store $p$ patterns with $p$ at most of order $z$ for any sort of 2-point couplings. Important questions to be settled are: What are the accuracy and speed of the recall process, and how large are the basins of the attractors representing recalled patterns? \n\nTake again initial conditions ($t=0$) with, say, $m(t) = M_1 > M_{k>1} = 0$. Allowing again local random Gaussian (complex) noise $n_i$, the local fields become, in now familiar notation, $h_i = h_i^{(1)} + h_i^{*} + n_i$. As in the previous section, the $h_i^{(1)}$ term consists of the 'signal' $m(t) s_i^{(1)}$ (modulo the rigid rotation $S$) and a random term of variance at most $1/z$. For $p \\sim z$, the $h_i^{*}$ term becomes important. Being sums of $z(p-1)$ phasors oriented randomly relative to the signal, the $h_i^{*}$ are independent Gaussian zero-mean 2-vectors with variance $(p-1)/z$, as $p$, $z$ and $N \\to \\infty$. Finally, let the local noises $n_i$ have variance $r^2$. Then the distribution of the $s_i(t+1)$ phasors can be found in terms of the signal $m(t)$ and the total variance $a = (p/z) + r^2$ of the random $h_i^{*} + n_i$. After somewhat tedious algebraic manipulations (to be reported in detail elsewhere) one obtains the dynamic behaviour of $m(t)$: \n\n$$m(t+1) = F(m(t), a)$$ \n\nfor discrete parallel (type-C) dynamics, and \n\n$$\\frac{d}{dt} m(t) = F(m(t), a) - m(t)$$ \n\nfor type-A or type-B dynamics, where the function \n\n$$F(m,a) = \\frac{m}{4 \\sqrt{\\pi a}} \\int_{-\\pi}^{+\\pi} dx (1 + \\cos 2x) \\exp\\left[ -(m \\sin x)^2 / a \\right] \\left( 1 + \\mathrm{erf}\\left[ (m \\cos x) / \\sqrt{a} \\right] \\right) .$$ \n\nThe attractive fixed points $M^{*}(a) = F(M^{*}, a)$ represent the retrieval accuracy when the loading-plus-noise factor equals $a$. See figure 1. \n\nFor $a \\ll 1$ one obtains the expansion $1 - M^{*}(a) = a/4 + 3a^2/32 + O(a^3)$. \n\nThe recall solutions vanish continuously as $M^{*} \\sim (a_c - a)^{1/2}$ at $a_c = \\pi/4$. 
\n\nOne also obtains (at any $t$) the distribution of the phase scatter of the phasors around the ideal values occurring in the stored pattern: \n\n$$P(/u_i/) = \\frac{1}{2\\pi} \\exp(-m^2/a) \\left( 1 + \\sqrt{\\pi} L \\exp(L^2) \\left( 1 + \\mathrm{erf}(L) \\right) \\right) ,$$ \n\nwhere $L = (m/\\sqrt{a}) \\cos(/u_i/)$, and $u_i = s_i \\bar{s}_i^{(k)}$ (modulo $S$). \n\nUseful approximations for the high, respectively low $M$ regimes are: \n\n$M \\gg \\sqrt{a}$: $P(/u_i/) \\approx \\frac{M}{\\sqrt{\\pi a}} \\exp\\left[ -(M /u_i/)^2 / a \\right]$ ; $| /u_i/ | \\ll \\pi/2$ \n\n$M \\ll \\sqrt{a}$: $P(/u_i/) \\approx \\frac{1}{2\\pi} (1 + L \\sqrt{\\pi})$ \n\nFigure 1. RETRIEVAL-ERROR and BASIN OF ATTRACTION versus LOADING + NOISE. (Horizontal axis: $a = p/z + r^2$, from 0.00 to 1.00.) \n\nDISCUSSION \n\nIt has been shown that the usual binary or scalar neural networks can be generalized to phasor networks, and that the general structure of the theoretical analysis for their use as associative memories can be extended accordingly. This suggests that many of the other useful applications of neural nets (back-prop, etc.) can also be generalized to a phasor setting. This may be of interest both from the point of view of solving problems naturally posed in such a setting, as well as from that of enabling a wider range of physical implementations, such as networks of limit-cycle oscillators, phase-encoded optics, or maybe even Josephson-junctions. \n\nThe performance of phasor networks turns out to be roughly similar to that of the scalar systems; the maximum capacity $p/z = \\pi/4$ for phasor nets is slightly larger than its value $2/\\pi$ for binary nets, but there is a seemingly faster growth of the recall error $1-M$ at small $a$ (linear for phasors, against $\\exp(-1/(2a))$ for binary nets). However, the latter measures cannot be compared directly since they stem from quite different order parameters. If one reduces recalled phasor patterns to binary information, performance is again similar. \n\nFinally, the present methods and results suggest several roads to further generalizations, some of which may be relevant with respect to natural or technical implementations. The first class of these involves local variables ranging over the $k$-sphere with $k>1$. The other generalizations involve breaking the O(n) (here $n=2$) symmetry of the system, either by forcing the variables to discrete positions on the circle ($k$-sphere), and/or by taking the interactions between two variables to be a more general function of the angular distance between them. Such models are now under development. \n\nREFERENCES \n\n1. J.J. Hopfield, Proc. Nat. Acad. Sci. USA 79, 2554 (1982) and idem, Proc. Nat. Acad. Sci. USA 81, 3088 (1984). \n2. D.J. Amit, H. Gutfreund and H. Sompolinsky, Ann. Phys. 173, 30 (1987). \n3. D. Grensing, R. Kuhn and J.L. van Hemmen, J. Phys. A 20, 2935 (1987). \n4. B. Derrida, E. Gardner and A. Zippelius, Europhys. Lett. 4, 167 (1987). \n\n", "award": [], "sourceid": 90, "authors": [{"given_name": "Andr\u00e9", "family_name": "Noest", "institution": null}]}