{"title": "REFLEXIVE ASSOCIATIVE MEMORIES", "book": "Neural Information Processing Systems", "page_first": 495, "page_last": 504, "abstract": "", "full_text": "495 \n\nREFLEXIVE ASSOCIATIVE MEMORIES \n\nLaguna Research Laboratory, Fallbrook, CA 92028-9765 \n\nHendrlcus G. Loos \n\nABSTRACT \n\nIn the synchronous discrete model, the average memory capacity of \nbidirectional associative memories (BAMs) is compared with that of \nHopfield memories, by means of a calculat10n of the percentage of good \nrecall for 100 random BAMs of dimension 64x64, for different numbers \nof stored vectors. The memory capac1ty Is found to be much smal1er than \nthe Kosko upper bound, which Is the lesser of the two dimensions of the \nBAM. On the average, a 64x64 BAM has about 68 % of the capacity of the \ncorresponding Hopfield memory with the same number of neurons. Ortho(cid:173)\nnormal coding of the BAM Increases the effective storage capaCity by \nonly 25 %. The memory capacity limitations are due to spurious stable \nstates, which arise In BAMs In much the same way as in Hopfleld \nmemories. Occurrence of spurious stable states can be avoided by \nreplacing the thresholding in the backlayer of the BAM by another \nnonl1near process, here called \"Dominant Label Selection\" (DLS). The \nsimplest DLS is the wlnner-take-all net, which gives a fault-sensitive \nmemory. Fault tolerance can be improved by the use of an orthogonal or \nunitary transformation. An optical application of the latter is a Fourier \ntransform, which is implemented simply by a lens. \n\nI NTRODUCT ION \n\nA reflexive associative memory, also called bidirectional associa(cid:173)\n\ntive memory, is a two-layer neural net with bidirectional connections \nbetween the layers. This architecture is implied by Dana Anderson's \noptical resonator 1, and by similar configurations2,3. Bart KoSk04 coined \nthe name \"Bidirectional Associative Memory\" (BAM), and Investigated \nseveral basic propertles4- 6. 
We are here concerned with the memory capacity of the BAM, with the relation between BAMs and Hopfield memories7, and with certain variations on the BAM. \n\n\u00a9 American Institute of Physics 1988 \n\nBAM STRUCTURE \n\nWe will use the discrete model in which the state of a layer of neurons is described by a bipolar vector. The Dirac notation8 will be used, in which |> and <| are each other's transposes, <|> is a scalar product, and |><| is an outer product. The BAM has a front layer of N neurons with state vector |f>, and a back layer of P neurons with state vector |b>. The bidirectional connections between the layers allow signal flow in two directions. The front stroke gives |b> = s(B|f>), where B is the connection matrix, and s( ) is a threshold function, operating at zero. The back stroke results in an upgraded front state |f'> = s(B^T|b>), where the superscript T denotes transposition. We consider the synchronous model, where all neurons of a layer are updated simultaneously, but the front and back layers are updated at different times. \n\nFig. 1. BAM structure \n\nThe BAM action is shown in Fig. 2. The forward stroke entails taking scalar products between a front state vector |f> and the rows of B, and entering the thresholded results as elements of the back state vector |b>. In the back stroke we take scalar products of |b> with column vectors of B, and enter the thresholded results as elements of an upgraded state vector |f'>. In contrast, the action of an autoassociative memory is shown in Figure 3. \n\nFig. 2. BAM action \n\nFig. 3. Autoassociative memory action \n\nThe BAM may also be described as an autoassociative memory5 by 
concatenating the front and back vectors into a single state vector |v> = |f,b>, and by taking the (N+P)x(N+P) connection matrix as shown in Fig. 4. This autoassociative memory has the same number of neurons as our BAM, viz. N+P. The BAM operation where initially only the front state is specified may be obtained with the corresponding autoassociative memory by initially specifying |b> as zero, and by arranging the thresholding operation such that s(0) does not alter the state vector component. \n\nFig. 4. BAM as autoassociative memory \n\nFor a Hopfield memory7 the connection matrix is \n\nH = ( \u03a3_{m=1}^{M} |m><m| ) - M I ,   (1) \n\nwhere |m>, m=1 to M, are stored vectors, and I is the identity matrix. Writing the N+P dimensional vectors |m> as concatenations |dm,cm>, (1) takes the form \n\nH = ( \u03a3_{m=1}^{M} |dm,cm><dm,cm| ) - M I .   (2) \n\nWith the partial connection matrices \n\nK = \u03a3_{m=1}^{M} |cm><dm| ,   (3) \n\nHd = ( \u03a3_{m=1}^{M} |dm><dm| ) - M I ,  Hc = ( \u03a3_{m=1}^{M} |cm><cm| ) - M I ,   (4) \n\nwhere the I are identities in appropriate subspaces, the Hopfield matrix H may be partitioned as shown in Fig. 5. K is just the BAM matrix given by Kosko5, and previously used by Kohonen9 for linear heteroassociative memories. Comparison of Figs. 4 and 5 shows that in the synchronous discrete model the BAM with connection matrix (3) is equivalent to a Hopfield memory in which the diagonal blocks Hd and Hc have been deleted. Since the Hopfield memory is robust, this \"pruning\" may not affect much the associative recall of stored vectors, if M is small; however, on the average, pruning will not improve the memory capacity. It follows that, on the average, a discrete synchronous BAM with matrix (3) can at best have the capacity of a Hopfield memory with the same number of neurons. \n\nWe have performed computations of the average memory capacity for 64x64 BAMs and for corresponding 128x128 Hopfield memories. Monte Carlo calculations were done for 100 memories, each of which stores M random bipolar vectors. 
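The storage rule and the synchronous strokes described above can be sketched in Python with NumPy. The dimensions match the 64x64 experiments, but the loading M, the random seed, and the convention s(0) = +1 are illustrative assumptions, and this is a single trial rather than the full 100-memory Monte Carlo run.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgn(x):
    # Bipolar threshold s( ); the convention s(0) = +1 is an assumption here.
    return np.where(x >= 0, 1, -1)

def bam_matrix(D, C):
    # Sum-of-outer-products connection matrix B = sum_m |c_m><d_m|, with the
    # stored front vectors as rows of D (M x N) and back vectors as rows of C (M x P).
    return C.T @ D

def bam_recall(B, f, iters=24):
    # Synchronous BAM iteration: forward stroke b = s(B f), back stroke f' = s(B^T b).
    b = sgn(B @ f)
    for _ in range(iters):
        f_new = sgn(B.T @ b)
        b_new = sgn(B @ f_new)
        if np.array_equal(f_new, f) and np.array_equal(b_new, b):
            break
        f, b = f_new, b_new
    return f, b

# A single 64x64 BAM storing M random bipolar pairs; recall is started with a
# forward stroke on a stored front vector, as in the experiments described above.
N = P = 64
M = 6
D = rng.choice([-1, 1], size=(M, N))
C = rng.choice([-1, 1], size=(M, P))
B = bam_matrix(D, C)
f, b = bam_recall(B, D[0])
print('stored front vector is stable:', np.array_equal(f, D[0]))
```

At this low loading the stored pair is almost always a stable state; as M grows, the percentage of good recall falls off, which is what the experiments quantify.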
The straight recall of all these vectors was checked, allowing for 24 iterations. For the BAMs, the iterations were started with a forward stroke in which one of the stored vectors |dm> was used as input. The percentage of good recall and its standard deviation were calculated. The results plotted in Fig. 6 show that the square BAM has about 68% of the capacity of the corresponding Hopfield memory. Although the total number of neurons is the same, the BAM only needs 1/4 of the number of connections of the Hopfield memory. The storage capacity found is much smaller than the Kosko6 upper bound, which is min(N,P). \n\nFig. 5. Partitioned Hopfield matrix \n\nFig. 6. % of good recall versus M (number of stored vectors) \n\nCODED BAM \n\nSo far, we have considered both front and back states to be used for data. There is another use of the BAM in which only front states are used as data, and the back states are seen as providing a code, label, or pointer for the front state. Such use was anticipated in our expression (3) for the BAM matrix which stores data vectors |dm> and their labels or codes |cm>. For a square BAM, such an arrangement cuts the information contained in a single stored data vector in half. However, the freedom of choosing the labels |cm> may perhaps be put to good use. Part of the problem of spurious stable states, which plagues BAMs as well as Hopfield memories as they are loaded up, is due to the lack of orthogonality of the stored vectors. In the coded BAM we have the opportunity to remove part of this problem by choosing the labels as orthonormal. Such labels have been used previously by Kohonen9 in linear heteroassociative memories. The question whether memory capacity can be improved in this manner was explored by taking 64x64 BAMs in which the labels are chosen as Hadamard vectors. 
The latter are bipolar vectors with Euclidean norm \u221aP, which form an orthogonal set. These vectors are rows of a PxP Hadamard matrix; for a discussion see Harwit and Sloane10. The storage capacity of such Hadamard-coded BAMs was calculated as a function of the number M of stored vectors for 100 cases for each value of M, in the manner discussed before. The percentage of good recall and its standard deviation are shown in Fig. 6. It is seen that the Hadamard coding gives about a factor 2.5 in M, compared to the ordinary 64x64 BAM. However, the coded BAM has only half the stored data vector dimension. Accounting for this factor 2 reduction of data vector dimension, the effective storage capacity advantage obtained by Hadamard coding comes to only 25%. \n\nHALF BAM WITH HADAMARD CODING \n\nFor the coded BAM there is the option of deleting the threshold operation in the front layer. The resulting architecture may be called \"half BAM\". In the half BAM, thresholding is only done on the labels, and consequently, the data may be taken as analog vectors. Although such an arrangement diminishes the robustness of the memory somewhat, there are applications of interest. We have calculated the percentage of good recall for 100 cases, and found that giving up the data thresholding cuts the storage capacity of the Hadamard-coded BAM by about 60%. \n\nSELECTIVE REFLEXIVE MEMORY \n\nThe memory capacity limitations shown in Fig. 6 are due to the occurrence of spurious states when the memories are loaded up. 
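Before turning to the remedy, the Hadamard-coded BAM of the preceding sections can be sketched as follows. The Sylvester construction used here is one standard way to obtain a PxP Hadamard matrix when P is a power of two; the loading M and the seed are illustrative assumptions.

```python
import numpy as np

def hadamard(P):
    # Sylvester construction of a P x P Hadamard matrix (P a power of two);
    # its rows are mutually orthogonal bipolar vectors of Euclidean norm sqrt(P).
    H = np.array([[1]])
    while H.shape[0] < P:
        H = np.block([[H, H], [H, -H]])
    return H

P, N, M = 64, 64, 8
H = hadamard(P)
assert np.array_equal(H @ H.T, P * np.eye(P, dtype=int))  # rows are orthogonal

rng = np.random.default_rng(1)
D = rng.choice([-1, 1], size=(M, N))   # random bipolar data vectors |d_m>
C = H[:M]                              # Hadamard rows as labels |c_m>
B = C.T @ D                            # coded BAM connection matrix

# Forward stroke on a stored data vector: at low loading the thresholded
# result reproduces the corresponding Hadamard label.
b = np.where(B @ D[0] >= 0, 1, -1)
print('label recovered:', np.array_equal(b, C[0]))
```

The orthogonality of the labels suppresses the label-side crosstalk, which is the mechanism behind the capacity gain reported above.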
\n\nConsider a discrete BAM with stored data vectors |m>, m=1 to M, orthonormal labels |cm>, and the connection matrix \n\nB = \u03a3_{m=1}^{M} |cm><m| .   (5) \n\nFor an input data vector |v> which is closest to the stored data vector |1>, one has in the forward stroke \n\n|b> = s( c|c1> + \u03a3_{m=2}^{M} am|cm> ) ,   (6) \n\nwhere \n\nc = <1|v>  and  am = <m|v> .   (7) \n\nAlthough for m\u22601 the am are smaller than c, the sum \u03a3_{m=2}^{M} am|cm> may accumulate to such a large value as to affect the thresholded result |b>. The problem would be avoided if the thresholding operation s( ) in the back layer of the BAM were to be replaced by another nonlinear operation which selects, from the linear combination \n\nc|c1> + \u03a3_{m=2}^{M} am|cm> ,   (8) \n\nthe dominant label |c1>. The hypothetical device which performs this operation is here called the \"Dominant Label Selector\" (DLS)11, and we call the resulting memory architecture \"Selective Reflexive Memory\" (SRM). With the back state selected as the dominant label |c1>, the back stroke gives |f'> = s(B^T|c1>) = |1>. It follows11 that the SRM gives perfect associative recall of the nearest stored data vector, for any number of vectors stored. Of course, the linear independence of the P-dimensional label vectors |cm>, m=1 to M, requires P>=M. \n\nThe DLS must select, from a linear combination of orthonormal labels, the dominant label. A trivial case is obtained by choosing the labels |cm> as basis vectors |um>, which have all components zero except for the mth component, which is unity. With this choice of labels, the DLS may be taken as a winner-take-all net W, as shown in Fig. 7. This case appears to be included in Adaptive Resonance Theory (ART)12 as a special simplified case. A relationship between the ordinary BAM and ART was pointed out by Kosko5. 
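With basis-vector labels the SRM collapses to a particularly simple form: the connection matrix stacks the stored data vectors as rows, and the DLS is a winner-take-all over the forward-stroke sums. A minimal sketch, with illustrative sizes and seed:

```python
import numpy as np

def srm_recall_basis(D, v):
    # SRM with basis-vector labels |u_m>: the connection matrix
    # B = sum_m |u_m><m| simply stacks the stored data vectors as the rows of D,
    # and the DLS reduces to a winner-take-all over the forward-stroke sums.
    acts = D @ v                 # forward stroke, before any nonlinearity
    m = int(np.argmax(acts))     # winner-take-all picks the dominant label
    return D[m]                  # back stroke s(B^T |u_m>) = s(|m>) returns |m> itself

N, M = 64, 30
rng = np.random.default_rng(2)
D = rng.choice([-1, 1], size=(M, N))
v = D[0].copy()
v[:4] *= -1                      # corrupt 4 of the 64 components
print(np.array_equal(srm_recall_basis(D, v), D[0]))
```

Because the winner is the stored vector with the largest overlap, recall of the nearest stored vector is exact for any M <= P, as stated above; the price is that the stored vectors appear in the clear as matrix rows.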
As in ART, there is considerable fault sensitivity in this memory, because the stored data vectors appear in the connection matrix as rows. \n\nFig. 7. Simplest reflexive memory with DLS \n\nA memory with better fault tolerance may be obtained by using orthogonal labels other than basis vectors. The DLS can then be taken as an orthogonal transformation G followed by a winner-take-all net, as shown in Fig. 8. G is to be chosen such that it transforms the labels |cm> into vectors proportional to the basis vectors |um>. This can always be done by taking \n\nG = \u03a3_{p=1}^{P} |up><cp| , \n\nwhere the |cp>, p=1 to P, form a complete orthonormal set which contains the labels |cm>, m=1 to M. \n\nFig. 8. Selective reflexive memory \n\nThe neurons in the DLS serve as grandmother cells. Once a single winning cell has been activated, i.e., the state of the layer is a single basis vector, say |u1>, this vector must be passed back, after application of the transformation G^{-1}, such as to produce the label |c1> at the back of the BAM. Since G is orthogonal, we have G^{-1} = G^T, so that the required inverse transformation may be accomplished simply by sending the basis vector back through the transformer; this gives G^T|u1> = |c1>. \n\nCare must be taken that, before the winner-take-all net W has settled, no output of W can influence the front state |f>. This may perhaps be achieved by arranging the W network to have a thresholding and feedback which are fast compared with that of the K network. An alternate method may be to equip the W network with an output gate which is opened only after the W net has settled. These arrangements present a complication and cause a delay, which in some applications may be inappropriate, and in others may be acceptable in a trade between speed and memory density. 
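The DLS with output transformer can be sketched under the assumption that the labels are Hadamard rows, so that G is simply a scaled Hadamard matrix; sizes, seed, and the amount of corruption are illustrative.

```python
import numpy as np

def hadamard(P):
    # Sylvester construction; rows are orthogonal bipolar vectors of norm sqrt(P).
    H = np.array([[1.0]])
    while H.shape[0] < P:
        H = np.block([[H, H], [H, -H]])
    return H

P, N, M = 64, 64, 16
rng = np.random.default_rng(3)
D = rng.choice([-1, 1], size=(M, N)).astype(float)
C = hadamard(P)[:M]              # orthogonal labels |c_m>
B = C.T @ D                      # BAM connection matrix K
G = hadamard(P) / np.sqrt(P)     # orthogonal transform: G maps |c_m> to sqrt(P)|u_m>

def dls_recall(v):
    # Forward stroke, transform into the grandmother-cell space, winner-take-all,
    # send the winning basis vector back through G^T, then back stroke through B^T.
    w = G @ (B @ v)
    u = np.zeros(P)
    u[int(np.argmax(w))] = 1.0   # winner-take-all output, a single basis vector
    b = G.T @ u                  # G^T |u_m> is proportional to the label |c_m>
    return np.sign(B.T @ b)      # back stroke recovers the stored |d_m>

v = D[0].copy()
v[:5] *= -1                      # corrupt 5 of the 64 components
print(np.array_equal(dls_recall(v), D[0]))
```

Since G is orthogonal, sending the winning basis vector back through G^T reproduces the label exactly, and the back stroke then returns the stored data vector; the stored vectors no longer appear in the clear as rows of the connection matrix.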
\n\nFor the SRM with output transformer and orthonormal labels other than basis vectors, a corresponding autoassociative memory may be composed as shown in Fig. 11. An output gate in the w layer is chosen as the device which prevents the back stroke through the BAM from taking place before the winner-take-all net has settled. The same effect may perhaps be achieved by choosing different response times for the neuron layers f and w. These matters require investigation. Unless the output transform G is already required for other reasons, as in some optical resonators, the DLS with output transform is clumsy. It would be far better to combine the transformer G and the net W into a single network. To find such a DLS should be considered a challenge. \n\nFig. 11. Autoassociative memory equivalent to SRM with transform \n\nFig. 12. Structure of SRM \n\nThe work was partly supported by the Defense Advanced Research Projects Agency, ARPA Order 5916, through Contract DAAH01-86-C-0968 with the U.S. Army Missile Command. \n\nREFERENCES \n\n1. D. Z. Anderson, \"Coherent optical eigenstate memory\", Opt. Lett. 11, 56 (1986). \n2. B. H. Soffer, G. J. Dunning, Y. Owechko, and E. Marom, \"Associative holographic memory with feedback using phase-conjugate mirrors\", Opt. Lett. 11, 118 (1986). \n3. A. Yariv and S. K. Wong, \"Associative memories based on message-bearing optical modes in phase-conjugate resonators\", Opt. Lett. 11, 186 (1986). \n4. B. 
Kosko, \"Adaptive Cognitive ProceSSing\", NSF Workshop for Neural \nNetworks and Neuromorphlc Systems, Boston, Mass., Oct. &-8, 1986. \n5. B. KOSKO, \"Bidirectional Associative Memories\", IEEE Trans. SMC, In \npress, 1987. \n6. B. KOSKO, \"Adaptive Bidirectional Associative Memories\", Appl. Opt., \n1n press, 1987. \n7. J. J. Hopfleld, \"Neural networks and physical systems with emergent \ncollective computational ablJ1tles\", Proc. NatJ. Acad. Sct. USA 79, 2554 \n( 1982). \n8. P. A. M. Dirac, THE PRINCI PLES OF QUANTLt1 MECHANICS, Oxford, 1958. \n9. T. Kohonen, \"Correlation Matrix Memories\", HelsinsKi University of \nTechnology Report TKK-F-A 130, 1970. \n10. M. Harwit and N. J. A Sloane, HADAMARD TRANSFORM OPTICS, \nAcademic Press, New York, 1979. \n11. H. G. Loos, It Adaptive Stochastic Content-Addressable Memory\", Final \nReport, ARPA Order 5916, Contract DAAHO 1-86-C-0968, March 1987. \n12. G. A. Carpenter and S. Grossberg, \"A Massively Parallel Architecture \nfor a Self-Organizing Neural Pattern Recognition Machine\", Computer \nVision, Graphics, and Image processing, 37, 54 (1987). \n13. R. D. TeKolste and C. C. Guest, \"Optical Cohen-Grossberg System \nwith Ali-Optical FeedbaCK\", IEEE First Annual International Conference \non Neural Networks, San Diego, June 21-24, 1987. \n\n\f", "award": [], "sourceid": 51, "authors": [{"given_name": "Hendricus", "family_name": "Loos", "institution": null}]}__