{"title": "Analysis of Linsker's Simulations of Hebbian Rules", "book": "Advances in Neural Information Processing Systems", "page_first": 694, "page_last": 701, "abstract": null, "full_text": "Analysis of Linsker's Simulations of Hebbian rules\n\nDavid J. C. MacKay\nComputation and Neural Systems\nCaltech 164-30 CNS\nPasadena, CA 91125\nmackay@aurel.cns.caltech.edu\n\nKenneth D. Miller\nDepartment of Physiology\nUniversity of California\nSan Francisco, CA 94143-0444\nken@phyb.ucsf.edu\n\nABSTRACT\n\nLinsker has reported the development of centre-surround receptive fields and oriented receptive fields in simulations of a Hebb-type equation in a linear network. The dynamics of the learning rule are analysed in terms of the eigenvectors of the covariance matrix of cell activities. Analytic and computational results for Linsker's covariance matrices, and some general theorems, lead to an explanation of the emergence of centre-surround and certain oriented structures.\n\nLinsker [Linsker, 1986, Linsker, 1988] has studied by simulation the evolution of weight vectors under a Hebb-type teacherless learning rule in a feed-forward linear network. The equation for the evolution of the weight vector $w$ of a single neuron, derived by ensemble averaging the Hebbian rule over the statistics of the input patterns, is (footnote 1):\n\n$$\\frac{\\partial}{\\partial t} w_i = k_1 + \\sum_j (Q_{ij} + k_2) w_j \\quad \\text{subject to } -w_{\\max} \\le w_i \\le w_{\\max} \\qquad (1)$$\n\nFootnote 1: Our definition of equation (1) differs from Linsker's by the omission of a factor of $1/N$ before the sum term, where $N$ is the number of synapses.\n\nwhere $Q$ is the covariance matrix of activities of the inputs to the neuron.
The covariance matrix depends on the covariance function, which describes the dependence of the covariance of two input cells' activities on their separation in the input field, and on the location of the synapses, which is determined by a synaptic density function. Linsker used a gaussian synaptic density function.\n\nDepending on the covariance function and the two parameters $k_1$ and $k_2$, different weight structures emerge. Using a gaussian covariance function (his layer $B \\to C$), Linsker reported the emergence of non-trivial weight structures, ranging from saturated structures through centre-surround structures to bi-lobed oriented structures.\n\nThe analysis in this paper examines the properties of equation (1). We concentrate on the gaussian covariances in Linsker's layer $B \\to C$, and give an explanation of the structures reported by Linsker. Several of the results are more general, applying to any covariance matrix $Q$. Space constrains us to postpone general discussion, criteria for the emergence of centre-surround weight structures, technical details, and discussion of other model networks to future publications [MacKay, Miller, 1990].\n\n1 ANALYSIS IN TERMS OF EIGENVECTORS\n\nWe write equation (1) as a first order differential equation for the weight vector $w$:\n\n$$\\dot{w} = (Q + k_2 J) w + k_1 n \\quad \\text{subject to } -w_{\\max} \\le w_i \\le w_{\\max} \\qquad (2)$$\n\nwhere $J$ is the matrix $J_{ij} = 1\\ \\forall i, j$, and $n$ is the DC vector $n_i = 1\\ \\forall i$. This equation is linear, up to the hard limits on $w_i$. These hard limits define a hypercube in weight space within which the dynamics are confined. We make the following assumption:\n\nAssumption 1 The principal features of the dynamics are established before the hard limits are reached. When the hypercube is reached, it captures and preserves the existing weight structure with little subsequent change.
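The behaviour described by equation (1) and Assumption 1 can be sketched numerically. The block below is only an illustration under stated assumptions: the function name `simulate`, the Euler step size, the clipping scheme used for the hard limits, and the two-synapse toy covariance `Q_toy` are all hypothetical choices, not Linsker's or the authors' simulation code.

```python
def simulate(Q, k1, k2, w0, w_max=0.5, dt=0.01, steps=2000):
    """Euler-integrate dw_i/dt = k1 + sum_j (Q[i][j] + k2) * w[j].

    After every step each weight is clipped to [-w_max, w_max], which is
    one simple (assumed) way to impose the hard limits of equation (1).
    """
    n = len(w0)
    w = list(w0)
    for _ in range(steps):
        total = sum(w)  # the k2 * J term only needs the sum of the weights
        dw = [k1 + k2 * total + sum(Q[i][j] * w[j] for j in range(n))
              for i in range(n)]
        w = [min(w_max, max(-w_max, w[i] + dt * dw[i])) for i in range(n)]
    return w


# Toy covariance for two positively correlated inputs (illustrative values).
Q_toy = [[1.0, 0.5],
         [0.5, 1.0]]
w_start = [0.01, 0.02]

# k2 = 0: the uniform mode (1, 1) has the largest eigenvalue (1.5) and wins.
w_plain = simulate(Q_toy, k1=0.0, k2=0.0, w0=w_start)

# k2 = -3: Q + k2*J has eigenvalue -4.5 for (1, 1) and 0.5 for (1, -1),
# so the sum of the weights is held near zero and the difference mode grows.
w_balanced = simulate(Q_toy, k1=0.0, k2=-3.0, w0=w_start)
```

In this toy case the weights saturate at the corners of the hypercube, and the corner reached is the one selected by the fastest-growing eigenvector of the linear dynamics, which is the situation Assumption 1 describes.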
\n\nThe matrix $Q + k_2 J$ is symmetric, so it has a complete orthonormal set of eigenvectors $e^{(a)}$ (footnote 2) with real eigenvalues $\\lambda_a$. The linear dynamics within the hypercube can be characterised in terms of these eigenvectors, each of which represents an independently evolving weight configuration. First, equation (2) has a fixed point at\n\n$$w^{FP} = -k_1 (Q + k_2 J)^{-1} n \\qquad (3)$$\n\nSecond, relative to the fixed point, the component of $w$ in the direction of an eigenvector grows or decays exponentially at a rate proportional to the corresponding eigenvalue. Writing $w(t) = \\sum_a w_a(t) e^{(a)}$, equation (2) yields\n\n$$w_a(t) - w_a^{FP} = (w_a(0) - w_a^{FP}) e^{\\lambda_a t} \\qquad (4)$$\n\nFootnote 2: The indices $a$ and $b$ will be used to denote the eigenvector basis for $w$, while the indices $i$ and $j$ will be used for the synaptic basis.\n\nThus, the principal emergent features of the dynamics are determined by the following three factors:\n\n1. The principal eigenvectors of $Q + k_2 J$, that is, the eigenvectors with largest positive eigenvalues. These are the fastest growing weight configurations.\n\n2. Eigenvectors of $Q + k_2 J$ with negative eigenvalue. Each is associated with an attracting constraint surface, the hyperplane defined by $w_a = w_a^{FP}$.\n\n3. The location of the fixed point of equation (1). This is important for two reasons: a) it determines the location of the constraint surfaces; b) the fixed point gives a \"head start\" to the growth rate of eigenvectors $e^{(a)}$ for which $|w_a^{FP}|$ is large compared to $|w_a(0)|$.\n\n2 EIGENVECTORS OF Q\n\nWe first examine the eigenvectors and eigenvalues of $Q$. The principal eigenvector of $Q$ dominates the dynamics of equation (2) for $k_1 = 0$, $k_2 = 0$. The subsequent eigenvectors of $Q$ become important as $k_1$ and $k_2$ are varied.
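As an illustration of what "the principal eigenvector of Q" looks like for gaussian covariance and gaussian synaptic density, the sketch below uses power iteration on a one-dimensional analogue of the operator. Everything here is an assumption made for illustration: the 1-D unit grid, the helper names `build_operator` and `principal_eigenvector`, and the values C = 6, A = 9 (chosen only so that C/A = 2/3). It is not the authors' computation, which was done in two dimensions.

```python
import math


def build_operator(half_width, C, A):
    """Discretise Q(x, x') = exp(-(x - x')^2 / 2C) * exp(-x'^2 / 2A)
    on the integer grid x = -half_width .. half_width (1-D analogue)."""
    xs = range(-half_width, half_width + 1)
    return [[math.exp(-(x - y) ** 2 / (2 * C)) * math.exp(-y ** 2 / (2 * A))
             for y in xs] for x in xs]


def principal_eigenvector(M, iterations=200):
    """Power iteration: repeatedly apply M and renormalise to unit length."""
    n = len(M)
    v = [1.0] * n
    for _ in range(iterations):
        u = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
    return v


C, A = 6.0, 9.0                      # illustrative sizes with C/A = 2/3
M = build_operator(15, C, A)
v = principal_eigenvector(M)

# Width of the gaussian eigenfunction exp(-x^2 / 2R) for this operator.
R = 0.5 * C * (1 + math.sqrt(1 + 4 * A / C))
mid = len(v) // 2                    # index of x = 0
profile_ratio = v[mid + 3] / v[mid]  # compare with exp(-3^2 / 2R)
```

Under these assumptions the leading vector has no sign changes and follows a gaussian profile of width R, which is the 1-D counterpart of the 1s function discussed in the next section.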
\n\n2.1 PROPERTIES OF CIRCULARLY SYMMETRIC SYSTEMS\n\nIf an operator commutes with the rotation operator, its eigenfunctions can be written as eigenfunctions of the rotation operator. For Linsker's system, in the continuum limit, the operator $Q + k_2 J$ is unchanged under rotation of the system. So the eigenfunctions of $Q + k_2 J$ can be written as the product of a radial function and one of the angular functions $\\cos l\\theta$, $\\sin l\\theta$, $l = 0, 1, 2 \\ldots$ To describe these eigenfunctions we borrow from quantum mechanics the notation $n$ = 1, 2, 3 ... and $l$ = s, p, d ... to denote the total number of nodes in the function (0, 1, 2 ...) and the number of angular nodes (0, 1, 2 ...) respectively. For example, \"2s\" denotes a centre-surround function with one radial node and no angular nodes (see figure 1).\n\nFor monotonic and non-negative covariance functions, we conjecture that the eigenfunctions of $Q$ are ordered in eigenvalue by their numbers of nodes such that the eigenfunction $[nl]$ has larger eigenvalue than either $[(n+1)l]$ or $[n(l+1)]$. This conjecture is obeyed in all analytical and numerical results we have obtained.\n\n2.2 ANALYTIC CALCULATIONS FOR $k_2 = 0$\n\nWe have solved analytically for the first three eigenfunctions and eigenvalues of the covariance matrix for layer $B \\to C$ of Linsker's network, in the continuum limit (Table 1). 1s, the function with no changes of sign, is the principal eigenfunction of $Q$; 2p, the bilobed oriented function, is the second eigenfunction; and 2s, the centre-surround eigenfunction, is third (footnote 3).\n\nFigure 1(a) shows the first six eigenfunctions for layer $B \\to C$ of [Linsker, 1986].\n\nFootnote 3: 2s is degenerate with 3d at $k_2 = 0$.
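The ordering of the leading eigenvalues can be checked numerically in a simplified setting. The sketch below is a hedged illustration, not the authors' calculation: it works in a one-dimensional analogue on a unit grid, with hypothetical helper names and illustrative values C = 6, A = 9 (so C/A = 2/3). For this 1-D gaussian operator a direct gaussian integral gives successive eigenvalues in the fixed ratio (R - C)/R, with R = (C/2)(1 + sqrt(1 + 4A/C)). The operator is symmetrised with the square root of the density so that power iteration with deflation applies; this leaves the eigenvalues unchanged.

```python
import math


def symmetric_operator(half_width, C, A):
    """S_ij = sqrt(d_i) K_ij sqrt(d_j), with K the gaussian covariance and d
    the gaussian synaptic density. S has the same eigenvalues as K_ij d_j."""
    xs = list(range(-half_width, half_width + 1))
    d = [math.exp(-x * x / (2 * A)) for x in xs]
    return [[math.sqrt(d[i] * d[j]) * math.exp(-(xs[i] - xs[j]) ** 2 / (2 * C))
             for j in range(len(xs))] for i in range(len(xs))]


def leading_eigenvalues(S, count, iterations=300):
    """Power iteration with deflation for a symmetric positive matrix."""
    n = len(S)
    found = []    # unit eigenvectors already extracted
    values = []
    for _ in range(count):
        v = [float(i + 1) for i in range(n)]  # mixes even and odd parity
        lam = 0.0
        for _step in range(iterations):
            for e in found:  # project out eigenvectors found so far
                overlap = sum(a * b for a, b in zip(v, e))
                v = [a - overlap * b for a, b in zip(v, e)]
            norm = math.sqrt(sum(a * a for a in v))
            v = [a / norm for a in v]
            v = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]
            lam = math.sqrt(sum(a * a for a in v))  # |S v| for unit v
        found.append([a / lam for a in v])
        values.append(lam)
    return values


C, A = 6.0, 9.0
values = leading_eigenvalues(symmetric_operator(15, C, A), count=3)
R = 0.5 * C * (1 + math.sqrt(1 + 4 * A / C))
expected_ratio = (R - C) / R   # about 0.45 when C/A = 2/3
```

With C/A = 2/3 the expected ratio is about 0.45, which is comparable to the ratio 1.0/2.26 of the 2p and 1s eigenvalues quoted in the caption of figure 1, although that figure was computed in two dimensions on a finite circle.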
\n\nTable 1: The first three eigenfunctions of the operator $Q(\\mathbf{r}, \\mathbf{r}')$\n\n$Q(\\mathbf{r}, \\mathbf{r}') = e^{-(\\mathbf{r} - \\mathbf{r}')^2 / 2C} e^{-r'^2 / 2A}$, where $C$ and $A$ denote the characteristic sizes of the covariance function and synaptic density function. $\\mathbf{r}$ denotes two-dimensional spatial position relative to the centre of the synaptic arbor, and $r = |\\mathbf{r}|$. The eigenvalues $\\lambda$ are all normalised by the effective number of synapses.\n\nName | Eigenfunction | $\\lambda / N$\n1s | $e^{-r^2/2R}$ | $l C/A$\n2p | $r \\cos\\theta \\, e^{-r^2/2R}$ | $l^2 C/A$\n2s | $(1 - r^2/r_0^2) e^{-r^2/2R}$ | $l^3 C/A$\n\nHere $R = \\frac{C}{2} (1 + \\sqrt{1 + 4A/C})$, $l = (R - C)/R$ with $0 < l < 1$, and $r_0^2 = 2A / \\sqrt{1 + 4A/C}$.\n\nFigure 1: Eigenfunctions of the operator $Q + k_2 J$. Largest eigenvalue is in the top row. Eigenvalues (in arbitrary units): (a) $k_2 = 0$: 1s, 2.26; 2p, 1.0; 2s & 3d (only one 3d is shown), 0.41. (b) $k_2 = -3$: 2p, 1.0; 2s, 0.66; 1s, -17.8. The greyscale indicates the range from maximum negative to maximum positive synaptic weight within each eigenfunction. Eigenfunctions of the operator $(e^{-(\\mathbf{r} - \\mathbf{r}')^2 / 2C} + k_2) e^{-r'^2 / 2A}$ were computed for $C/A = 2/3$ (as used by Linsker for most layer $B \\to C$ simulations) on a circle of radius 12.5 grid intervals, with $\\sqrt{A} = 6.15$ grid intervals.\n\n3 THE EFFECTS OF THE PARAMETERS $k_1$ AND $k_2$\n\nVarying $k_2$ changes the eigenvectors and eigenvalues of the matrix $Q + k_2 J$. Varying $k_1$ moves the fixed point of the dynamics with respect to the origin. We now analyse these two changes, and their effects on the dynamics.\n\nDefinition: Let $\\hat{n}$ be the unit vector in the direction of the DC vector $n$. We refer to $(w \\cdot \\hat{n})$ as the DC component of $w$. The DC component is proportional to the sum of the synaptic strengths in a weight vector.
For example, 2p and all the other eigenfunctions with angular nodes have zero DC component. Only the s-modes have a non-zero DC component.\n\n3.1 GENERAL THEOREM: THE EFFECT OF $k_2$\n\nWe now characterise the effect of adding $k_2 J$ to any covariance matrix $Q$.\n\nTheorem 1 For any covariance matrix $Q$, the spectrum of eigenvectors and eigenvalues of $Q + k_2 J$ obeys the following:\n\n1. Eigenvectors of $Q$ with no DC component, and their eigenvalues, are unaffected by $k_2$.\n\n2. The other eigenvectors, with non-zero DC component, vary with $k_2$. Their eigenvalues increase continuously and monotonically with $k_2$ between asymptotic limits such that the upper limit of one eigenvalue is the lower limit of the eigenvalue above.\n\n3. There is at most one negative eigenvalue.\n\n4. All but one of the eigenvalues remain finite. In the limits $k_2 \\to \\pm\\infty$ there is a DC eigenvector $\\hat{n}$ with eigenvalue $\\to k_2 N$, where $N$ is the dimensionality of $Q$, i.e. the number of synapses.\n\nThe properties stated in this theorem, whose proof is in [MacKay, Miller, 1990], are summarised pictorially by the spectral structure shown in figure 2.\n\n3.2 IMPLICATIONS FOR LINSKER'S SYSTEM\n\nFor Linsker's circularly symmetric systems, all the eigenfunctions with angular nodes have zero DC component and are thus independent of $k_2$. The eigenvalues that vary with $k_2$ are those of the s-modes. The leading s-modes at $k_2 = 0$ are 1s, 2s; as $k_2$ is decreased to $-\\infty$, these modes transform continuously into 2s, 3s respectively (figure 2; see footnote 4). 1s becomes an eigenvector with negative eigenvalue, and it approaches the DC vector $\\hat{n}$. This eigenvector enforces a constraint $w \\cdot \\hat{n} = w^{FP} \\cdot \\hat{n}$, and thus determines that the final average synaptic strength is equal to $w^{FP} \\cdot n / N$.
Linsker used $k_2 = -3$ in [Linsker, 1986]. This value of $k_2$ is sufficiently large that the properties of the $k_2 \\to -\\infty$ limit hold [MacKay, Miller, 1990], and in the following we concentrate interchangeably on $k_2 = -3$ and $k_2 \\to -\\infty$. The computed eigenfunctions for Linsker's system at layer $B \\to C$ are shown in figure 1(b) for $k_2 = -3$. The principal eigenfunction is 2p. The centre-surround eigenfunction 2s is the principal symmetric eigenfunction, but it still has smaller eigenvalue than 2p.\n\nFootnote 4: The 2s eigenfunctions at $k_2 = 0$ and $k_2 = -\\infty$ both have one radial node, but are not identical functions.\n\nFigure 2: General spectrum of eigenvalues of $Q + k_2 J$ as a function of $k_2$. A: Eigenvectors with DC component. B: Eigenvectors with zero DC component. C: Adjacent DC eigenvalues share a common asymptote. D: There is only one negative eigenvalue. The annotations in brackets refer to the eigenvectors of Linsker's system.\n\n3.3 EFFECT OF $k_1$\n\nVarying $k_1$ changes the location of the fixed point of equation (2). From equation (3), the fixed point is displaced from the origin only in the direction of eigenvectors that have non-zero DC component, that is, only in the direction of the s-modes. This has two important effects, as discussed in section 1: a) The s-modes are given a head start in growth rate that increases as $k_1$ is increased. In particular, the principal s-mode, the centre-surround eigenvector 2s, may outgrow the principal eigenvector 2p. b) The constraint surface is moved when $k_1$ is changed.
For large negative $k_2$, the constraint surface fixes the average synaptic strength in the final weight vector. To leading order in $1/k_2$, Linsker showed that the constraint is $\\sum_j w_j = k_1 / |k_2|$ (footnote 5).\n\nFootnote 5: To second order, this expression becomes $\\sum_i w_i = k_1 / |k_2 + q|$, where $q = \\langle Q_{ij} \\rangle$, the average covariance (averaged over $i$ and $j$). The additional term largely resolves the discrepancy between Linsker's $g$ and $k_1 / |k_2|$ in [Linsker, 1986].\n\n3.4 SUMMARY OF THE EFFECTS OF $k_1$ AND $k_2$\n\nWe can now anticipate the explanation for the emergence of centre-surround cells: For $k_1 = 0$, $k_2 = 0$, the dynamics are dominated by 1s. The centre-surround eigenfunction 2s is third in line behind 2p, the bi-lobed function. Making $k_2$ large and negative removes 1s from the lead. 2p becomes the principal eigenfunction and dominates the dynamics for $k_1 \\approx 0$, so that the circular symmetry is broken. Finally, increasing $k_1 / |k_2|$ gives a head start to the centre-surround function 2s. Increasing $k_1 / |k_2|$ also increases the final average synaptic strength, so large $k_1 / |k_2|$ also produces a large DC bias. The centre-surround regime therefore lies sandwiched between a 2p-dominated regime and an all-excitatory regime. $k_1 / |k_2|$ has to be large enough that 2s dominates over 2p, and small enough that the DC bias does not obscure the centre-surround structure. We estimate this parameter regime in [MacKay, Miller, 1990], and show that the boundary between the 2s- and 2p-dominated regimes found by simulated annealing on the energy function may be different from the boundary found by simulating the time-development of equation (1), which depends on the initial conditions.\n\n4 CONCLUSIONS AND DISCUSSION\n\nFor Linsker's $B \\to C$ connections, we predict four main parameter regimes for varying $k_1$ and $k_2$ (footnote 6). These regimes, shown in figure 3, are dominated by the following weight structures:\n\n$k_2 = 0$, $k_1 = 0$: The principal eigenvector of $Q$, 1s.\n$k_2$ = large positive and/or $k_1$ = large: The flat DC weight vector, which leads to the same saturated structures as 1s.\n$k_2$ = large negative, $k_1 \\approx 0$: The principal eigenvector of $Q + k_2 J$ for $k_2 \\to -\\infty$, 2p.\n$k_2$ = large negative, $k_1$ = intermediate: The principal circularly symmetric function which is given a head start, 2s.\n\nFootnote 6: Not counting the symmetric regimes $(k_1, k_2) \\to (-k_1, k_2)$ in which all the weight structures are inverted in sign.\n\nFigure 3: Parameter regimes for Linsker's system. The DC bias is approximately constant along the radial lines, so each of the regimes with large negative $k_2$ is wedge-shaped.\n\nHigher layers of Linsker's network can be analysed in terms of the same four regimes; the principal eigenvectors are altered, so that different structures can emerge. The development of the interesting cells in Linsker's system depends on the use of negative synapses and on the use of the terms $k_1$ and $k_2$ to enforce a constraint on the final percentages of positive and negative synapses. Both of these may be biologically problematic [Miller, 1990]. Linsker suggested that the emergence of centre-surround structures may depend on the peaked synaptic density function that he used [Linsker, 1986, page 7512]. However, with a flat density function, the eigenfunctions are qualitatively unchanged, and centre-surround structures can emerge by the same mechanism.\n\nAcknowledgements\n\nD.J.C.M. is supported by a Caltech Fellowship and a Studentship from SERC, UK.\n\nK.D.M. thanks M. P. Stryker for encouragement and financial support while this work was undertaken. K.D.M. was supported by an N.E.I. Fellowship and the International Joint Research Project Bioscience Grant to M. P. Stryker (T. Tsumoto, Coordinator) from the N.E.D.O., Japan.\n\nThis collaboration would have been impossible without the internet/NSFnet, long may their daemons flourish.\n\nReferences\n\n[Linsker, 1986] R. Linsker. From Basic Network Principles to Neural Architecture (series), PNAS USA, 83, Oct.-Nov. 1986, pp. 7508-7512, 8390-8394, 8779-8783.\n\n[Linsker, 1988] R. Linsker. Self-Organization in a Perceptual Network, Computer, March 1988.\n\n[Miller, 1990] K.D. Miller. \"Correlation-based mechanisms of neural development,\" in Neuroscience and Connectionist Theory, M.A. Gluck and D.E. Rumelhart, Eds. (Lawrence Erlbaum Associates, Hillsboro NJ) (in press).\n\n[MacKay, Miller, 1990] D.J.C. MacKay and K.D. Miller. \"Analysis of Linsker's Simulations of Hebbian rules\" (submitted to Neural Computation); and \"Analysis of Linsker's application of Hebbian rules to linear networks\" (submitted to Network).", "award": [], "sourceid": 193, "authors": [{"given_name": "David", "family_name": "MacKay", "institution": null}, {"given_name": "Kenneth", "family_name": "Miller", "institution": null}]}