{"title": "Permitted and Forbidden Sets in Symmetric Threshold-Linear Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 217, "page_last": 223, "abstract": null, "full_text": "Permitted and  Forbidden Sets in \n\nSymmetric Threshold-Linear  Networks \n\nRichard  H.R. Hahnloser and H.  Sebastian Seung \n\nDept.  of Brain &  Cog.  Sci.,  MIT \n\nCambridge, MA  02139 USA \n\nrh~ai.mit.edu,  seung~mit.edu \n\nAbstract \n\nAscribing computational principles to neural feedback circuits is an \nimportant problem in theoretical neuroscience.  We  study symmet(cid:173)\nric  threshold-linear  networks  and  derive  stability  results  that  go \nbeyond the insights that  can  be  gained  from  Lyapunov theory or \nenergy functions.  By applying linear analysis to subnetworks com(cid:173)\nposed of coactive  neurons,  we  determine the  stability of potential \nsteady states.  We find that stability depends on two types of eigen(cid:173)\nmodes.  One  type  determines  global  stability  and  the  other  type \ndetermines whether or not multistability is possible.  We  can prove \nthe  equivalence  of  our  stability  criteria  with  criteria  taken  from \nquadratic  programming.  Also,  we  show  that  there  are  permitted \nsets  of neurons that  can be coactive at  a  steady state and forbid(cid:173)\nden sets that cannot.  Permitted sets are clustered in the sense that \nsubsets of permitted sets are permitted and supersets of forbidden \nsets  are forbidden.  By  viewing  permitted sets as memories  stored \nin the synaptic connections,  we  can provide a formulation of long(cid:173)\nterm memory that is more general than the traditional perspective \nof fixed  point attractor networks. \n\nA Lyapunov-function can be used to prove that a given set of differential equations is \nconvergent.  For  example,  if a  neural network possesses a  Lyapunov-function,  then \nfor  almost  any  initial  condition,  the  outputs  of the  neurons  converge  to  a  stable \nsteady  state.  In  the  past,  this  stability-property  was  used  to  construct  attractor \nnetworks  that  associatively  recall  memorized  patterns.  Lyapunov  theory  applies \nmainly to symmetric networks in which neurons have monotonic activation functions \n[1,  2].  Here  we  show that the restriction of activation functions  to threshold-linear \nones  is  not  a  mere  limitation,  but  can  yield  new  insights  into  the  computational \nbehavior of recurrent networks  (for  completeness, see  also  [3]). \n\nWe present three main theorems about the neural responses to constant inputs.  The \nfirst theorem provides necessary and sufficient conditions on the synaptic weight ma(cid:173)\ntrix for  the existence of a  globally  asymptotically stable set  of fixed  points.  These \nconditions can  be expressed in terms of copositivity,  a  concept from  quadratic pro(cid:173)\ngramming and linear complementarity theory.  Alternatively, they can be expressed \nin terms of certain eigenvalues and eigenvectors of submatrices of the synaptic weight \nmatrix, making a connection to linear systems theory.  The theorem guarantees that \n\n\fthe network will produce a steady state response to any constant input.  We  regard \nthis  response as the computational output of the network,  and its characterization \nis the topic of the second  and third theorems. \n\nIn the second theorem, we introduce the idea of permitted and forbidden sets.  
Under certain conditions on the synaptic weight matrix, we show that there exist sets of neurons that are "forbidden" by the recurrent synaptic connections from being coactivated at a stable steady state, no matter what input is applied. Other sets are "permitted," in the sense that they can be coactivated for some input. The same conditions on the synaptic weight matrix also lead to conditional multistability, meaning that there exists an input for which there is more than one stable steady state. In other words, forbidden sets and conditional multistability are inseparable concepts.

The existence of permitted and forbidden sets suggests a new way of thinking about memory in neural networks. When an input is applied, the network must select a set of active neurons, and this selection is constrained to be one of the permitted sets. Therefore the permitted sets can be regarded as memories stored in the synaptic connections.

Our third theorem states that there are constraints on the groups of permitted and forbidden sets that can be stored by a network. No matter which learning algorithm is used to store memories, active neurons cannot be divided arbitrarily into permitted and forbidden sets, because subsets of permitted sets have to be permitted and supersets of forbidden sets have to be forbidden.

1 Basic definitions

Our theory is applicable to the network dynamics

dx_i/dt + x_i = [ b_i + Σ_j W_ij x_j ]^+    (1)

where [u]^+ = max{u, 0} is a rectification nonlinearity and the synaptic weight matrix is symmetric, W_ij = W_ji. The dynamics can also be written in the more compact matrix-vector form ẋ + x = [b + Wx]^+. The state of the network is x. An input to the network is an arbitrary vector b. An output of the network is a steady state x̄ in response to b. The existence of outputs and their relationship to the input are determined by the synaptic weight matrix W.

A vector v is said to be nonnegative, v ≥ 0, if all of its components are nonnegative. The nonnegative orthant {v : v ≥ 0} is the set of all nonnegative vectors. It can be shown that any trajectory starting in the nonnegative orthant remains in the nonnegative orthant. Therefore, for simplicity, we will consider initial conditions that are confined to the nonnegative orthant, x ≥ 0.
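As a concrete illustration of the dynamics (1), the following minimal Python sketch integrates the equation by forward Euler. The code is illustrative rather than taken from the original work; the function name, step size, and the two-neuron example are our own choices.

```python
import numpy as np

def simulate(W, b, x0, dt=0.01, steps=50000):
    # Forward-Euler integration of equation (1): dx/dt = -x + [b + W x]^+.
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * (-x + np.maximum(b + W @ x, 0.0))
    return x

# Two neurons with strong mutual inhibition. The pair {1, 2} turns out to
# be forbidden (Section 3), so the dynamics converges to a winner-take-all
# steady state, here approximately x = (1, 0).
W = np.array([[ 0.0, -2.0],
              [-2.0,  0.0]])
b = np.array([1.0, 0.9])
print(simulate(W, b, x0=[0.1, 0.1]))
```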
2 Global asymptotic stability

Definition 1 A steady state x̄ is stable if for all initial conditions sufficiently close to x̄, the state trajectory remains close to x̄ for all later times.

A steady state is asymptotically stable if for all initial conditions sufficiently close to x̄, the state trajectory converges to x̄.

A set of steady states is globally asymptotically stable if state trajectories converge to one of the steady states from almost all initial conditions; the exceptions are of measure zero.

Definition 2 A principal submatrix A of a square matrix B is a square matrix that is constructed by deleting a certain set of rows and the corresponding columns of B.

The following theorem establishes necessary and sufficient conditions on W for global asymptotic stability.

Theorem 1 If W is symmetric, then the following conditions are equivalent:

1. All nonnegative eigenvectors of all principal submatrices of I − W have positive eigenvalues.

2. The matrix I − W is copositive. That is, x^T (I − W) x > 0 for all nonnegative x, except x = 0.

3. For all b, the network has a nonempty set of steady states that are globally asymptotically stable.

Proof sketch:

• (1) ⇒ (2). Let v* be the minimum of v^T (I − W) v over nonnegative v on the unit sphere. If (2) is false, the minimum value is less than or equal to zero. It follows from Lagrange multiplier methods that the nonzero elements of v* comprise a nonnegative eigenvector of the corresponding principal submatrix of W with eigenvalue greater than or equal to unity.

• (2) ⇒ (3). By the copositivity of I − W, the function L = (1/2) x^T (I − W) x − b^T x is lower bounded and radially unbounded. It is also nonincreasing under the network dynamics in the nonnegative orthant, and constant only at steady states. By the Lyapunov stability theorem, the stable steady states are globally asymptotically stable. In the language of optimization theory, the network dynamics converges to a local minimum of L subject to the nonnegativity constraint x ≥ 0.

• (3) ⇒ (1). Suppose that (1) is false. Then there exists a nonnegative eigenvector of a principal submatrix of W with eigenvalue greater than or equal to unity. This can be used to construct an unbounded trajectory of the dynamics. ∎

The meaning of these stability conditions is best appreciated by comparison with the analogous conditions for the purely linear network obtained by dropping the rectification from (1). In a linear network, all eigenvalues of W would have to be smaller than unity to ensure asymptotic stability. Here, due to the rectification, only nonnegative eigenvectors are able to grow without bound, so that only their eigenvalues must be less than unity. All principal submatrices of W must be considered, because different sets of feedback connections are active, depending on the set of neurons that are above threshold. In a linear network, I − W would have to be positive definite to ensure asymptotic stability, but because of the rectification, here this condition is replaced by the weaker condition of copositivity.

The conditions of Theorem 1 for global asymptotic stability depend only on W, not on b. On the other hand, steady states do depend on b. The next lemma says that the mapping from input to output is surjective.

Lemma 1 For any nonnegative vector v ≥ 0 there exists an input b such that v is a steady state of equation (1) with input b.

Proof: Define c = v − ΣWΣv, where Σ = diag(σ_1, ..., σ_N) and σ_i = 1 if v_i > 0 and σ_i = 0 if v_i = 0. Choose b_i = c_i for v_i > 0 and b_i = −1 − (ΣWΣv)_i for v_i = 0. ∎

This lemma states that any nonnegative vector can be realized as a fixed point. Sometimes this fixed point is stable, such as in networks subject to Theorem 1 in which only a single neuron is active. Indeed, the principal submatrix of I − W corresponding to a single active neuron is a diagonal element, which according to condition (1) must be positive. Hence it is always possible to activate only a single neuron at an asymptotically stable fixed point. However, as will become clear from the following theorem, not all nonnegative vectors can be realized as asymptotically stable fixed points.
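For small networks, condition 1 of Theorem 1 can be verified by brute force. The sketch below is our own illustration (the function name and tolerance are arbitrary); it enumerates every principal submatrix of I − W and tests each sign-uniform eigenvector.

```python
import itertools
import numpy as np

def satisfies_condition1(W, tol=1e-9):
    # Condition 1 of Theorem 1: every nonnegative eigenvector of every
    # principal submatrix of I - W must have a positive eigenvalue.
    # Cost grows as 2^N, so this is only practical for small networks.
    N = W.shape[0]
    M = np.eye(N) - W
    for k in range(1, N + 1):
        for S in itertools.combinations(range(N), k):
            vals, vecs = np.linalg.eigh(M[np.ix_(S, S)])
            for lam, v in zip(vals, vecs.T):
                # Nonnegative up to an overall sign flip. Caveat: this can
                # miss nonnegative eigenvectors hidden inside degenerate
                # eigenspaces, where no returned basis vector is sign-uniform.
                if (np.all(v >= -tol) or np.all(v <= tol)) and lam <= tol:
                    return False
    return True
```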
3 Forbidden and permitted sets

The following characterizations of stable steady states are based on the interlacing theorem [4]. This theorem says that if A is an (n−1)-by-(n−1) principal submatrix of an n-by-n symmetric matrix B, then the eigenvalues of A fall in between the eigenvalues of B. In particular, the largest eigenvalue of A is never larger than the largest eigenvalue of B.

Definition 3 A set of neurons is permitted if the neurons can be coactivated at an asymptotically stable steady state for some input b. On the other hand, a set of neurons is forbidden if they cannot be coactivated at an asymptotically stable steady state, no matter what the input b.

Alternatively, we might have defined a permitted set as a set for which the corresponding square submatrix of I − W has only positive eigenvalues. Similarly, a forbidden set could be defined as a set for which there is at least one non-positive eigenvalue. It follows from Theorem 1 that if the matrix I − W is copositive, then the eigenvectors corresponding to non-positive eigenvalues of forbidden sets must have both positive and non-positive components.

Theorem 2 If the matrix I − W is copositive, then the following statements are equivalent:

1. The matrix I − W is not positive definite.

2. There exists a forbidden set.

3. The network is conditionally multistable. That is, there exists an input b such that there is more than one stable steady state.

Proof sketch:

• (1) ⇒ (2). I − W is not positive definite, so there can be no asymptotically stable steady state in which all neurons are active; i.e., the set of all neurons is forbidden.

• (2) ⇒ (3). Denote the forbidden set with k active neurons by Σ. Without loss of generality, assume that the principal submatrix of I − W corresponding to Σ has k − 1 positive eigenvalues and only one non-positive eigenvalue (by virtue of the interlacing theorem and the fact that the diagonal elements of I − W must be positive, there is always a subset of Σ for which this is true). By choosing b_i > 0 for neurons i belonging to Σ and b_j ≪ 0 for neurons j not belonging to Σ, the quadratic Lyapunov function L defined in Theorem 1 forms a saddle in the nonnegative orthant defined by Σ. The saddle point is the point where L, restricted to the hyperplane defined by the k − 1 positive eigenvalues, reaches its minimum. But because neurons can be initialized to lower values of L on either side of the hyperplane, and because L is non-increasing along trajectories, there is no way trajectories can cross the hyperplane. In conclusion, we have constructed an input b for which the network is multistable.

• (3) ⇒ (1). Suppose that (1) is false. Then for all b the Lyapunov function L is convex and so has only a single local minimum in the convex domain x ≥ 0. This local minimum is also the global minimum. The dynamics must converge to this minimum. ∎
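In computational terms, the alternative definition above reduces the permitted/forbidden distinction to a positive-definiteness test on a single principal submatrix. A minimal sketch (our own naming, not from the paper):

```python
import numpy as np

def is_permitted(W, S, tol=1e-9):
    # A set S of neuron indices is permitted iff the principal submatrix
    # of I - W on S has only positive eigenvalues (positive definite).
    S = sorted(S)
    A = (np.eye(W.shape[0]) - W)[np.ix_(S, S)]
    return bool(np.linalg.eigvalsh(A)[0] > tol)
```

Theorem 4 below is visible directly in this test: by the interlacing theorem, deleting rows and the corresponding columns of a positive definite matrix leaves it positive definite.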
If I − W is positive definite, then a symmetric threshold-linear network has a unique steady state, as has been shown previously [5]. The next theorem is an expansion of this result, stating an equivalent condition using the concept of permitted sets.

Theorem 3 If W is symmetric, then the following conditions are equivalent:

1. The matrix I − W is positive definite.

2. All sets are permitted.

3. For all b there is a unique steady state, and it is stable.

Proof:

• (1) ⇒ (2). If I − W is positive definite, then it is copositive. Hence (1) in Theorem 2 is false, and so (2) in Theorem 2 is false; i.e., all sets are permitted.

• (2) ⇒ (1). Suppose (1) is false. Then the set of all neurons must be forbidden, so not all sets are permitted.

• (1) ⇔ (3). See [5]. ∎

The following theorem characterizes the forbidden and the permitted sets.

Theorem 4 Any subset of a permitted set is permitted. Any superset of a forbidden set is forbidden.

Proof: According to the interlacing theorem, if the smallest eigenvalue of a symmetric matrix is positive, then so are the smallest eigenvalues of all its principal submatrices. And if the smallest eigenvalue of a principal submatrix is negative, then so is the smallest eigenvalue of the original matrix. ∎

4 An example: the ring network

A symmetric threshold-linear network with local excitation and longer-range inhibition has been studied in the past as a model for how simple cells in primary visual cortex obtain their orientation tuning to visual stimulation [6, 7]. Inspired by these results, we have recently built an electronic circuit containing a ring network, using analog VLSI technology [3]. We have argued that the fixed tuning width of the neurons in the network arises because active sets consisting of more than a fixed number of contiguous neurons are forbidden. Here we give a more detailed account of this fact and provide a surprising result about the existence of some spurious permitted sets.

Let the synaptic matrix of a 10-neuron ring network be translationally invariant. The connection between neurons i and j is given by W_ij = −β + α_0 δ_ij + α_1 (δ_{i,j+1} + δ_{i+1,j}) + α_2 (δ_{i,j+2} + δ_{i+2,j}), where β quantifies global inhibition, α_0 self-excitation, α_1 first-neighbor lateral excitation, and α_2 second-neighbor lateral excitation. In Figure 1 we have numerically computed the permitted sets of this network, with the parameters taken from [3], i.e., α_0 = 0, α_1 = 1.1, α_2 = 1, β = 0.55. The permitted sets were determined by diagonalizing the 2^10 square submatrices of I − W and by classifying the eigenvalues corresponding to nonnegative eigenvectors. Figure 1 shows the resulting parent permitted sets (those that have no permitted supersets). Consistent with the finding that such ring networks can explain contrast-invariant tuning of V1 cells and multiplicative response modulation of parietal cells, we found that there are no permitted sets that consist of more than 5 contiguous active neurons. However, as can be seen, there are many non-contiguous permitted sets that could in principle be activated by exciting neurons in white and strongly inhibiting neurons in black.
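The numerical experiment just described can be reproduced with a short script. The following sketch is our own reconstruction under the stated parameters; we assume the Kronecker deltas wrap around modulo 10, as translational invariance on a ring requires.

```python
import itertools
import numpy as np

def ring_weights(n=10, beta=0.55, a0=0.0, a1=1.1, a2=1.0):
    # W_ij = -beta + a0*delta_ij + a1*(first neighbors) + a2*(second
    # neighbors), neighbor indices taken modulo n (ring topology).
    W = np.full((n, n), -beta) + a0 * np.eye(n)
    for i in range(n):
        for d, a in ((1, a1), (2, a2)):
            W[i, (i + d) % n] += a
            W[(i + d) % n, i] += a
    return W

W = ring_weights()
n = W.shape[0]
M = np.eye(n) - W
# A set is permitted iff the corresponding principal submatrix of I - W
# is positive definite (Section 3); enumerate all 2^10 - 1 nonempty sets.
permitted = [S for k in range(1, n + 1)
             for S in itertools.combinations(range(n), k)
             if np.linalg.eigvalsh(M[np.ix_(S, S)])[0] > 1e-9]
# Parent permitted sets are permitted sets with no permitted strict superset.
parents = [S for S in permitted
           if not any(set(S) < set(T) for T in permitted)]
print(len(parents), "parent permitted sets")
```

Up to the left-right and translation symmetries excluded in Figure 1, this enumeration should recover the 9 parent permitted sets reported there.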
Because the activation of the spurious permitted sets requires highly specific input (inhibition of high spatial frequency), it can be argued that the presence of the spurious permitted sets is not relevant for the normal operation of the ring network, where inputs are typically tuned and excitatory (such as inputs from the LGN to primary visual cortex).

Figure 1: Left: Output of a ring network of 10 neurons to uniform input (random initial condition). Right: The 9 parent permitted sets (x-axis: neuron number; y-axis: permitted set number). White means that a neuron belongs to a set and black means that it does not. Left-right and translation-symmetric versions of the parent permitted sets shown have been excluded. The first parent permitted set (first row from the bottom) corresponds to the output on the left.

5 Discussion

We have shown that pattern memorization in threshold-linear networks can be viewed in terms of permitted sets of neurons, i.e., sets of neurons that can be coactive at a steady state. According to this definition, the memories are stored by the synaptic weights, independently of the inputs. Hence, this concept of memory does not suffer from input-dependence, as would be the case for a definition of memory based on the fixed points of the dynamics.

Pattern retrieval is strongly constrained by the input. A typical input will not allow for the retrieval of arbitrary stored permitted sets. This comes from the fact that multistability depends not just on the existence of forbidden sets, but also on the input (Theorem 2). For example, in the ring network, positive input will always retrieve permitted sets consisting of a group of contiguous neurons, but not any of the spurious permitted sets (Figure 1). Generally, multistability in the ring network is only possible when more than a single neuron is excited.

Notice that threshold-linear networks can behave as traditional attractor networks when the inputs are represented as initial conditions of the dynamics. For example, by fixing b = 1 and initializing a copositive network with some input, the permitted sets unequivocally determine the stable fixed points. Thus, in this case, the notion of permitted sets is no different from that of fixed point attractors. However, the hierarchical grouping of permitted sets (Theorem 4) becomes irrelevant, since there can be only one attractive fixed point per hierarchical group defined by a parent permitted set.

The fact that no permitted set can have a forbidden subset represents a constraint on the possible computations of symmetric networks. However, this constraint does not have to be viewed as an undesired limitation. On the contrary, being aware of this constraint may lead to a deeper understanding of learning algorithms and representations for constraint satisfaction problems.
We  are reminded of the history of \nperceptrons, where the insight that they can only solve linearly separable classifica(cid:173)\ntion problems led to the invention of multilayer perceptrons and backpropagation. \nIn a  similar way,  grouping problems that do  not  obey the natural hierarchy inher(cid:173)\nent in symmetric networks, might necessitate the introduction of hidden neurons to \nrealize the right  geometry.  For the interested reader,  see  also  [8]  for  a  simple pro(cid:173)\ncedure of how to store a given family  of possibly overlapping patterns as permitted \nsets. \n\nReferences \n[1]  J. J. Hopfield.  Neurons with graded response have collective properties like those \n\nof two-state neurons.  Proc.  Natl.  Acad.  Sci.  USA,  81:3088- 3092, 1984. \n\n[2]  M.A. Cohen and S. Grossberg. Absolute stability of global pattern formation and \nparallel memory storage by competitive neural networks.  IEEE  Transactions  on \nSystems,  Man  and  Cybernetics,  13:288- 307,1983. \n\n[3]  Richard H.R.  Hahnloser,  Rahul Sarpeshkar, Misha Mahowald,  Rodney  J . Dou(cid:173)\nglas,  and Sebastian Seung.  Digital  selection  and  ananlog  amplification  coexist \nin  a  silicon  circuit inspired by cortex.  Nature,  405:947- 51,  2000. \n\n[4]  R.A.  Horn  and  C.R.  Johnson.  Matrix  analysis.  Cambridge  University  Press, \n\n1985. \n\n[5]  J.  Feng  and  K.P.  Hadeler.  Qualitative  behaviour of some  simple  networks.  J. \n\nPhys.  A:,  29:5019- 5033, 1996. \n\n[6]  R. Ben-Yishai, R. Lev Bar-Or, and H. Sompolinsky. Theory of orientation tuning \n\nin  visual cortex.  Proc.  Natl.  Acad.  Sci.  USA,  92:3844- 3848, 1995. \n\n[7]  R.J. Douglas, C.  Koch,  M.A.  Mahowald, K.A.C.  Martin, and H.  Suarez.  Recur(cid:173)\n\nrent excitation in neocortical circuits.  Science, 269:981- 985,  1995. \n\n[8]  Xie  Xiaohui,  Richard  H.R.  Hahnloser,  and Sebastian Seung.  Learning  winner(cid:173)\ntake-all  competition  between  groups of neurons  in  lateral inhibitory  networks. \nIn  Proceedings  of NIPS2001  - Neural Information  Processing  Systems:  Natural \nand Synthetic,  2001. \n\n\f", "award": [], "sourceid": 1793, "authors": [{"given_name": "Richard", "family_name": "Hahnloser", "institution": null}, {"given_name": "H. Sebastian", "family_name": "Seung", "institution": null}]}