{"title": "Learning to Predict Visibility and Invisibility from Occlusion Events", "book": "Advances in Neural Information Processing Systems", "page_first": 816, "page_last": 822, "abstract": null, "full_text": "Learning to Predict \n\nVisibility and Invisibility \n\nfrom  Occlusion Events \n\nJonathan A.  Marshall \n\nRichard K.  Alley \n\nRobert S.  Hubbard \n\nDepartment of Computer Science,  CB  3175, Sitterson  Hall \n\nUniversity of North Carolina, Chapel Hill, NC  27599-3175, U.S.A. \n\nmarshall@cs.unc.edu, 919-962-1887, fax  919-962-1799 \n\nAbstract \n\nVisual  occlusion  events  constitute  a  major source  of depth  information. \nThis paper presents a self-organizing neural network that learns to detect, \nrepresent,  and predict the visibility and invisibility relationships that arise \nduring  occlusion  events,  after  a  period  of exposure  to  motion sequences \ncontaining occlusion  and  disocclusion events.  The  network develops  two \nparallel opponent  channels  or  \"chains\"  of lateral  excitatory  connections \nfor  every  resolvable  motion trajectory.  One  channel,  the  \"On\"  chain or \n\"visible\"  chain, is  activated when a moving stimulus is visible.  The other \nchannel, the  \"Off\"  chain or  \"invisible\"  chain,  carries  a persistent,  amodal \nrepresentation that predicts the motion of a formerly visible stimulus that \nbecomes  invisible  due  to occlusion.  The  learning  rule  uses  disinhibition \nfrom  the  On  chain  to  trigger  learning  in  the  Off  chain.  The  On  and \nOff chain  neurons  can  learn  separate  associations  with  object  depth  or(cid:173)\ndering.  The results  are closely  related  to the  recent  discovery  (Assad  & \nMaunsell,  1995)  of neurons in macaque monkey posterior  parietal cortex \nthat respond selectively to inferred  motion of invisible stimuli. 
\n\n1  INTRODUCTION: LEARNING ABOUT OCCLUSION EVENTS \n\nVisual occlusion events constitute a major source of depth information. Yet little is known about the neural mechanisms by which visual systems use occlusion events to infer the depth relations among visual objects. What is the structure of such mechanisms? Some possible answers to this question are revealed through an analysis of learning rules that can cause such mechanisms to self-organize. \n\nEvidence from psychophysics (Kaplan, 1969; Nakayama & Shimojo, 1992; Nakayama, Shimojo, & Silverman, 1989; Shimojo, Silverman, & Nakayama, 1988, 1989; Yonas, Craton, & Thompson, 1987) and neurophysiology (Assad & Maunsell, 1995; Frost, 1993) suggests that the process of determining relative depth from occlusion events operates at an early stage of visual processing. Marshall (1991) describes evidence that suggests that the same early processing mechanisms maintain a representation of temporarily occluded objects for some amount of time after they have disappeared behind an occluder, and that these representations of invisible objects interact with other object representations, in much the same manner as do representations of visible objects. The evidence includes the phenomena of kinetic subjective contours (Kellman & Cohen, 1984), motion viewed through a slit (Parks' Camel) (Parks, 1965), illusory occlusion (Ramachandran, Inada, & Kiama, 1986), and interocular occlusion sequencing (Shimojo, Silverman, & Nakayama, 1988). 
\n\n2  PERCEPTION OF OCCLUSION AND DISOCCLUSION EVENTS: AN ANALYSIS \n\nThe neural network model exploits the visual changes that occur at occlusion boundaries to form a mechanism for detecting and representing object visibility/invisibility information. The set of learning rules used in this model is an extended version of one that has been used before to describe the formation of neural mechanisms for a variety of other visual processing functions (Hubbard & Marshall, 1994; Marshall, 1989, 1990a, c, 1991, 1992; Martin & Marshall, 1993). \n\nOur analysis is derived from the following visual predictivity principle, which may be postulated as a fundamental principle of neural organization in visual systems: Visual systems represent the world in terms of predictions of its appearance, and they reorganize themselves to generate better predictions. To maximize the correctness and completeness of its predictions, a visual system would need to predict the motions and visibility/invisibility of all objects in a scene. Among other things, it would need to predict the disappearance of an object moving behind an occluder and the reappearance of an object emerging from behind an occluder. \n\nA consequence of this postulate is that occluded objects must, at some level, continue to be represented even though they are invisible. Moreover, the representation of an object must distinguish whether the object is visible or invisible; otherwise, the visual system could not determine whether its representations predict visibility or invisibility, which would contravene the predictivity principle. Thus, simple single-channel prediction schemes like the one described by Marshall (1989, 1990a) are inadequate to represent occlusion and disocclusion events. 
\n\n3  A MODEL FOR GROUNDED LEARNING TO PREDICT VISIBILITY AND INVISIBILITY \n\nThe initial structure of the Visible/Invisible network model is given in Figure 1A. The network self-organizes in response to a training regime containing many input sequences representing motion with and without occlusion and disocclusion events. After a period of self-organization, the specific connections that a neuron receives (Figure 1B) determine whether it responds to visible or invisible objects. A neuron that responds to visible objects would have strong bottom-up input connections, and it would also have strong time-delayed lateral excitatory input connections. A neuron that responds selectively to invisible objects would not have strong bottom-up connections, but it would have strong lateral excitatory input connections. These lateral inputs would transmit to the neuron evidence that a previously visible object existed. The neurons that respond to invisible objects must operate in a way that allows lateral input excitation alone to activate the neurons supraliminally, in the absence of bottom-up input excitation from actual visible objects. \n\n4  SIMULATION OF A SIMPLIFIED NETWORK \n\n4.1  INITIAL NETWORK STRUCTURE \n\nThe simulated network, shown in Figure 2, is a simplified one-dimensional subnetwork (Marshall & Alley, 1993) of the more general two-dimensional network. Layer 1 is restricted to a set of motion-sensitive neurons corresponding to one rightward motion trajectory. \n\nThe L+ connections in the simulation have a signal transmission latency of one time unit. Restricting the lateral connections to a single time delay and to a single direction limits the simulation to representing a single speed and direction of motion; these results are therefore preliminary. 
This restriction reduced the number of connections and made the simulation much faster. \n\nFigure 1: Model of a self-organized occlusion-event detector network. (A) Network is initially organized nonspecifically, so that each neuron receives roughly homogeneous input connections: feedforward, bottom-up excitatory (\"B+\") connections from a preprocessing stage of motion-tuned neurons (bottom-up solid arrows), lateral inhibitory (\"L-\") connections (dotted arrows), and time-delayed lateral excitatory (\"L+\") connections (lateral solid arrows). (B) After exposure during a developmental period to many motion sequences containing occlusion and disocclusion events, the network learns a highly specific connection structure. The previously homogeneous network bifurcates into two parallel opponent channels for every resolvable motion trajectory: some neurons keep their bottom-up connections and others lose them. The channels for one trajectory are shown. Neurons from the two opponent channels are strongly linked by lateral inhibitory connections (dotted arrows). Time-delayed lateral excitatory connections cause stimulus information (priming excitation, or \"prediction signals\") to propagate along the channels. \n\nFigure 2: Simulation results. (Left) Simulated network structure before training. Neurons in Layer 2 are wired homogeneously from the input layer (Layer 1). (Right) After training, some of the neurons lose their bottom-up input connections. \n\n4.2  USING DISINHIBITION TO CONTROL THE LEARNING OF OCCLUSION RELATIONS \n\nThis paper describes one method for learning occlusion relations. Other methods may also work. 
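As an illustration of how priming excitation travels along such a chain, the following minimal sketch (ours, not the paper's simulation code; the decay factor and the function names are assumptions) propagates activation rightward with the one-time-unit L+ delay described above:

```python
# Minimal sketch of priming propagation along a chain of time-delayed
# lateral excitatory (L+) connections (illustrative only; the decay
# factor and names are assumptions, not the paper's equations).
def step(prev_acts, bottom_up, decay=0.5):
    # New activation = bottom-up (B+) input plus one-time-unit-delayed
    # lateral excitation from the left (trajectory-wise) neighbor.
    n = len(prev_acts)
    lateral = [0.0] + [decay * prev_acts[i] for i in range(n - 1)]
    return [bottom_up[i] + lateral[i] for i in range(n)]

# A stimulus moving rightward across a 5-neuron trajectory:
acts = [0.0] * 5
for t in range(3):
    bu = [1.0 if i == t else 0.0 for i in range(5)]  # visible at position t
    acts = step(acts, bu)
# Successive activations grow toward an asymptote (1.0, 1.5, 1.75, ...),
# matching the evidence-accumulation behavior reported in Section 4.6.
```

The one-slot shift in `lateral` encodes the single rightward time-delayed connection per neuron that limits this simplified network to one speed and direction.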
The method involves extending the EXIN (excitatory+inhibitory) learning scheme described by Marshall (1992, 1995). The EXIN scheme uses a variant of a Hebb rule to govern learning in the bottom-up and time-delayed lateral excitatory connections, plus an anti-Hebb rule to govern learning in the lateral inhibitory connections. \n\nThe EXIN system was extended by letting inhibitory connections exert a disinhibitory effect under certain regulated conditions. The disinhibition rule was chosen because it constitutes a simple way that the unexpected failure of a neuron to become activated (e.g., when an object disappears behind an occluder) can cause some other neuron to become activated. That other neuron can then learn, becoming selective for invisible object motion. Thus, the representations of visible objects are protected from losing their bottom-up input connections during occlusion events. \n\nIn this way, the network can learn separate representations for visible and invisible stimuli. The representations of invisible objects are allowed to develop only to the extent that the neurons representing visible objects explicitly disclaim the \"right\" to represent the objects. These properties prevent the network from losing complete grounded contact with actual bottom-up visual input, while at the same time allowing some neurons to lose their direct bottom-up input connections. \n\nThe disinhibition produces an excitatory response at the target neurons. Disinhibition is generated according to the following rule: When a neuron has strong, active lateral excitatory input connections and strong but inactive bottom-up input connections, then it tends to disinhibit the neurons to which it projects inhibitory connections. This implements a type of differencing operation between lateral and bottom-up excitation. Because the disinhibition tends to excite the recipient neurons, it causes one (or possibly more) of the recipient neurons to become active and thereby enables that neuron to learn. \n\nThe lateral excitation that a neuron receives can be viewed as a prediction of the neuron's activation. If that prediction is not matched by actual bottom-up excitation, then a shortfall (prediction failure) has occurred, probably indicating an occlusion event. \n\nEach neuron's disinhibition input was combined with its bottom-up excitatory input and its lateral excitatory input to form a total excitatory input signal. Either bottom-up excitation or disinhibition alone could contribute toward a neuron's excitation. However, lateral excitation could merely amplify the other signals and could not alone excite the neuron. This prevented neurons from learning in response to lateral excitation alone. \n\n4.3  DISINHIBITION LETS THE NETWORK LEARN TO RESPOND TO INVISIBLE OBJECTS \n\nDuring continuous motion sequences, without occlusion or disocclusion, the system operates similarly to a system with the standard EXIN learning rules (Marshall, 1990b, 1995): lateral excitatory \"chains\" of connections are learned across sequences of neurons along a motion trajectory. 
Marshall (1990a) showed that such chains form in 2-D networks with multiple speeds and multiple directions of motion. \n\nDuring occlusion events, some predictive lateral excitatory signals reach neurons that have strong but inactive bottom-up excitatory connections. The neurons reached by this excitation pattern disinhibit, rather than inhibit, their competitor neurons. Over the course of many occlusion events, such neurons become increasingly selective for the inferred motion of an invisible object: their bottom-up input connections weaken, and their lateral inhibitory input connections strengthen. \n\nMore than one neuron receives L+ signals after every neuron activation; the recipients of each neuron's L+ output connections represent the (learned) possible sequents of the neuron's activation. But at most one of those sequents actually receives both B+ and L+ signals: the one that corresponds to the actual stimulus. This winner neuron receives the disinhibition from the other neurons receiving L+ excitation; its competitive advantage over the other neurons is thus reinforced. \n\n4.4  SIMULATION TRAINING \n\nThe sequences of input training data consisted of a single visual feature moving with constant velocity across the 1-D visual field. When this stimulus was visible, its presence was indicated by strong activation of an input neuron in Layer 1. While occluded, the stimulus would produce no activation in Layer 1. The stimulus occasionally disappeared \"behind\" an occluder and reappeared at a later time and spatial position farther along the same trajectory. After some duration, the stimulus was removed and replaced by a new stimulus. The starting positions and lifetimes of the stimuli and occluders were varied randomly within a fixed range. 
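The disinhibition rule and the input-combination rule of Section 4.2 can be summarized in a small sketch. This is a hedged illustration: the thresholds, the gain, and the function names are our assumptions, since the paper states the rules qualitatively rather than as explicit equations.

```python
# Sketch of the disinhibition rule of Section 4.2 (illustrative; the
# thresholds 'strong' and 'active' and the gain are assumptions).
def disinhibits(l_weight, l_act, b_weight, b_act, strong=0.5, active=0.5):
    # A neuron with strong, ACTIVE lateral excitatory (L+) input but
    # strong, INACTIVE bottom-up (B+) input has a prediction failure:
    # it disinhibits the neurons it projects inhibitory connections to.
    strong_active_lateral = l_weight >= strong and l_act >= active
    strong_inactive_bottom_up = b_weight >= strong and b_act < active
    return strong_active_lateral and strong_inactive_bottom_up

def total_excitation(bottom_up, disinhibition, lateral, gain=0.5):
    # B+ excitation or disinhibition alone can excite the neuron;
    # lateral excitation merely amplifies and cannot excite by itself.
    base = bottom_up + disinhibition
    return base * (1.0 + gain * lateral) if base > 0.0 else 0.0

# During occlusion: primed neuron with silent bottom-up input -> True.
assert disinhibits(0.8, 1.0, 0.8, 0.0)
# Normal visible motion: bottom-up input is active -> no disinhibition.
assert not disinhibits(0.8, 1.0, 0.8, 1.0)
# Lateral excitation alone cannot activate (keeps learning grounded).
assert total_excitation(0.0, 0.0, 1.0) == 0.0
```

The multiplicative form of `total_excitation` is one simple way to realize the stated constraint that lateral excitation can only amplify bottom-up excitation or disinhibition, never substitute for them.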
\n\nThe network was trained for 25,000 input pattern presentations. The stability of the connection weights was verified by additional training for 50,000 presentations. \n\n4.5  SIMULATION RESULTS: ARCHITECTURE \n\nThe second stage of neurons gradually underwent a self-organized bifurcation into two distinct pools of neurons, as shown in Figure 2B. These pools consist of two parallel opponent channels or \"chains\" of lateral excitatory connections for every resolvable motion trajectory. One channel, the \"On\" chain or \"visible\" chain, was active when a moving stimulus became visible. The other channel, the \"Off\" chain or \"invisible\" chain, was active when a formerly visible stimulus became invisible. The model is thus named the Visible/Invisible model. The bifurcation may be analogous to the activity-dependent stratification of cat retinal ganglion cells into separate On and Off layers, described by Bodnarenko and Chalupa (1993). \n\n4.6  SIMULATION RESULTS: OPERATION \n\nThe On chain carries a predictive modal representation of the visible stimulus. The Off chain carries a persistent, amodal representation that predicts the motion of the invisible stimulus. The shading of the neurons in Figure 3 shows the neuron activations of the final, trained network simulation during an occlusion-disocclusion sequence. The following noteworthy behaviors were observed in the test. \n\n\u2022 When the stimulus was visible, it was represented by activation in the On channel. \n\n\u2022 When the stimulus became invisible, its representation was carried in the Off channel. The Off channel did not become active until the visible stimulus disappeared. \n\n\u2022 The activations representing the visible stimulus became stronger (toward an asymptote) at successive spatial positions, because of the propagation of accumulating evidence for the presence of the stimulus (Martin & Marshall, 1993). \n\n\u2022 The activation representing the invisible stimulus decayed at successive spatial positions. Thus, representations of invisible stimuli did not remain active indefinitely. \n\n\u2022 When the stimulus reappeared (after a sufficiently brief occlusion), its activation in the On channel was greater than its initial activation in the On channel. Thus, the representation carried across the Off channel helps maintain the perceptual stability of the stimulus despite its being temporarily occluded along parts of its trajectory. \n\nFigure 3: Simulated network operation after learning. The learning procedure causes the representation of each trajectory to split into two parallel opponent channels. The Visible and Invisible channel pair for a single trajectory are shown. The display has been arranged so that all the Visible channel neurons are on the same row (Layer 2, lower row); likewise the Invisible channel neurons (Layer 2, upper row). Solid arrows indicate excitatory connections. Gray arrows indicate lateral inhibitory connections. (Left) The network's responses to an unbroken rightward motion of the stimulus are shown. The activities of the network at successive moments in time have been combined into a single network display; each horizontal position in the figure represents a different moment in time as well as a different position in the network. 
The stimulus successively activates motion detectors (solid circles) in Layer 1. The activation of the responding neuron in the second layer builds toward an asymptote, reaching full activation by the fourth frame. (Right) The network's responses to a broken (occluded) rightward motion sequence are shown. When the stimulus reaches the region indicated by gray shading, it disappears behind a simulated occluder. The network responds by successively activating neurons in the Invisible channel. When the stimulus emerges from behind the occluder (end of gray shading), it is again represented by activation in the Visible channel. \n\n5  DISCUSSION \n\n5.1  PSYCHOPHYSICAL ISSUES AND PREDICTIONS \n\nSeveral visual phenomena (Burr, 1980; Piaget, 1954; Shimojo, Silverman, & Nakayama, 1988) support the notion that early processing mechanisms maintain a dynamic representation of temporarily occluded objects for some amount of time after they disappear (Marshall, 1991). In general, the duration of such representations should vary as a function of many factors, including top-down cognitive expectations, stimulus complexity, and Gestalt grouping. \n\n5.2  ALTERNATIVE MECHANISMS \n\nAnother model besides the Visible/Invisible model was studied extensively: a Visible/Virtual system, which would develop some neurons that respond to visible objects and others that respond to both visible and invisible objects (i.e., to \"virtual\" objects). There is a functional equivalence between such a Visible/Virtual system and a Visible/Invisible system: the same information about visibility and invisibility can be determined by examining the activations of the neurons. 
Activity in a Virtual channel neuron, paired with inactivity in a corresponding Visible channel neuron, would indicate the presence of an invisible stimulus. \n\n5.3  NEUROPHYSIOLOGICAL CORRELATES \n\nAssad and Maunsell (1995) recently described their remarkable discovery of neurons in macaque monkey posterior parietal cortex that respond selectively to the inferred motion of invisible stimuli. This type of neuron responded more strongly to the disappearance and reappearance of a stimulus in a task where the stimulus' \"inferred\" trajectory would pass through the neuron's receptive field than in a task where the stimulus would disappear and reappear in the same position. Most of these neurons also had a strong off-response, which in the present models is closely correlated with inferred motion. Thus, the results of Assad and Maunsell (1995) are more directly consistent with the Visible/Virtual model than with the Visible/Invisible model. Although this paper describes only one of these models, both models merit investigation. \n\n5.4  LEARNING ASSOCIATIONS BETWEEN VISIBILITY AND RELATIVE DEPTH \n\nThe activation of neurons in the Off channels is highly correlated with the activation of other neurons elsewhere in the visual system, specifically neurons whose activation indicates the presence of other objects acting as occluders. Simple associative Hebb-type learning lets such occluder-indicator neurons and the Off channel neurons gradually establish reciprocal excitatory connections to each other. 
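A minimal sketch of this Hebb-type association (ours, for illustration only; the learning rate and weight ceiling are assumptions, not values from the paper) shows how repeated co-activation of an Off-channel neuron and an occluder-indicator neuron strengthens their reciprocal connection:

```python
# Minimal sketch of the Hebb-type association between an Off-channel
# neuron and an occluder-indicator neuron (illustrative; the learning
# rate and weight ceiling are assumptions, not values from the paper).
def hebb_update(w, pre_act, post_act, rate=0.1, w_max=1.0):
    # Strengthen the connection in proportion to pre/post co-activity,
    # saturating at w_max.
    return min(w_max, w + rate * pre_act * post_act)

w = 0.0
for _ in range(20):   # repeated co-activation during occlusion events
    w = hebb_update(w, 1.0, 1.0)
# w has saturated at w_max: the occluder location can now prime the
# Off channel, favoring a prediction of invisibility at that position.
```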
\n\nAfter such reciprocal excitatory connections have been learned, activation of occluder-indicator neurons at a given spatial position causes the network to favor the Off channel in its predictions - i.e., to predict that a moving object will be invisible at that position. Thus, the network learns to use occlusion information to generate better predictions of the visibility/invisibility of objects. \n\nConversely, the activation of Off channel neurons causes the occluder-indicator neurons to receive excitation. The disappearance of an object excites the representation of an occluder at that location. If the representation of the occluder was not previously activated, then the excitation from the Off channel may even be strong enough to activate it alone. Thus, disappearance of moving visual objects constitutes evidence for the presence of an inferred occluder. These results will be described in a later paper. \n\n5.5  LIMITATIONS AND FUTURE WORK \n\nThe Visible/Invisible model presented in this paper describes some of the processes that may be involved in detecting and representing depth from occlusion events. There are other major issues that have not been addressed in this paper. For example, how can the system handle real 2-D or 3-D objects, composed of many visual features grouped together across space, instead of mere point stimuli? How can it handle partial occlusion of objects? How can it handle nonlinear trajectories? How exactly can the associative links between occluding and occluded objects be formed? How can it handle transparency? \n\n6  CONCLUSIONS \n\nPerception of relative depth from occlusion events is a powerful, useful, but poorly understood capability of human and animal visual systems. 
We have presented an analysis based on predictivity: a visual system that can predict the visibility/invisibility of objects during occlusion events possesses (ipso facto) a good representation of relative depth. The analysis implies that the representations for visible and invisible objects must be distinguishable. We have implemented a model system in which distinct representations for visible and invisible features self-organize in response to exposure to motion sequences containing simulated occlusion and disocclusion events. When a moving feature fails to appear approximately where and when it is predicted to appear, the mismatch between prediction and the actual image triggers an unsupervised learning rule. Over many motions, the learning leads to a bifurcation of a network layer into two parallel opponent channels of neurons. Prediction signals in the network are carried along motion trajectories by specific chains of lateral excitatory connections. These chains also cause the representation of invisible features to propagate for a limited time along the features' trajectories. The network uses shortfall (differencing) and disinhibition to maintain grounding of the representations of invisible features. \n\nAcknowledgements \n\nSupported in part by ONR (N00014-93-1-0208), NEI (EY09669), a UNC-CH Junior Faculty Development Award, an ORAU Junior Faculty Enhancement Award from Oak Ridge Associated Universities, the Univ. of Minnesota Center for Research in Learning, Perception, and Cognition, NICHHD (HD-07151), and the Minnesota Supercomputer Institute. 
\n\nWe thank Kevin Martin, Stephen Aylward, Eliza Graves, Albert Nigrin, Vinay Gupta, George Kalarickal, Charles Schmitt, Viswanath Srikanth, David Van Essen, Christof Koch, and Ennio Mingolla for valuable discussions. \n\nReferences \n\nAssad JA, Maunsell JHR (1995) Neuronal correlates of inferred motion in macaque posterior parietal cortex. Nature 373:518-521. \n\nBodnarenko SR, Chalupa LM (1993) Stratification of On and Off ganglion cell dendrites depends on glutamate-mediated afferent activity in the developing retina. Nature 364:144-146. \n\nBurr D (1980) Motion smear. Nature 284:164-165. \n\nFrost BJ (1993) Subcortical analysis of visual motion: Relative motion, figure-ground discrimination and induced optic flow. Visual Motion and Its Role in the Stabilization of Gaze, Miles FA, Wallman J (Eds). Amsterdam: Elsevier Science, 159-175. \n\nHubbard RS, Marshall JA (1994) Self-organizing neural network model of the visual inertia phenomenon in motion perception. Technical Report 94-001, Department of Computer Science, University of North Carolina at Chapel Hill. 26 pp. \n\nKaplan GA (1969) Kinetic disruption of optical texture: The perception of depth at an edge. Perception & Psychophysics 6:193-198. \n\nKellman PJ, Cohen MH (1984) Kinetic subjective contours. Perception & Psychophysics 35:237-244. \n\nMarshall JA (1989) Self-organizing neural network architectures for computing visual depth from motion parallax. Proceedings of the International Joint Conference on Neural Networks, Washington DC, II:227-234. \n\nMarshall JA (1990a) Self-organizing neural networks for perception of visual motion. Neural Networks 3:45-74. \n\nMarshall JA (1990b) A self-organizing scale-sensitive neural network. 
Proceedings of the International Joint Conference on Neural Networks, San Diego, CA, III:649-654. \n\nMarshall JA (1990c) Adaptive neural methods for multiplexing oriented edges. Intelligent Robots and Computer Vision IX: Neural, Biological, and 3-D Methods, Casasent DP (Ed), Proceedings of the SPIE 1382, Boston, MA, 282-291. \n\nMarshall JA (1991) Challenges of vision theory: Self-organization of neural mechanisms for stable steering of object-grouping data in visual motion perception. Stochastic and Neural Methods in Signal Processing, Image Processing, and Computer Vision, Chen SS (Ed), Proceedings of the SPIE 1569, San Diego, CA, 200-215. \n\nMarshall JA (1992) Unsupervised learning of contextual constraints in neural networks for simultaneous visual processing of multiple objects. Neural and Stochastic Methods in Image and Signal Processing, Chen SS (Ed), Proceedings of the SPIE 1766, San Diego, CA, 84-93. \n\nMarshall JA (1995) Adaptive perceptual pattern recognition by self-organizing neural networks: Context, uncertainty, multiplicity, and scale. Neural Networks 8:335-362. \n\nMarshall JA, Alley RK (1993) A self-organizing neural network that learns to detect and represent visual depth from occlusion events. Proceedings of the AAAI Fall Symposium on Machine Learning and Computer Vision, Bowyer K, Hall L (Eds), 70-74. \n\nMartin KE, Marshall JA (1993) Unsmearing visual motion: Development of long-range horizontal intrinsic connections. Advances in Neural Information Processing Systems, 5, Hanson SJ, Cowan JD, Giles CL (Eds). San Mateo, CA: Morgan Kaufmann Publishers, 417-424. \n\nNakayama K, Shimojo S (1992) Experiencing and perceiving visual surfaces. Science 257:1357-1363. 
\n\nNakayama K, Shimojo S, Silverman GH (1989) Stereoscopic depth: Its relation to image segmentation, grouping, and the recognition of occluded objects. Perception 18:55-68. \n\nParks T (1965) Post-retinal visual storage. American Journal of Psychology 78:145-147. \n\nPiaget J (1954) The Construction of Reality in the Child. New York: Basic Books. \n\nRamachandran VS, Inada V, Kiama G (1986) Perception of illusory occlusion in apparent motion. Vision Research 26:1741-1749. \n\nShimojo S, Silverman GH, Nakayama K (1989) Occlusion and the solution to the aperture problem for motion. Vision Research 29:619-626. \n\nYonas A, Craton LG, Thompson WB (1987) Relative motion: Kinetic information for the order of depth at an edge. Perception & Psychophysics 41:53-59. \n", "award": [], "sourceid": 1164, "authors": [{"given_name": "Jonathan", "family_name": "Marshall", "institution": null}, {"given_name": "Richard", "family_name": "Alley", "institution": null}, {"given_name": "Robert", "family_name": "Hubbard", "institution": null}]}