{"title": "Receptive field structure of flow detectors for heading perception", "book": "Advances in Neural Information Processing Systems", "page_first": 149, "page_last": 156, "abstract": null, "full_text": "Receptive field  structure of flow  detectors \n\nfor  heading perception \n\nJaap  A.  B eintema \n\nDept.  Zoology &  Neurobiology \n\nRuhr University Bochum, Germany,  44780 \nbeintema@neurobiologie.ruhr-uni-bochum.de \n\nAlbert  V.  van den Berg \n\nDept.  of Neuro-ethology,  Helmholtz Institute, \n\nUtrecht University,  The Netherlands \n\na. v. vandenberg@bio.uu.nl \n\nMarkus Lappe \n\nDept.  Zoology &  Neurobiology \n\nRuhr University Bochum, Germany, 44780 \n\nlappe@neurobiologie.ruhr-uni-bochum.de \n\nAbstract \n\nObserver translation relative to the world  creates image flow  that \nexpands from the observer's direction of translation (heading)  from \nwhich  the  observer  can  recover  heading  direction.  Yet,  the  image \nflow  is often more complex, depending on rotation of the eye,  scene \nlayout  and  translation  velocity.  A  number  of  models  [1-4]  have \nbeen  proposed  on  how  the human  visual  system extracts  heading \nfrom  flow  in  a  neurophysiologic ally  plausible  way.  These  models \nrepresent heading by a  set of neurons  that respond to large image \nflow  patterns and receive input from motion sensed at different im(cid:173)\nage  locations.  We  analysed  these  models  to  determine  the  exact \nreceptive  field  of these  heading  detectors.  We  find  most  models \npredict that, contrary to widespread believe,  the contribut ing mo(cid:173)\ntion sensors have a preferred motion directed circularly rather than \nradially around the detector's preferred heading.  Moreover, the re(cid:173)\nsults suggest to look for  more refined structure within the circular \nflow,  such as bi-circularity or local motion-opponency. 
\n\nIntroduction \n\nThe image flow can be considerably more complicated than merely an expanding pattern of motion vectors centered on the heading direction (Fig. 1). Flow caused by eye rotation (Fig. 1b) causes the center of flow to be displaced (compare Fig. 1a and c). The effect of rotation depends on the ratio of rotation and translation speed. \n\nFigure 1: Flow during a) observer translation through a 3D cloud of dots, headed 10\u00b0 towards the left, during b) observer rotation about the vertical towards the right, and during c) the combination of both. \n\nAlso, since the image motions caused by translation depend on point distance and the image motions caused by rotation do not, the combined movement results in flow that is no longer purely expanding for scenes containing depth differences (Fig. 1c). Heading detection can therefore not rely on a simple extrapolation mechanism that determines the point of intersection of motion vectors. \n\nA number of physiologically-based models [1-4] have been proposed for how the visual system might arrive at a representation of heading from flow that is insensitive to parameters other than heading direction. 
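The flow geometry just described follows from the standard pinhole flow equations. The sketch below (our own minimal NumPy illustration; it implements none of the models discussed in this paper, and all function names are ours) computes the image flow of a point at depth Z under translation T and eye rotation omega, with focal length 1:

```python
import numpy as np

def image_flow(x, y, Z, T, omega):
    # Image velocity at image point (x, y) of a scene point at depth Z,
    # for observer translation T = (Tx, Ty, Tz) and eye rotation
    # omega = (wx, wy, wz); pinhole model with focal length f = 1.
    # Translational part: scales with inverse depth.
    vt = np.array([(-T[0] + x * T[2]) / Z,
                   (-T[1] + y * T[2]) / Z])
    # Rotational part: independent of depth.
    vr = np.array([x * y * omega[0] - (1 + x * x) * omega[1] + y * omega[2],
                   (1 + y * y) * omega[0] - x * y * omega[1] - x * omega[2]])
    return vt + vr

# Heading 10 degrees to the left: the purely translational flow vanishes
# at the heading point (Tx/Tz, Ty/Tz), the center of expansion.
T = np.array([np.tan(np.radians(-10.0)), 0.0, 1.0])
heading = np.array([T[0] / T[2], T[1] / T[2]])
print(np.allclose(image_flow(heading[0], heading[1], 5.0, T, np.zeros(3)), 0.0))  # -> True
```

The translational part scales with inverse depth while the rotational part does not, which is exactly why the combined flow over a scene with depth differences is no longer a pure expansion.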
These models assume heading is encoded by a set of units that each respond best to a specific pattern of flow that matches their preferred heading. Such units resemble neurons found in monkey brain area MST. MST cells have large receptive fields (RF), typically covering one quarter or more of the visual field, and receive input from several local motion sensors in brain area MT. The receptive field of an MST neuron may thus be defined as the preferred location, speed and direction of all its input local motion sensors. Little is known yet about the RF structure of MST neurons. We looked for similarities between current models at the level of the RF structure. First we explain the RF structure of units in the velocity gain field model, because this model makes clear assumptions on the RF structure. Next, we show the results of reconstructing the RF structure of units in the population model [2]. Finally, we analyse the RF structure of the template model [3] and the motion-opponency model [4]. \n\nVelocity gain field model \n\nThe velocity gain field model [1] is based on flow templates. A flow template, as introduced by Perrone and Stone [3], is a unit that evaluates the evidence that the flow fits the unit's preferred flow field by summing the outputs of local motion sensors. Heading is then represented by the preferred heading direction of the most active template(s). The velocity gain field model [1] differs from Perrone and Stone's template model [3] in the way it acquires invariance for translation speed, point distances and eye rotation. 
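Such a flow template is easy to sketch numerically. The fragment below is a toy formulation of ours (the preferred flow field and the sensor model are illustrative choices, not Perrone and Stone's exact tuning curves): each local sensor contributes the projection of the measured flow onto the template's preferred expansion direction, and the template matching the stimulus heading responds most strongly.

```python
import numpy as np

def expansion_field(points, heading):
    # Preferred flow of a template tuned to pure expansion about a given
    # heading point: unit vectors pointing away from the heading.
    d = points - heading
    return d / np.linalg.norm(d, axis=1, keepdims=True)

def template_response(flow, points, heading):
    # Sum of local motion-sensor outputs; each sensor is modelled as the
    # projection of the local flow onto the template's preferred
    # direction (cosine direction tuning).
    pref = expansion_field(points, heading)
    return np.sum(flow * pref)

rng = np.random.default_rng(0)
points = rng.uniform(-0.5, 0.5, size=(20, 2))
true_heading = np.array([-0.18, 0.0])
# Expanding stimulus flow, with random depth-dependent magnitudes.
flow = expansion_field(points, true_heading) / rng.uniform(1.0, 5.0, size=(20, 1))
r_match = template_response(flow, points, true_heading)
r_off = template_response(flow, points, np.array([0.3, 0.1]))
print(r_match > r_off)  # the matching template wins
```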
Whereas the template model requires a different template for each possible combination of heading direction and rotation, the velocity gain field model obtains rotation invariance with far fewer templates by exploiting eye rotation velocity signals. \n\nThe general scheme applied in the velocity gain field model is as follows. In a set of flow templates, each tuned to pure expansion with a specific preferred heading, the templates would change their activity during eye rotation. Simply subtracting the rotation velocity signal from each flow template would not suffice to compensate, because each template is differently affected by rotational flow. However, each flow template can become approximately rotation-invariant by subtracting a gain field activity that is the multiplication of the eye velocity signal with a derivative template activity \u2202O/\u2202R that is specific for each flow template. The latter reflects the change in flow template activity O given a change in rotational flow \u2202R. \n\nFigure 2: The heading-centered circular (a) and radial (b) component of the flow during combined translation and rotation as in Fig. 1c. 
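This compensation step can be illustrated with a toy activity function (a Gaussian tuning entirely of our own choosing; the real templates sum motion-sensor outputs). In the sketch, rotation R shifts the template's effective heading mismatch, the derivative template is built from a pair of templates tuned to opposite rotations, and subtracting the eye-velocity-weighted derivative restores the rotationless activity to first order:

```python
import numpy as np

def activity(h, R, R_pref, width=1.0):
    # Toy template activity: Gaussian in the mismatch between the
    # template's heading offset h and the shift caused by stimulus
    # rotation R, for a template tuned to rotation R_pref.
    return np.exp(-0.5 * ((h - (R - R_pref)) / width) ** 2)

def compensated(h, R, R_eye, r0=0.1):
    # Derivative template dO/dR: activity difference of two templates
    # tuned to the same heading but opposite rotations -r0 and +r0.
    dO_dR = (activity(h, R, -r0) - activity(h, R, r0)) / (2 * r0)
    # Subtract the gain-field term: eye-velocity signal times dO/dR.
    return activity(h, R, 0.0) - R_eye * dO_dR

h, R = 0.5, 0.2             # template offset from stimulus heading; rotation
raw = activity(h, R, 0.0)   # rotation distorts the template's activity
comp = compensated(h, R, R_eye=R)
true = activity(h, 0.0, 0.0)  # activity the same template would have without rotation
print(abs(comp - true) < abs(raw - true))  # compensation reduces the error
```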
Such a derivative template \u2202O/\u2202R can be constructed from the activity difference of two templates tuned to the same heading, but opposite rotation. Thus, in the velocity gain field model, templates tuned to a heading direction and a component of rotation play an important role. \n\nTo further appreciate the idea behind the RF structure in the velocity gain field model, note that the retinal flow can be split into a circular and a radial component, centered on the heading point (Fig. 2). Translation at different speeds or through a different 3D environment will alter the radial component only. The circular component contains a rotational component of flow but does not change with point distances or translational speed. This observation led to the assumption, implemented in the velocity gain field model, that templates should only measure the flow along circles centered on the point of preferred heading. \n\nAn example of the RF structure of a typical unit in the velocity gain field model, tuned to heading and rightward rotation, is shown in Fig. 3. This circular RF structure strongly reduces sensitivity to variations in depth structure or the translational speed, while the template's tuning to heading direction is preserved, because its preferred structure is centered on its preferred heading direction [1]. Interestingly, the RF structure of the typical rotation-tuned heading unit is bi-circular, because the direction of circular flow is opponent in the hemifields to either side of an axis (in this case the horizontal axis) through the heading point. Moreover, the structure contains a gradient in magnitude along the circle, decreasing towards the horizontal axis. 
Figure 3: Bi-circular RF structure of a typical unit in the velocity gain field model, tuned to leftward heading and simultaneous rightward rotation about the vertical. Individual vectors show the preferred direction and velocity of the input motion sensors. \n\nPopulation model \n\nThe population model [2] derives a representation of heading direction that is invariant to the other flow parameters using a totally different approach. This model does not presume an explicit RF structure. Instead, the connection strengths and preferred directions of local motion inputs to heading-specific flow units are computed according to an optimizing algorithm [5]. We here present the results obtained for a restricted version of the model in which eye rotation is assumed to be limited to pursuit that keeps the eye fixated on a stationary point in the scene during the observer translation. Specifically, we investigated whether a circular or bi-circular RF structure as predicted by the velocity gain field model emerges in the population model. \n\nThe population model [2,6] is an implementation of the subspace algorithm by Heeger and Jepson [5] in a neural network. The subspace algorithm computes a residual function R(Tj) for a range of possible preferred heading directions. The residual function is minimized when the flow vectors measured at m image locations, described as one array, are perpendicular to the vectors that form the columns of a matrix C\u22a5(Tj). This matrix is computed from the preferred 3-D translation vector Tj and the m image locations. 
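The residual computation just described can be sketched directly in NumPy. Our illustration below uses the general 3-D rotation version of the algorithm rather than the model's restricted pursuit version, and all names are ours:

```python
import numpy as np

rng = np.random.default_rng(1)

def flow_field(points, depths, T, omega):
    # Stacked image flow (2m-vector) at m image points for translation T,
    # rotation omega and per-point depths (pinhole model, f = 1).
    v = []
    for (x, y), Z in zip(points, depths):
        A = np.array([[-1.0, 0.0, x], [0.0, -1.0, y]])
        B = np.array([[x * y, -(1 + x * x), y],
                      [1 + y * y, -x * y, -x]])
        v.extend(A @ T / Z + B @ omega)
    return np.array(v)

def residual(points, v, T_cand):
    # Build C(T): one translational column A(x_i) @ T per point, plus the
    # three rotation columns B(x_i); then project v onto the orthogonal
    # complement of its column space.
    m = len(points)
    C = np.zeros((2 * m, m + 3))
    for i, (x, y) in enumerate(points):
        C[2 * i:2 * i + 2, i] = np.array([[-1.0, 0.0, x], [0.0, -1.0, y]]) @ T_cand
        C[2 * i:2 * i + 2, m:] = np.array([[x * y, -(1 + x * x), y],
                                           [1 + y * y, -x * y, -x]])
    U = np.linalg.svd(C)[0]
    C_perp = U[:, m + 3:]  # basis of the orthogonal complement
    return np.sum((v @ C_perp) ** 2)

m = 6
points = rng.uniform(-0.5, 0.5, size=(m, 2))
depths = rng.uniform(1.0, 5.0, size=m)
T_true = np.array([0.2, 0.0, 1.0])
v = flow_field(points, depths, T_true, np.array([0.0, 0.1, 0.02]))
# The residual vanishes at the true heading, whatever the rotation,
# depths and translation speed, but not at a wrong heading.
print(residual(points, v, T_true) < residual(points, v, np.array([-0.2, 0.1, 1.0])))
```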
Thus, by finding the matrix that minimizes the residue, the algorithm recovers the heading, irrespective of the 3D rotation vector, the unknown depths of points and the translation speed. \n\nTo implement the subspace algorithm in a neurophysiologically plausible way, the population model assumes two layers of units. The first, MT-like layer contains local motion sensors that fire linearly with speed and have cosine-like direction tuning. These sensors connect to units in the second, MST-like layer. The activity of a 2nd-layer unit, with specific preferred heading Tj, represents the likelihood that the residual function is zero. The connection strengths are determined by the C\u22a5(Tj) matrix. So as not to have too many motion inputs per 2nd-layer unit, the residual function R(Tj) is partitioned into smaller sub-residues that take only a few motion inputs. The likelihood for a specific heading is then given by the sum of responses in a population with the same preferred heading. \n\nGiven the image locations and the preferred heading, one can reconstruct the RF structure for 2nd-layer units with the same preferred heading. The preferred motion inputs to a second-layer unit are given by the vectors that make up each column of C\u22a5(Tj). Hereby, the vector direction represents the preferred motion direction, and the vector magnitude represents the strength of the synaptic connection. \n\nFigure 4: Examples of receptive field structure of a population that encodes heading 10\u00b0 towards the left (circle). a, b) Five pairs of MT-like sensors, where the motion sensors of each pair are at a) the same image location, or b) at image locations one quarter of a cycle apart. c) Distribution of multiple pairs leading to a bi-circular pattern. 
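For the smallest case, m = 2, this reconstruction can also be carried out numerically. In the sketch below (ours; for simplicity the rotation is restricted to the vertical axis, standing in for the full fixation constraint) the single column of C\u22a5(Tj) is obtained as the nullspace of the transpose of C(Tj), and each half of that column gives the preferred motion vector of one sensor:

```python
import numpy as np

def radial_dir(x, y, T):
    # Translational flow direction at (x, y): A(x, y) @ T, which points
    # along the line away from the heading point (Tx/Tz, Ty/Tz).
    return np.array([[-1.0, 0.0, x], [0.0, -1.0, y]]) @ T

def preferred_pair(p1, p2, T):
    # C(Tj) for m = 2 motion inputs: one translational column per image
    # location plus a single rotation column (rotation about the
    # vertical axis, our simplification of the fixation constraint).
    C = np.zeros((4, 3))
    C[0:2, 0] = radial_dir(p1[0], p1[1], T)
    C[2:4, 1] = radial_dir(p2[0], p2[1], T)
    for i, (x, y) in enumerate((p1, p2)):
        C[2 * i:2 * i + 2, 2] = np.array([-(1.0 + x * x), -x * y])
    # The single column of the orthogonal complement: nullspace of the
    # transpose of C, read off here from the full SVD.
    c = np.linalg.svd(C)[0][:, 3]
    return c[0:2], c[2:4]  # preferred motion vector at each location

T = np.array([np.tan(np.radians(-10.0)), 0.0, 1.0])
v1, v2 = preferred_pair((0.3, 0.2), (-0.1, 0.4), T)
# Both preferred vectors are perpendicular to the local radial direction,
# i.e. directed along circles centered on the preferred heading.
print(abs(np.dot(v1, radial_dir(0.3, 0.2, T))) < 1e-10)  # -> True
```

In this toy computation each preferred vector comes out tangential to a circle around the heading point, and for two sensors at the same image location the pair comes out opponent and of equal magnitude.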
The matrix C\u22a5(Tj) is computed from the orthogonal complement of a 2m x (m + 3) matrix C(Tj) [5]. On the assumption that only fixational eye movements occur, the matrix reduces to 2m x (m + 1) [6]. Given only two flow vector inputs (m = 2), the matrix C\u22a5(Tj) reduces to one column of length 2m = 4. The orthogonal complement of this 4 x 3 matrix C(Tj) was solved in Mathematica by first computing the nullspace of the transpose of C(Tj), and then constructing an orthonormal basis for it using Gram-Schmidt orthogonalisation. We computed the orientation and magnitude of the two MT inputs analytically. Instead of giving the mathematics, we here describe the main results. \n\nCircularity \n\nIndependent of the spatial arrangement of the two MT inputs to a 2nd-layer unit, their preferred motions turned out to be always directed along a circle centered on the preferred heading point. Fig. 4 shows examples of the circular RF structures, for different distributions of motion pairs that code for the same heading direction. \n\nMotion-opponency \n\nFor pairs of motion sensors at overlapping locations, the vectors of each pair always turned out to be opponent and of equal magnitude (Fig. 4a). For pairs of spatially separated motion sensors, the preferred magnitude and direction of the two motion inputs depend on their location with respect to the hemifields divided by the line through the heading and fixation point. We find that the preferred motion directions are opponent if the pair is located within the same hemifield, but uni-directional if the pair is split across the two hemifields as in Fig. 4b. \n\nBi-circularity \n\nInterestingly, if pairs of motion sensors are split across hemifields, with partners at image locations 90\u00b0 rotated about the heading point, a magnitude gradient appears in the RF structure (Fig. 4b). 
Thus, with these pairs a bi-circular RF structure can be constructed similar to that of units tuned to rotation about the vertical in the velocity gain field model (compare with Fig. 3). \nNote that the bi-circular RF structures do differ, since the axis along which the largest magnitude occurs is horizontal for the population model and vertical for the velocity gain field model. The RF structure of the population model unit resembles a velocity gain field unit tuned to rotation about the horizontal axis, implying a large sensitivity to such rotation. This, however, does not conflict with the expected performance of the population model. Because in this restricted version rotation invariance is expected only for rotation that keeps the point of interest in the center of the image plane (in this case rotation about the vertical, because heading is leftward), units are likely to be sensitive to rotation about the horizontal and torsional axes. \n\nFigure 5: Adapted from Perrone and Stone (1994). a) Each detector sums the responses of the most active sensor at each location. This most active motion sensor is selected from a pool of sensors tuned to different depth planes (Ca, Cb, etc.). These vectors are the vector sums of the preferred rotation component R and translational components Ta, Tb, etc. b) Effective RF structure. \n\nTemplate model \n\nThe template model and the velocity gain field model differ in how invariance for translation velocities, depth structure and eye rotation is obtained. Here, we investigate whether this difference affects the predicted RF structure. 
In the template model of Perrone and Stone [3], a template invariant to translation velocity or depth structure is obtained by summing the responses of the most active sensor at each image location. This most active sensor is selected from a collection of motion sensors, each tuned to a different ego-translation speed (or depth plane), but with the same preferred ego-rotation and heading direction (Fig. 5a). Given a large range of depth planes, it follows that a different radial component of motion will stimulate another sensor maximally, but that the activity nevertheless remains the same. The contributing response will change only due to a component of motion along a circle centered on the heading, such as is the case when the heading direction or rotation is varied. Thus, the contributing response will always be from the motion sensor oriented along the circle around the template's preferred heading. Effectively, this leads to a bi-circular RF structure for units tuned to heading and rotation (Fig. 5b). \n\nMotion-opponency model \n\nRoyden [4] proposed that the effect of rotation is removed at the local motion detection level, before the motion signals are received by flow detectors. This is achieved by MT-like sensors that compute the difference vector between spatially neighbouring motion vectors. Such a difference vector will always be oriented along lines intersecting at the heading point (Fig. 6). Thus, the resulting input to flow detectors will be oriented radially. Indeed, Royden's results [4] show that the preferred directions of the operators with the largest response will be radial, not circular. \n\nFigure 6: Motion parallax, the difference vector between locally neighbouring motion vectors. 
For translational flow (a) the difference vector will be oriented along a line through the heading point, whereas for rotational flow (b) the difference vector vanishes (compare vectors within the square). \n\nSummary and Discussion \n\nWe showed that a circular RF structure, such as proposed by the velocity gain field model [1], is also found in the population model [2] and is effectively present in the template model [3] as well. Only the motion-opponency model [4] prefers radial RF structures. Furthermore, we find that under certain restrictions the population model reveals local motion-opponency and bi-circularity, properties that can be found in the other models as well. \n\nA circular RF structure turns out to be a prominent property in three models. This supports the counterintuitive, but computationally sensible, idea that it is not the radial flow structure, but the structure perpendicular to it, that contributes to the response of heading-sensitive units in the human brain. Studies on area MST cells not only report selectivity for expanding motion patterns, but also a significant proportion of cells that are selective to rotation patterns [7-10]. These models could explain why cells respond so well to circular motion, in particular to the high rotation speeds (up to about 80 deg/s) not experienced in daily life. \n\nThis model study suggests that selectivity for circular flow has a direct link to heading detection mechanisms. It also suggests that testing selectivity for expanding motion might be a bad indicator for determining a cell's preferred heading. This point has been noted before, as MST seems to be systematically tuned to the focus of rotation, exactly like model neurons [9]. \n\nLittle is still known about the receptive field structure of MST cells. 
So far the receptive field structure of MST cells has only been roughly probed [10], and the results support neither a radial nor a circular structure. Also, so far only uni-circular motion has been tested. Our analysis points out that it would be worthwhile to look for more refined circular structure, such as local motion-opponency. Local motion-opponency has already been found in area MT, where some cells respond only if different parts of their receptive field are stimulated with different motion [11]. Another promising structure to look for would be bi-circularity, with gradients in magnitude of preferred motion along the circles. \n\nAcknowledgments \n\nSupported by the German Science Foundation and the German Federal Ministry of Education and Research. \n\nReferences \n\n[1] Beintema, J. A. & van den Berg, A. V. (1998) Heading detection using motion templates and eye velocity gain fields. Vision Research, 38(14):2155-2179. \n\n[2] Lappe, M. & Rauschecker, J. P. (1993) A neural network for the processing of optic flow from ego-motion in man and higher mammals. Neural Computation, 5:374-391. \n\n[3] Perrone, J. A. & Stone, L. S. (1994) A model for the self-motion estimation within primate extrastriate visual cortex. Vision Research, 34:2917-2938. \n\n[4] Royden, C. S. (1997) Mathematical analysis of motion-opponent mechanisms used in the determination of heading and depth. Journal of the Optical Society of America A, 14(9):2128-2143. \n\n[5] Heeger, D. J. & Jepson, A. D. (1992) Subspace methods for recovering rigid motion I: Algorithm and implementation. International Journal of Computer Vision, 7:95-117. \n\n[6] Lappe, M. & Rauschecker, J. P. (1993) Computation of heading direction from optic flow in visual cortex. In C. L. Giles, S. J. Hanson and J. D. 
Cowan (eds.), Advances in Neural Information Processing Systems 5, pp. 433-440. Morgan Kaufmann. \n\n[7] Tanaka, K. & Saito, H. (1989) Analysis of the visual field by direction, expansion/contraction, and rotation cells clustered in the dorsal part of the medial superior temporal area of the macaque monkey. Journal of Neurophysiology, 62(3):626-641. \n\n[8] Duffy, C. J. & Wurtz, R. H. (1991) Sensitivity of MST neurons to optic flow stimuli. I. A continuum of response selectivity to large-field stimuli. Journal of Neurophysiology, 65(6):1329-1345. \n\n[9] Lappe, M., Bremmer, F., Pekel, M., Thiele, A. & Hoffmann, K.-P. (1996) Optic flow processing in monkey STS: a theoretical and experimental approach. The Journal of Neuroscience, 16(19):6265-6285. \n\n[10] Duffy, C. J. & Wurtz, R. H. (1991) Sensitivity of MST neurons to optic flow stimuli. II. Mechanisms of response selectivity revealed by small-field stimuli. Journal of Neurophysiology, 65(6):1346-1359. \n\n[11] Allman, J., Miezin, F. & McGuinness, E. (1985) Stimulus specific responses from beyond the classical receptive field: Neurophysiological mechanisms for local-global comparisons in visual neurons. Annual Review of Neuroscience, 8:407-430. \n", "award": [], "sourceid": 2034, "authors": [{"given_name": "J.", "family_name": "Beintema", "institution": null}, {"given_name": "M.", "family_name": "Lappe", "institution": null}, {"given_name": "Alexander", "family_name": "Berg", "institution": null}]}