{"title": "The Role of MT Neuron Receptive Field Surrounds in Computing Object Shape from Velocity Fields", "book": "Advances in Neural Information Processing Systems", "page_first": 969, "page_last": 976, "abstract": null, "full_text": "The Role of MT Neuron Receptive Field Surrounds in Computing Object Shape from Velocity Fields\n\nG.T. Buracas & T.D. Albright\n\nVision Center Laboratory, The Salk Institute,\nP.O. Box 85800, San Diego, California 92138-9216\n\nAbstract\n\nThe goal of this work was to investigate the role of primate MT neurons in solving the structure from motion (SFM) problem. Three types of receptive field (RF) surrounds found in area MT neurons (K. Tanaka et al., 1986; Allman et al., 1985) correspond, as our analysis suggests, to the 0th, 1st and 2nd order fuzzy space-differential operators. The large surround/center radius ratio (≥ 7) allows both differentiation of smooth velocity fields and discontinuity detection at boundaries of objects. The model is in agreement with recent psychophysical data on surface interpolation involvement in SFM. We suggest that area MT partially segregates information about object shape from information about spatial relations necessary for navigation and manipulation.\n\n1 INTRODUCTION\n\nBoth neurophysiological investigations [8] and data from lesioned human patients show that the Middle Temporal (MT) cortical area is crucial to perceiving three-dimensional shape in moving stimuli. On the other hand, a solid body of data (e.g. [1]) has been gathered about the functional properties of neurons in area MT. However, the relation between our ability to perceive structure in stimuli simulating 3-D objects and neuronal properties has not been addressed to date. \n
Here we discuss the possibility that area MT RF surrounds might be involved in shape-from-motion perception. We introduce a simplified model of MT neurons and analyse its implications for solving the SFM problem.\n\n2 REDEFINING THE SFM PROBLEM\n\n2.1 RELATIVE MOTION AS A CUE FOR RELATIVE DEPTH\n\nSince Helmholtz, motion parallax has been known to be a powerful cue providing information about both the structure of the surrounding environment and the direction of self-motion. On the other hand, moving objects also induce velocity fields that allow judgements about their shapes. We can capture both cases by assuming that an observer is tracking a point on a surface of interest. The velocity field of an object is then (Fig. 1): V = t_z + w × (R - R_0) = -t_z + w × z, where w = [w_x, w_y, 0] is an effective rotation vector of a surface z = [x, y, z(x,y)]; R_0 = [0, 0, z_0] is the position vector of the fixation point; t_z is a translational component along the Z axis.\n\nFig. 1: The coordinate system assumed in this paper. The origin is set at the fixation point. The observer is at distance z_0 from the surface.\n\nThe component velocities of a retinal velocity field under perspective projection can be calculated from:\n\nu = -w_y z/(z_0 + z) + (-x t_z - w_x x y + w_y x^2)/(z_0 + z)^2,\nv = w_x z/(z_0 + z) + (-y t_z + w_y x y - w_x y^2)/(z_0 + z)^2.\n\nIn natural viewing conditions the distance to the surface z_0 is usually much larger than the variation in distance on the surface z: z_0 ≫ z. In such a case the second term in the above equations vanishes. In the case of translation tangential to the ground, to which we confine our analysis, w = [0, w_y, 0] = [0, w, 0], and the retinal velocity reduces to\n\nu = -wz/(z_0 + z) ≈ -wz/z_0,  v = 0   (1).
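\n\nAs a rough check of the approximation in (1), consider illustrative values that are assumptions for this example only, not measurements from the paper: z_0 = 1 m, z = 0.05 m and w = 1 rad/s. Then\n\nu_exact = -wz/(z_0 + z) = -0.05/1.05 ≈ -0.0476,\nu_approx = -wz/z_0 = -0.05,\nu_approx/u_exact = (z_0 + z)/z_0 = 1.05.\n\nUnder these assumed values the relative error of the approximation equals z/z_0, here 5%, and it shrinks as z_0 grows relative to z.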
\n\nRelation (1) allows the assumption of orthographic projection, which approximates the retinal velocity field rather well within the central 20 deg of the visual field.\n\n2.2 SFM PERCEPTION INVOLVES SURFACE INTERPOLATION\n\nHuman SFM perception is characterized by an interesting peculiarity: surface interpolation [7]. This fact supports the hypothesis that an assumption of surface continuity is embedded in the visual system. Thus, we can redefine the SFM problem as a problem of characterizing the interpolating surfaces. The principal normal curvatures are a local measure of a surface that is invariant with respect to translation and rotation of the coordinate system. The orientation of the surface (normal vector) and its distance to the observer provide the information essential for navigation and object manipulation. The first and second order differentials of a surface function allow recovery of both surface curvature and orientation.\n\n3 MODEL OF AREA MT RECEPTIVE FIELD SURROUNDS\n\n3.1 THREE TYPES OF RECEPTIVE FIELD SURROUNDS\n\nThe Middle Temporal (MT) area of monkeys is specialized for the systematic representation of direction and velocity of visual motion [1,2]. MT neurons are known to possess large, silent RF surrounds (RFS, the \"nonclassical RF\"). Born and Tootell [4] have very recently reported that the RF surrounds of neurons in owl monkey MT can be divided into antagonistic and synergistic types (Fig. 2a).\n\n[Figure 2, panels (a) and (b). Panel (a): horizontal axis, annulus diameter (deg). Panel (b): horizontal axis, ratio of C/S speeds.]\n\nFig. 2: Top left (a): an example of a synergistic RF surround, redrawn from [4] (no velocity tuning known). \n
Bottom left (b): a typical V-shaped tuning curve for an RF surround. The horizontal axis represents the logarithmic scale of the ratio between stimulus speeds in the RF center and surround, redrawn from [9]. Bottom (c,d): monotonically increasing and decreasing tuning curves for RF surrounds, redrawn from [9].\n\n[Figure 2, panels (c) and (d): horizontal axis, ratio of C/S speeds.]\n\nAbout 44% of the owl monkey neuron RFSs recorded by Allman et al. [3] showed antagonistic properties. Approximately 33% of these demonstrated V(or U)-shaped velocity tuning curves (Fig. 2b), and 66% quasi-linear ones (Fig. 2c,d). One half of the Macaca fuscata neurons with antagonistic RFSs found by Tanaka et al. [9] had V(U)-shaped velocity tuning curves, and the other 50% had monotonically increasing or decreasing velocity tuning curves. The RFSs were tested for symmetry [9], and no asymmetrical surrounds were found in primate MT.\n\n3.2 CONSTRUCTING IDEALIZED MT FILTERS\n\nThe surround (S) and center (C) responses seem to be largely independent (except for the requirement that the velocity in the center must be nonzero) and seem to combine in an additive fashion [5]. This property allows us to combine the C and S components in our model independently. The resulting filters can be reduced to three types, described below.\n\n3.2.1 Discrete Filters\n\nThe essential properties of the three types of RFSs in area MT can be captured by the following difference equations. \n
We choose the slopes of the velocity tuning curves in the center to be equal to the ones in the surround; this is essential for obtaining the desired properties for l_2 but not l_0. The 0-order (or low-pass) and the 2nd order (or band-pass) filters are defined by:\n\n[Equation (2): two expressions, each a double sum over i and j; the summands did not survive extraction.]\n\nwhere g is the gain, W_ij = 1, i,j ∈ [-r, r] (r = radius of integration). Speed scalars u(i,j) at points [i,j] replace the velocity vectors V due to eq. (1). The constants correspond to spontaneous activity levels.\n\nIn order to achieve the V(U)-shaped tuning for the surround in Fig. 2b, a nonlinearity has to be introduced:\n\nl_1 = g_1 Σ_i Σ_j (u_c - u_s(i,j))^2 + Const_1.   (3)\n\nThe responses of the l_1 and l_2 filters to the standard mapping stimuli used in [3,9] are plotted together with their biological correlates in Fig. 3.\n\n3.2.2 Continuous analogues of MT filters\n\nWe now develop continuous, more biologically plausible, versions of our three MT filters. We assume that synaptic weights for both center and surround regions fall off with distance from the RF center as a Gaussian function G(x,y,σ), and that σ is different for center and surround: σ_c ≠ σ_s. Then, by convolving with Gaussians, equation (2) can be rewritten:\n\nL_0(i,j) = u(i,j) * G(σ_c) + u(i,j) * G(σ_s),\nL_2^±(i,j) = ±[u(i,j) * G(σ_c) - u(i,j) * G(σ_s)].\n\nThe continuous nonlinear L_1 filter can be defined if equivalence to l_1 (eq. 3) is observed only up to the second order term of the power series for u(i,j):\n\nL_1(i,j) = u^2(i,j) * G(σ_c) + u^2(i,j) * G(σ_s) - C · [u(i,j) * G(σ_c)] · [u(i,j) * G(σ_s)];\n\nu^2(i,j) corresponds to full-wave rectification and seems to be common in area V1 complex neurons; C = 2/Erf^2(n/2^(1/2)) is a constant, and Erf() is the error function.
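\n\nOne consistency check on the continuous filters can be sketched as follows, under the assumption (not stated explicitly in the text) that each Gaussian kernel integrates to the same mass m over its support, with m = 1 for an untruncated kernel. For a uniform field u(i,j) = u_0 the three filters give\n\nL_0 = 2 m u_0,\nL_2^± = ±(m u_0 - m u_0) = 0,\nL_1 = 2 m u_0^2 - C m^2 u_0^2.\n\nUnder this assumption the second order filter ignores a uniform translation of the whole field, and the nonlinear filter does the same exactly when C = 2/m, which tends to 2 as m tends to 1.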
\n\n3.3 THE ROLE OF MT NEURONS IN SFM PERCEPTION\n\nExpanding the z(x,y) function in (1) into a power series around an arbitrary point and truncating above the second order term yields: u(x,y) = w(ax^2 + by^2 + cxy + dx + ey + f)/z_0, where a,b,c,d,e,f are expansion coefficients. We assume that w is known (from proprioceptive input) and = 1. Then z_0 remains an unresolved scaling factor and we omit it for simplicity.\n\n[Figure 3. Columns: DATA, MODEL. Horizontal axis: surround/center speed ratio (1/4, 1/2, 1, 2, 4).]\n\nFig. 3: The comparison between data [9] and model velocity tuning curves for RF surrounds. The standard mapping stimuli (an optimally moving bar in the center of the RF, an annulus of random dots with varying speed) were applied to the L_1 and L_2 filters. The output of the filters was passed through a sigmoid transfer function to account for a logarithmic compression in the data.\n\n[Figure 4. Right panel title: L2 response in curvature space; axes run from -15 to 15.]\n\nFig. 4: Below, left: the response profile of the L_1 filter in orientation space (x and y axes represent the components of the normal vector). Right: the response profile of the L_2 filter in curvature space; x and y axes represent the two principal normal curvatures.\n\nApplying L_0 to u(x,y), high spatial frequency information is filtered out, but otherwise u(x,y) does not change, i.e. L_0*u covaries with the lower frequencies of u(x,y). \n
L_2 applied to u(x,y) yields:\n\nL_2 * u = (2a + 2b) C_2 (σ_c^2 - σ_s^2) = C_2 (σ_c^2 - σ_s^2) ∇^2 u,   (4)\n\nthat is, L_2 shows the properties of a second order space-differential operator, the Laplacian; C_2 (σ_c^2 - σ_s^2) is a constant depending only on the widths of the center and surround Gaussians. Note that L_2*u ∝ κ_1 + κ_2 (κ_1, κ_2 are the principal normal curvatures) at singular points of the surface z(x,y).\n\nWhen applied to planar stimuli u_p(x,y) = dx + ey, L_1 has the properties of a squared first order differential operator:\n\nL_1 * u_p = (d^2 + e^2) C_1 (σ_c^2 - σ_s^2) = C_1 (σ_c^2 - σ_s^2) [(∂u_p/∂x)^2 + (∂u_p/∂y)^2],   (5)\n\nwhere C_1 (σ_c^2 - σ_s^2) is a function of σ_c and σ_s only. Thus the output of L_1 is monotonically related to the norm of the gradient vector. It is straightforward to calculate the generic second order surface based on the outputs of three L_0, four L_1 and one L_2 filters.\n\nPlotting the responses of the L_1 and L_2 filters in orientation and curvature space can help to estimate the role they play in solving the SFM problem (Fig. 4). The iso-response lines in the plot reflect the ambiguity of MT filter responses. However, these responses covary with useful geometric properties of surfaces: the norm of the gradient (L_1) and the mean curvature (L_2).\n\n3.4 EXTRACTING VECTOR QUANTITIES\n\nEquations (4) and (5) show that only averaged scalar quantities can be extracted by our MT operators. The second order directional derivatives for estimating vectorial quantities can be computed using oriented RFs with the following profile: O_2 = G(x,σ_s) [G(y,σ_s) - G(y,σ_c)]. O_1 can then be defined by the center-surround relationship of the L_1 filter. The outputs of the MT filters L_1 and L_2 might be indispensable in normalizing the responses of oriented filters. \n
The normal surface curvature can be readily extracted using combinations of MT and hypothetical O filters. Oriented spatial differential operators have not been found in primate area MT so far. However, preliminary data from our lab indicate that elongated RFs may be present in areas FST or MST [6].\n\n3.5 L2: LAPLACIAN VS. NAKAYAMA'S CONVEXITY OPERATOR\n\nThe physiologically tested ratio of standard deviations for the surround and center Gaussians is σ_s/σ_c ≥ 7. Thus, besides performing second order differentiation in the low frequency domain, L_2 can detect discontinuities in optic flow.\n\n4 CONCLUSIONS\n\nWe propose that the RF surrounds in MT may enable the neurons to function as differential operators. The described operators can be thought of as providing a continuous interpolation of cortically represented surfaces.\n\nOur model predicts that elongated RFs with flanking surrounds will be found (possibly in areas FST or MST [6]). These RFs would allow extraction of the directional derivatives necessary to estimate the principal curvatures and the normal vector of surfaces.\n\nFrom velocity fields, area MT extracts information relevant to both the \"where\" stream (motion trajectory, spatial orientation and relative distance of surfaces) and the \"what\" stream (curvature of surfaces).\n\nAcknowledgements\n\nMany thanks to George Carman, Lisa Croner, and Kechen Zhang for stimulating discussions and Jurate Bausyte for helpful comments on the poster. This project was sponsored by a grant from the National Eye Institute to TDA and by a scholarship from the Lithuanian Foundation to GTB. The presentation was supported by a travel grant from the NIPS foundation.\n\nReferences\n\n[1] Albright, T.D. \n
(1984) Direction and orientation selectivity of neurons in visual area MT of the macaque. J. Neurophysiol., 52, 1106-1130.\n\n[2] Albright, T.D. & Desimone, R. (1987) Local precision of visuotopic organization in the middle temporal area (MT) of the macaque. Exp. Brain Res., 65, 582-592.\n\n[3] Allman, J., Miezin, F. & McGuinness, E. (1985) Stimulus specific responses from beyond the classical receptive field. Ann. Rev. Neurosci., 8, 407-430.\n\n[4] Born, R.T. & Tootell, R.B.H. (1992) Segregation of global and local motion processing in primate middle temporal visual area. Nature, 357, 497-499.\n\n[5] Born, R.T. & Tootell, R.B.H. (1993) Center-surround interactions in direction-selective neurons of primate visual area MT. Neurosci. Abstr., 19, 315.5.\n\n[6] Carman, G.J., unpublished results.\n\n[7] Husain, M., Treue, S. & Andersen, R.A. (1989) Surface interpolation in three-dimensional structure-from-motion perception. Neural Computation, 1, 324-333.\n\n[8] Siegel, R.M. & Andersen, R.A. (1987) Motion perceptual deficits following ibotenic acid lesions of the middle temporal area in the behaving rhesus monkey. Soc. Neurosci. Abstr., 12, 1183.\n\n[9] Tanaka, K., Hikosaka, K., Saito, H.-A., Yukie, M., Fukada, Y. & Iwai, E. (1986) Analysis of local and wide-field movements in the superior temporal visual areas of the macaque monkey. J. Neurosci., 6, 134-144.", "award": [], "sourceid": 867, "authors": [{"given_name": "G.", "family_name": "Buracas", "institution": null}, {"given_name": "T.", "family_name": "Albright", "institution": null}]}