Submitted by Assigned_Reviewer_1
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
The proposed method to model preplay and rapid path planning using successor representation is a very interesting and potentially important development. It is also in my opinion a more realistic implementation of model-based RL than tree search-based alternatives.
However, a few important questions remain:
- Is the computation of the advanced quantities described in sections 2 and 3 biologically realistic, not just in terms of the resulting output and dynamics but also in terms of the mechanistic implementation (e.g. how are the weights, derived here using least squares, learned)?
- How does such a successor-based representation work in more realistic scenarios where familiarity with the environment is partial, and how does it change with learning? Can it reproduce typically observed behavioral learning curves?
- Can it also be applied to domains other than space, e.g. time cells?
I'm aware that addressing these questions in detail may be out of the scope of this paper, but it would be helpful if these aspects could at least be discussed briefly.
The clarity of the paper, particularly of the mathematical expressions and derivations, could be improved; e.g. line 2 of Eq. 3, or why a weak stimulus corresponds to alpha = epsilon = 0.05, should be explained in more detail.
Q2: Please summarize your review in 1-2 sentences
It is an important study linking different computational ideas such as successor representation and slow feature analysis to recent experimental data of sequence preplay and rapid path planning.
Submitted by Assigned_Reviewer_2
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
Technical comments:
- Equation 2: should this be -d^2, not -d? Otherwise it's not a Gaussian bump.
- Your comment on line 119 that the normalized metric can be interpreted as a steady-state probability distribution is not strictly correct.
The probability of the steady state is proportional to the number of random walks that end in that state, which is related to, but not quite the same as, the shortest distance from the initial state.
If you meant this in a loose way, then I'm okay with it, but you should make it clear that it is an approximation.
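The two points above can be made concrete with a small sketch. It is not taken from the paper: the five-state track, the width `sigma`, and all function and variable names are assumptions chosen for illustration. It builds a Gaussian affinity from the squared distance, row-normalises it into one-step transition probabilities, and compares those with the stationary (steady-state) distribution of the same random walk.

```python
import math

# Hypothetical 1-D track with five states at unit spacing (not from the paper).
positions = [0.0, 1.0, 2.0, 3.0, 4.0]
sigma = 1.0  # assumed bump width


def gaussian_affinity(d, sigma):
    """Gaussian bump in the distance d; note the squared distance."""
    return math.exp(-d ** 2 / (2 * sigma ** 2))


n = len(positions)
affinity = [[gaussian_affinity(abs(positions[i] - positions[j]), sigma)
             for j in range(n)] for i in range(n)]

# Row-normalising the affinity gives one-step transition probabilities.
degree = [sum(row) for row in affinity]
transition = [[affinity[i][j] / degree[i] for j in range(n)] for i in range(n)]

# Stationary distribution by power iteration: p <- p T until it settles.
stationary = [1.0 / n] * n
for _ in range(1000):
    stationary = [sum(stationary[i] * transition[i][j] for i in range(n))
                  for j in range(n)]

# For a symmetric affinity, the stationary distribution is proportional
# to the row sums of the affinity matrix.
expected = [d / sum(degree) for d in degree]
```

Each row of `transition` depends on the starting state and is peaked there, whereas `stationary` is a single distribution shared by all starting states, so the two readings of the normalized metric are different objects.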
Q2: Please summarize your review in 1-2 sentences
This paper uses a simple model of CA3 hippocampus to perform planning through a maze task.
They do this by using a successor representation of the maze and initializing the network in a starting location.
By providing a small constant input of the goal state the network representation moves towards the goal state, even if the goal is many steps away, and it takes the shortest path respecting any obstacles that might be in the way.
I like the task - and feel strongly that an easily updateable value map is something that is needed in the field of path planning.
I recommend that this paper be included in the conference.
However, I would have liked to see the limits of this method, e.g. how complex a maze can be learnt and decoded with this method before it breaks.
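As a rough illustration of the planning idea summarized above, the following sketch computes a successor representation for a small maze and reads out a route to the goal. It is an assumption-laden toy, not the authors' model: the maze layout, the discount `gamma`, the random-walk policy, and the greedy readout (used here in place of the attractor network dynamics) are all hypothetical choices.

```python
# Hypothetical maze: '#' is a wall, 'S' the start, 'G' the goal.
maze = ["S.#.",
        "..#.",
        "...G"]
gamma = 0.9  # assumed discount factor

cells = [(r, c) for r, row in enumerate(maze)
         for c, ch in enumerate(row) if ch != "#"]
index = {cell: i for i, cell in enumerate(cells)}
n = len(cells)


def neighbours(cell):
    """Open cells reachable from `cell` in one step (4-connectivity)."""
    r, c = cell
    steps = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    return [s for s in steps if s in index]


# Transition matrix of an unbiased random walk over open cells.
T = [[0.0] * n for _ in range(n)]
for cell in cells:
    nbrs = neighbours(cell)
    for nb in nbrs:
        T[index[cell]][index[nb]] = 1.0 / len(nbrs)

# Successor representation M = sum_t gamma^t T^t, computed with the
# fixed-point iteration M <- I + gamma * T * M (converges for gamma < 1).
M = [[float(i == j) for j in range(n)] for i in range(n)]
for _ in range(300):
    M = [[float(i == j) + gamma * sum(T[i][k] * M[k][j] for k in range(n))
          for j in range(n)] for i in range(n)]

start = next(c for c in cells if maze[c[0]][c[1]] == "S")
goal = next(c for c in cells if maze[c[0]][c[1]] == "G")
g = index[goal]

# Greedy readout: step to the neighbour with the largest expected
# discounted occupancy of the goal, M[neighbour][goal].
path = [start]
for _ in range(n):
    if path[-1] == goal:
        break
    path.append(max(neighbours(path[-1]), key=lambda nb: M[index[nb]][g]))
```

Away from the goal, the goal column of `M` is a discounted average of the neighbouring values, so the greedy readout climbs strictly and cannot get stuck behind the wall. It is not guaranteed to return the shortest route in every maze, which is one way to probe the limits asked about above.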
Submitted by Assigned_Reviewer_3
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
The paper studied an important question, that is, how does the brain achieve rapid path integration in reality? The authors first presented a mathematical framework, the successor representation, for path-finding, and then demonstrated how this can be implemented in practice by using an artificial bump-like attractor network.
Mathematically, this is quite interesting work. My main concern is about its biological relevance, although the authors argued that the complicated mathematical operations involved, such as extracting the dominant eigenvectors, can be implemented in practice.
Q2: Please summarize your review in 1-2 sentences
The paper proposed a mathematical model for rapid path integration based on attractor network dynamics.
Submitted by Assigned_Reviewer_4
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
The paper introduces a novel potential candidate for a model of place-cell firing in hippocampus that implements (or contributes to) path finding in maze-like environments based on a spatial representation called successor representation.
The only shortcoming of the work is that it doesn't make any predictions that can be tested experimentally. For instance, it requires a very specific input pre-processing, i.e. the mapping to the successor representation. Can one come up with predictions for the remapping of place cell firing maps based on changes to the maze (e.g. a door closes) or to the size of the environment?
Clarity is fine (typo on page 2 line 059).
Q2: Please summarize your review in 1-2 sentences
Strong paper that contributed significantly to a hot topic by providing a link between experimental data on place cell behaviour and path planning implemented by attractor models.
Q1: Author rebuttal: Please respond to any concerns raised in the reviews. There are
no constraints on how you want to argue your case, except for the fact
that your text should be limited to a maximum of 5000 characters. Note
however, that reviewers and area chairs are busy and may not read long
vague rebuttals. It is in your own interest to be concise and to the
point.
We thank the reviewers for their valuable comments
and for raising some important points that we wish to
clarify.
REVIEWER 1
Thank you for raising several important
questions. We will include 1-2 sentences in the discussion on existing
rules that address learning the least-squares optimal weights in the
recurrent network, and a note on how our model could combine with a slower
(e.g. model-free) system to reproduce the learning curves at different
levels of environmental familiarity in [4]. We will remove line 2 of Eq. 3
(defining pi in the text) and add a comment that we set alpha = epsilon to
maintain a similar mean level of activity in the network throughout the
stimulation (balancing input and decay).
REVIEWER 2
With regard to biological relevance, we believe that our model gives a
testable hypothesis for the role that large place fields play in
generating goal-directed trajectories (please see the prediction response
to reviewer 4 below).
As a biologically plausible means of
extracting the dominant eigenvectors, slow feature analysis has been
proposed as a natural outcome of an STDP-like learning rule [Sprekeler et
al., PLoS Comp. Bio., 2007], albeit on the timescale of a standard EPSP
rather than the behavioural timescale we consider here. However, STDP can
be extended to behavioural timescales when combined with sustained firing
and slowly decaying potentials [Drew & Abbott, PNAS, 2006] of the type
observed on the single-neuron level in the input pathway to CA3 [Larimer
& Strowbridge, Nature Neuroscience, 2010], or as a result of network
effects. We will address these potential mechanisms for learning slow
features / eigenvectors in the revised manuscript.
REVIEWER 4
We agree that it would be beneficial to add a short description
of the predictions of our model to the discussion. In particular, the
model proposes that large-scale attractor dynamics in ventral/intermediate
CA3 enable long-distance goal-directed preplay activity. This suggests,
for instance, that long-distance sequences in dorsal hippocampus (where
place fields are much smaller) must be inherited from dynamics in ventral
or intermediate hippocampus. In [6] it was shown that selectively sparing
the dorsal hippocampus resulted in impaired rapid path planning; in this
case, we would also predict a significant reduction in goal-directed
preplay activity in the remaining dorsal region. In an intact hippocampus,
we would predict that long-distance goal-directed preplay in the dorsal
hippocampus is preceded by preplay tracing a similar path in intermediate
hippocampus.
Concerning changes to the environment: [14] focused
on arguing that changes in place field representations due to training
protocols and alterations to the environment are well-described by coding
based on the successor representation, and we therefore did not focus on
this argument in the current work.
The typo will be corrected,
thank you.
REVIEWER 5
For the affinity, d will be corrected
to d^2, thank you.
Line 119 states that "The normalized metric can
be interpreted as a transition probability...", not a steady-state
probability. We will remove line 2 of Eq. 3 and adjust the text below that
to clarify the definition.