This paper develops a higher-Markov-order convolutional LSTM based on tensor train decomposition, with applications to spatio-temporal activity analysis in videos. The reviews were mixed but marginally positive on average, and the scores increased slightly following the rebuttal and some discussion. There is a consensus that the approach is novel and interesting. The main criticism is that, despite the extensive experiments, it remains unclear whether it is the novel formulation itself that is producing the observed improvements, or the many other points that differ relative to the baselines. The advantages of using Markov order > 1 in this application also need to be clarified. Overall, the AC and SAC agreed that this was above threshold for NeurIPS. However, the final version needs to do its best to address the concerns of the reviewers.