Reviews: LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition

This paper presents research on resource-efficient video analysis. The reviewers appreciate the frame-gating approach and the solid methodology for dynamically deciding how many inference-time resources to spend on classifying an input video. The model is fully differentiable (via Gumbel-softmax), in contrast to RL-based approaches for learning similar frame-skipping methods. The reviewers also note that the empirical evaluation is solid, with good comparisons to baselines and ablation studies. While there are some concerns regarding the magnitude of the contributions (e.g., relevance to LSTM-based models vs. other temporal deep learning architectures) and novelty, on balance it is a solid, well-written paper that makes a clear contribution to efficient video analysis.
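
For readers unfamiliar with the Gumbel-softmax trick mentioned above, the following is a minimal, generic sketch of how a discrete gating decision can be sampled through a relaxed softmax. It is not the authors' implementation: the function name `gumbel_softmax_gate`, the two-way "skip / run the fine model" gate, and the temperature values are assumptions made purely for illustration.

```python
import math
import random


def gumbel_softmax_gate(logits, temperature=1.0, rng=random):
    """Sample a categorical gate with the Gumbel-softmax relaxation.

    Returns (soft, hard):
      soft: relaxed probabilities, a smooth function of the logits
      hard: one-hot argmax of the same noisy scores
    Hypothetical helper for illustration, not taken from the paper.
    """
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)) with u ~ Uniform(0, 1)
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep log() finite
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)

    # Numerically stable softmax over the perturbed, temperature-scaled logits
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]

    # Discrete decision: one-hot vector at the largest relaxed probability
    k = soft.index(max(soft))
    hard = [1.0 if i == k else 0.0 for i in range(len(soft))]
    return soft, hard


# Assumed gate layout: index 0 = skip the expensive model, index 1 = run it
soft, hard = gumbel_softmax_gate([0.0, 2.0], temperature=0.5)
```

In an automatic-differentiation framework, the hard decision would typically be used in the forward pass while gradients flow through the soft sample (the straight-through variant), which is what allows such a gate to be trained end to end without a policy-gradient estimator. Plain Python cannot show the gradient path, so the sketch only covers the sampling step.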

Paper ID:	4216
Title:	LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition