This paper presents a method for synthesizing symbolic policies from videos of demonstrations. The method first learns to map video frames to a set of predicates. A second-stage program synthesis algorithm then searches for a program in a DSL that is consistent with the resulting sequence of predicates. The reviewers largely agree that the problem setting is interesting, and that the approach is a sensible but heuristic combination of a standard neural component with a standard program synthesis component. The strength of the paper therefore does not lie in its methodology, and there were some concerns about whether this kind of contribution is sufficient for NeurIPS. However, the overall problem framing is interesting because it goes all the way from perceptual input (video demonstrations) to symbolic program representations. Given the appeal of this conceptual approach and the fact that the paper successfully executes on it, the paper seems worth accepting.