The paper was reviewed by four expert reviewers, with initial scores of 6, 6, 6, 5. Reviewers acknowledge the commendable improvement achieved by the proposed approach on a difficult and novel task. A number of issues were raised about the paper, including (1) poor exposition and language [all reviewers], (2) lack of comparison to FiLM [R1], and (3) specificity of the task and dataset [R2, R4], among others. The authors provided a rebuttal, which the reviewers discussed and ultimately found convincing. Two of the reviewers upgraded their scores, resulting in unanimously positive, albeit marginally so, scores of 7, 6, 6, 6. The AC, despite having reservations about the quality of the writing, which was mentioned by all reviewers, agrees that the approach is valuable and presents a significant improvement over the state of the art on a relatively unexplored problem. As such, the AC agrees with the reviewers that the paper should be accepted. The authors are asked to make a significant attempt to improve the writing and the language for the camera-ready version.