The paper proposes a novel reinforcement learning approach to solving the capacitated vehicle routing problem (CVRP). It involves learning a value function and solving a TSP for the prizing problem. Reviewers agree that the proposed approach is novel and interesting. One reviewer is sceptical of the work because of doubts about the performance achievable with the proposed approach. However, the ideas still deserve to be presented at NeurIPS, in the hope that they will bring advances to this research area. We urge the authors to better reflect the current limitations of their work, including a discussion of the comparison to OR-Tools and to the state of the art in CVRP, drawing on the references given in the reviews. In particular, we urge the authors to include an enhanced comparison with Kool in the final version.