NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
The paper is concerned with learning a general exploration policy, trained using reinforcement learning and considering a distribution of graph-structured environments. A motivating application is coverage-guided program testing (fuzzing). The paper generated quite some discussion among the reviewers, concerning the novelty compared to recent works, and the relevance of the contribution. The rebuttal did a very good job in clarifying the novelty issue, making it clear that the goal is similar to a touring problem (visiting all states, i.e. branches in the program graph, based on graph exploration history) as opposed to path finding or graph design. The additional results (comparison with AFL and Neuzz required by the reviewers, in the rebuttal) are also found to be very convincing regarding the efficiency of the approach in the small budget (in terms of input generation) regime. The comparison with expert and SMT solvers (section 4.2) gives a sufficient evidence that the paper brings a contribution at the state of the art in an important niche, the fuzzing of small scale programs. The comparison of the different encoding of the programs (bag of words, BiLSTM and GraphNN) is also very interesting. For all these reasons, the Area Chair thinks the paper can be accepted. In the camera-ready version, the authors will do their best to situate their contribution w.r.t. the application domains, make it clear that GNN can be trained to perform smart exploration, and give precisions on the scalability of the approach.