Jiaxuan You, Haoze Wu, Clark Barrett, Raghuram Ramanujan, Jure Leskovec
The Boolean Satisfiability (SAT) problem is the canonical NP-complete problem and is fundamental to computer science, with a wide array of applications in planning, verification, and theorem proving. Developing and evaluating practical SAT solvers relies on extensive empirical testing on a set of real-world benchmark formulas. However, the availability of such real-world SAT formulas is limited. While these benchmark formulas can be augmented with synthetically generated ones, existing approaches for doing so are heavily hand-crafted and fail to simultaneously capture a wide range of characteristics exhibited by real-world SAT instances. In this work, we present G2SAT, the first deep generative framework that learns to generate SAT formulas from a given set of input formulas. Our key insight is that SAT formulas can be transformed into latent bipartite graph representations which we model using a specialized deep generative neural network. We show that G2SAT can generate SAT formulas that closely resemble given real-world SAT instances, as measured by both graph metrics and SAT solver behavior. Further, we show that our synthetic SAT formulas could be used to improve SAT solver performance on real-world benchmarks, which opens up new opportunities for the continued development of SAT solvers and a deeper understanding of their performance.