Mauro Scanagatta, Cassio P. de Campos, Giorgio Corani, Marco Zaffalon
We present a method for learning Bayesian networks from data sets containingthousands of variables without the need for structure constraints. Our approachis made of two parts. The first is a novel algorithm that effectively explores thespace of possible parent sets of a node. It guides the exploration towards themost promising parent sets on the basis of an approximated score function thatis computed in constant time. The second part is an improvement of an existingordering-based algorithm for structure optimization. The new algorithm provablyachieves a higher score compared to its original formulation. On very large datasets containing up to ten thousand nodes, our novel approach consistently outper-forms the state of the art.