Efficient Distance Approximation for Structured High-Dimensional Distributions via Learning
Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S Meel, N. V. Vinodchandran
We design efficient distance approximation algorithms for several classes of well-studied structured high-dimensional distributions. Specifically, we present algorithms for the following problems (where dTV is the total variation distance):
- Given sample access to two Bayesian networks P1 and P2 over known directed acyclic graphs G1 and G2 having n nodes and bounded in-degree, approximate dTV(P1, P2) to within additive error ε using poly(n, 1/ε) samples and time.
- Given sample access to two ferromagnetic Ising models P1 and P2 on n variables with bounded width, approximate dTV(P1 , P2 ) to within additive error ε using poly(n, 1/ε) samples and time.
- Given sample access to two n-dimensional Gaussians P1 and P2, approximate dTV(P1, P2) to within additive error ε using poly(n, 1/ε) samples and time.
- Given access to observations from two causal models P and Q on n variables that are defined over known causal graphs, approximate dTV(Pa , Qa ) to within additive error ε using poly(n, 1/ε) samples and time, where Pa and Qa are the interventional distributions obtained by the intervention do(A = a) on P and Q respectively for a particular variable A.
The distance approximation algorithms immediately imply new tolerant closeness testers for the corresponding classes of distributions. Prior to our work, only non-tolerant testers were known for both Bayes net distributions and Ising models, and no testers with quantitative guarantee were known for interventional distributions. To best of our knowledge, efficient distance approximation algorithms for Gaussian distributions were not present in the literature. Our algorithms are designed using a conceptually simple but general framework that is applicable to a variety of scenarios.