Part of Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
Ishaq Aden-Ali, Hassan Ashtiani, Christopher Liaw
We consider the problem of learning multivariate Gaussians under the constraint of approximate differential privacy. We prove that ˜O(k2dlog3/2(1/δ)/α2ε) samples are sufficient to learn a mixture of k axis-aligned Gaussians in Rd to within total variation distance α while satisfying (ε,δ)-differential privacy. This is the first result for privately learning mixtures of unbounded axis-aligned (or even unbounded univariate) Gaussians. If the covariance matrices of each of the Gaussians is the identity matrix, we show that ˜O(kd/α2+kdlog(1/δ)/αε) samples are sufficient.To prove our results, we design a new technique for privately learning mixture distributions. A class of distributions F is said to be list-decodable if there is an algorithm that, given "heavily corrupted" samples from f∈F, outputs a list of distributions one of which approximates f. We show that if F is privately list-decodable then we can learn mixtures of distributions in F. Finally, we show axis-aligned Gaussian distributions are privately list-decodable, thereby proving mixtures of such distributions are privately learnable.