Part of Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Main Conference Track
Ilias Diakonikolas, Daniel Kane, Jasper Lee, Ankit Pensia, Thanasis Pittas
We study the problem of list-decodable Gaussian covariance estimation. Given a multiset T of n points in R^d such that an unknown α < 1/2 fraction of the points in T are i.i.d. samples from an unknown Gaussian N(μ, Σ), the goal is to output a list of O(1/α) hypotheses, at least one of which is close to Σ in relative Frobenius norm. Our main result is a poly(d, 1/α) sample and time algorithm for this task that guarantees a relative Frobenius norm error of poly(1/α). Importantly, our algorithm relies purely on spectral techniques. As a corollary, we obtain an efficient spectral algorithm for robust partial clustering of Gaussian mixture models (GMMs), a key ingredient in the recent work of [BakDJKKV22] on robustly learning arbitrary GMMs. Combined with the other components of [BakDJKKV22], our new method yields the first Sum-of-Squares-free algorithm for robustly learning GMMs, resolving an open problem proposed by Vempala and Kothari. At the technical level, we develop a novel multi-filtering method for list-decodable covariance estimation that may be useful in other settings.
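The problem setup and the error metric can be made concrete with a small numerical sketch. The snippet below is an illustration of the data model only, not the paper's algorithm: it draws an α fraction of inliers from N(μ, Σ), mixes in outliers (here a simple far-away point cloud stands in for adversarial corruption), and compares the relative Frobenius norm error ||Σ^{-1/2}(Σ̂ − Σ)Σ^{-1/2}||_F of the naive empirical covariance against that of an oracle that knows the inliers. All concrete choices (dimension, α, the outlier distribution, the helper name `rel_frobenius_error`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, alpha = 5, 10_000, 0.3  # illustrative dimension, sample size, inlier fraction

# Unknown ground-truth Gaussian N(mu, Sigma); Sigma is an arbitrary PSD matrix.
mu = np.zeros(d)
A = rng.standard_normal((d, d))
Sigma = A @ A.T + np.eye(d)

# Multiset T: an alpha fraction of i.i.d. inliers, the rest arbitrary outliers.
n_in = int(alpha * n)
inliers = rng.multivariate_normal(mu, Sigma, size=n_in)
outliers = 20.0 * rng.standard_normal((n - n_in, d))  # stand-in for adversarial points
T = np.vstack([inliers, outliers])

def rel_frobenius_error(Sigma_hat, Sigma):
    """Relative Frobenius norm ||Sigma^{-1/2} (Sigma_hat - Sigma) Sigma^{-1/2}||_F."""
    w, V = np.linalg.eigh(Sigma)
    S_inv_half = V @ np.diag(w ** -0.5) @ V.T
    return np.linalg.norm(S_inv_half @ (Sigma_hat - Sigma) @ S_inv_half)

naive = np.cov(T, rowvar=False)        # empirical covariance of the corrupted set
oracle = np.cov(inliers, rowvar=False) # covariance of the (unknown) inlier set

print("naive error: ", rel_frobenius_error(naive, Sigma))
print("oracle error:", rel_frobenius_error(oracle, Sigma))
```

Because the outliers dominate the second moment, the naive estimate is far from Σ in this metric while the oracle estimate is close; the list-decoding guarantee asks for O(1/α) candidate matrices, at least one of which achieves poly(1/α) error without knowing which points are inliers.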