Part of Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
Ilias Diakonikolas, Daniel Kane, Ankit Pensia, Thanasis Pittas, Alistair Stewart
We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples. Specifically, we are given a set T of labeled examples (x,y)∈Rd×R and a parameter 0<α<1/2 such that an α-fraction of the points in T are i.i.d. samples from a linear regression model with Gaussian covariates, and the remaining (1−α)-fraction of the points are drawn from an arbitrary noise distribution. The goal is to output a small list of hypothesis vectors such that at least one of them is close to the target regression vector. Our main result is a Statistical Query (SQ) lower bound of dpoly(1/α) for this problem. Our SQ lower bound qualitatively matches the performance of previously developed algorithms, providing evidence that current upper bounds for this task are nearly best possible.