
Learning a Single Neuron Robustly to Distributional Shifts and Adversarial Label Noise

Part of Advances in Neural Information Processing Systems 37 (NeurIPS 2024) Main Conference Track


Authors

Shuyao Li, Sushrut Karmalkar, Ilias Diakonikolas, Jelena Diakonikolas

Abstract

We study the problem of learning a single neuron with respect to the L₂² loss in the presence of adversarial distribution shifts, where the labels can be arbitrary, and the goal is to find a "best-fit" function. More precisely, given training samples from a reference distribution p₀, the goal is to approximate the vector w* which minimizes the squared loss with respect to the worst-case distribution that is close in χ²-divergence to p₀. We design a computationally efficient algorithm that recovers a vector ŵ satisfying E_{p*}[(σ(ŵ · x) − y)²] ≤ C · E_{p*}[(σ(w* · x) − y)²] + ε, where C > 1 is a dimension-independent constant and (w*, p*) is the witness attaining the min-max risk min_w max_p E_p[(σ(w · x) − y)²], with the max taken over distributions p close in χ²-divergence to p₀. Our algorithm follows the primal-dual framework and is designed by directly bounding the risk with respect to the original, nonconvex L₂² loss. From an optimization standpoint, our work opens new avenues for the design of primal-dual algorithms under structured nonconvexity.
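To make the inner maximization concrete, here is a minimal sketch of evaluating the worst-case squared loss of a single ReLU neuron over a χ²-divergence ball around the empirical distribution. It uses the standard variance-expansion closed form for χ²-constrained distributionally robust risk (valid when the maximizing sample weights remain nonnegative); the ReLU activation, the function name, and the radius parameter `rho` are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def worst_case_sq_loss(w, X, y, rho):
    """Worst-case expected squared loss of a ReLU neuron over a
    chi-square ball of radius rho around the empirical distribution.

    Illustrative sketch: uses the closed form
        mean(loss) + sqrt(rho * var(loss)),
    which holds when the maximizing weights stay nonnegative.
    """
    # Per-sample squared losses (sigma = ReLU is an assumption here).
    losses = (np.maximum(X @ w, 0.0) - y) ** 2
    # Variance expansion of the chi-square-constrained worst-case risk.
    return losses.mean() + np.sqrt(rho * losses.var())

# Usage: rho = 0 recovers the ordinary empirical risk; rho > 0
# upweights high-loss samples, modeling an adversarial shift.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
w = rng.standard_normal(5)
y = np.maximum(X @ w, 0.0) + 0.1 * rng.standard_normal(100)
print(worst_case_sq_loss(w, X, y, rho=0.0))
print(worst_case_sq_loss(w, X, y, rho=0.5))
```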