Part of Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
Jayadev Acharya, Clement Canonne, Yuhan Liu, Ziteng Sun, Himanshu Tyagi
We obtain tight minimax rates for the problem of distributed estimation of discrete distributions under communication constraints, where $n$ users observing $m $ samples each can broadcast only $\ell$ bits. Our main result is a tight characterization (up to logarithmic factors) of the error rate as a function of $m$, $\ell$, the domain size, and the number of users under most regimes of interest. While previous work focused on the setting where each user only holds one sample, we show that as $m$ grows the $\ell_1$ error rate gets reduced by a factor of $\sqrt{m}$ for small $m$. However, for large $m$ we observe an interesting phase transition: the dependence of the error rate on the communication constraint $\ell$ changes from $1/\sqrt{2^{\ell}}$ to $1/\sqrt{\ell}$.