Part of Advances in Neural Information Processing Systems 28 (NIPS 2015)
Wooseok Ha, Rina Foygel Barber
The robust principal component analysis (RPCA) problem seeks to separate low-rank trends from sparse outlierswithin a data matrix, that is, to approximate a $n\times d$ matrix $D$ as the sum of a low-rank matrix $L$ and a sparse matrix $S$.We examine the robust principal component analysis (RPCA) problem under data compression, wherethe data $Y$ is approximately given by $(L + S)\cdot C$, that is, a low-rank $+$ sparse data matrix that has been compressed to size $n\times m$ (with $m$ substantially smaller than the original dimension $d$) via multiplication witha compression matrix $C$. We give a convex program for recovering the sparse component $S$ along with the compressed low-rank component $L\cdot C$, along with upper bounds on the error of this reconstructionthat scales naturally with the compression dimension $m$ and coincides with existing results for the uncompressedsetting $m=d$. Our results can also handle error introduced through additive noise or through missing data.The scaling of dimension, compression, and signal complexity in our theoretical results is verified empirically through simulations, and we also apply our method to a data set measuring chlorine concentration acrossa network of sensors, to test its performance in practice.