Federated Principal Component Analysis

Part of Advances in Neural Information Processing Systems 33 pre-proceedings (NeurIPS 2020)

Bibtex »Paper »Supplemental »

Bibtek download is not availble in the pre-proceeding


Andreas Grammenos, Rodrigo Mendoza Smith, Jon Crowcroft, Cecilia Mascolo


We present a federated, asynchronous, and $(\varepsilon, \delta)$-differentially private algorithm for $\PCA$ in the memory-limited setting. % Our algorithm incrementally computes local model updates using a streaming procedure and adaptively estimates its $r$ leading principal components when only $\mathcal{O}(dr)$ memory is available with $d$ being the dimensionality of the data. % We guarantee differential privacy via an input-perturbation scheme in which the covariance matrix of a dataset $\B{X} \in \R^{d \times n}$ is perturbed with a non-symmetric random Gaussian matrix with variance in $\mathcal{O}\left(\left(\frac{d}{n}\right)^2 \log d \right)$, thus improving upon the state-of-the-art. % Furthermore, contrary to previous federated or distributed algorithms for $\PCA$, our algorithm is also invariant to permutations in the incoming data, which provides robustness against straggler or failed nodes. % Numerical simulations show that, while using limited-memory, our algorithm exhibits performance that closely matches or outperforms traditional non-federated algorithms, and in the absence of communication latency, it exhibits attractive horizontal scalability.