Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Andreas Grammenos, Rodrigo Mendoza Smith, Jon Crowcroft, Cecilia Mascolo
We present a federated, asynchronous, and $(\varepsilon, \delta)$-differentially private algorithm for PCA in the memory-limited setting. Our algorithm incrementally computes local model updates using a streaming procedure and adaptively estimates the $r$ leading principal components when only $\mathcal{O}(dr)$ memory is available, with $d$ being the dimensionality of the data. We guarantee differential privacy via an input-perturbation scheme in which the covariance matrix of a dataset $\mathbf{X} \in \mathbb{R}^{d \times n}$ is perturbed with a non-symmetric random Gaussian matrix with variance in $\mathcal{O}\big((d/n)^2 \log d\big)$, thus improving upon the state-of-the-art. Furthermore, contrary to previous federated or distributed algorithms for PCA, our algorithm is also invariant to permutations in the incoming data, which provides robustness against straggler or failed nodes. Numerical simulations show that, while using limited memory, our algorithm exhibits performance that closely matches or outperforms traditional non-federated algorithms, and in the absence of communication latency, it exhibits attractive horizontal scalability.
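To make the input-perturbation idea concrete, here is a minimal sketch in Python/NumPy of perturbing a covariance matrix with a non-symmetric Gaussian matrix before extracting the top-$r$ components. The function name `dp_input_perturbed_pca` and the noise scale `sigma` are illustrative assumptions: the sketch only realizes a variance on the order of $(d/n)^2 \log d$ as stated in the abstract, omits the exact calibration constant depending on $(\varepsilon, \delta)$ from the paper's analysis, and materializes the full $d \times d$ covariance rather than the paper's $\mathcal{O}(dr)$ streaming procedure.

```python
import numpy as np

def dp_input_perturbed_pca(X, r, rng=None):
    """Illustrative sketch of input-perturbation DP-PCA.

    X   : data matrix of shape (d, n), columns are samples.
    r   : number of leading principal components to estimate.

    NOTE: the noise scale below is an assumption for illustration; it
    matches the O((d/n)^2 log d) variance order from the abstract but
    omits the (eps, delta)-dependent calibration from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    d, n = X.shape

    # Empirical covariance (d x d); the paper instead processes X as a
    # stream using only O(dr) memory.
    C = (X @ X.T) / n

    # Non-symmetric Gaussian perturbation with entrywise std ~ (d/n) sqrt(log d),
    # i.e. variance on the order of (d/n)^2 log d.
    sigma = (d / n) * np.sqrt(np.log(d))
    N = rng.normal(0.0, sigma, size=(d, d))
    C_priv = C + N

    # Since C_priv is non-symmetric, take its top-r left singular vectors
    # as the estimated leading principal components.
    U, _, _ = np.linalg.svd(C_priv)
    return U[:, :r]

# Example usage on synthetic data:
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 1000))       # d = 50 dimensions, n = 1000 samples
V = dp_input_perturbed_pca(X, r=5, rng=rng)
print(V.shape)                         # (50, 5)
```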