Covariance Estimation for High Dimensional Data Vectors Using the Sparse Matrix Transform

Part of Advances in Neural Information Processing Systems 21 (NIPS 2008)

Bibtex Metadata Paper


Guangzhi Cao, Charles Bouman


Covariance estimation for high dimensional vectors is a classically difficult problem in statistical analysis and machine learning due to limited sample size. In this paper, we propose a new approach to covariance estimation, which is based on constrained maximum likelihood (ML) estimation of the covariance. Specifically, the covariance is constrained to have an eigen decomposition which can be represented as a sparse matrix transform (SMT). The SMT is formed by a product of pairwise coordinate rotations known as Givens rotations. Using this framework, the covariance can be efficiently estimated using greedy minimization of the log likelihood function, and the number of Givens rotations can be efficiently computed using a cross-validation procedure. The estimator obtained using this method is always positive definite and well-conditioned even with limited sample size. Experiments on hyperspectral data show that SMT covariance estimation results in consistently better estimates of the covariance for a variety of different classes and sample sizes compared to traditional shrinkage estimators.