Part of Advances in Neural Information Processing Systems 22 (NIPS 2009)
Grzegorz Swirszcz, Naoki Abe, Aurelie C. Lozano
We consider the problem of variable group selection for least squares regression, namely, that of selecting groups of variables for best regression performance, leveraging and adhering to a natural grouping structure within the explanatory variables. We show that this problem can be efficiently addressed by using a certain greedy style algorithm. More precisely, we propose the Group Orthogonal Matching Pursuit algorithm (Group-OMP), which extends the standard OMP procedure (also referred to as ``forward greedy feature selection algorithm for least squares regression) to perform stage-wise group variable selection. We prove that under certain conditions Group-OMP can identify the correct (groups of) variables. We also provide an upperbound on the $l_\infty$ norm of the difference between the estimated regression coefficients and the true coefficients. Experimental results on simulated and real world datasets indicate that Group-OMP compares favorably to Group Lasso, OMP and Lasso, both in terms of variable selection and prediction accuracy.