Direct Importance Estimation with Model Selection and Its Application to Covariate Shift Adaptation

Part of Advances in Neural Information Processing Systems 20 (NIPS 2007)



Masashi Sugiyama, Shinichi Nakajima, Hisashi Kashima, Paul Buenau, Motoaki Kawanabe


When training and test samples follow different input distributions (a situation called covariate shift), the maximum likelihood estimator is known to lose its consistency. To regain consistency, the log-likelihood terms need to be weighted according to the importance (i.e., the ratio of test and training input densities). Thus, accurately estimating the importance is one of the key tasks in covariate shift adaptation. A naive approach is to first estimate the training and test input densities and then estimate the importance as the ratio of the density estimates. However, since density estimation is a hard problem, this approach tends to perform poorly, especially in high-dimensional cases. In this paper, we propose a direct importance estimation method that does not require input density estimates. Our method is equipped with a natural model selection procedure, so that tuning parameters such as the kernel width can be objectively optimized. This is an advantage over a recently developed method of direct importance estimation. Simulations illustrate the usefulness of our approach.
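To make the role of the importance concrete, the following is a minimal sketch of importance-weighted fitting on a toy problem. It is not the estimation method proposed in the paper: it assumes both input densities are known Gaussians, so the importance can be computed exactly, and it uses a linear model with Gaussian noise, for which weighted maximum likelihood reduces to weighted least squares. The densities, the target function `np.sinc`, and the helper names `gaussian_pdf`, `importance`, and `weighted_linear_fit` are all illustrative assumptions.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian N(mean, std^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def importance(x):
    """Ratio of test to training input densities.

    Toy assumption: training inputs follow N(0, 1) and test inputs follow
    N(1, 0.5^2), so the ratio is known exactly. In practice it is unknown
    and has to be estimated from samples.
    """
    return gaussian_pdf(x, 1.0, 0.5) / gaussian_pdf(x, 0.0, 1.0)


def weighted_linear_fit(x, y, w):
    """Fit y ~ b0 + b1 * x by weighted least squares.

    Under Gaussian noise this is maximum likelihood with each
    log-likelihood term multiplied by its weight w_i.
    """
    X = np.column_stack([np.ones_like(x), x])
    XtW = X.T * w  # scales column i of X.T (sample i) by w_i
    return np.linalg.solve(XtW @ X, XtW @ y)


rng = np.random.default_rng(0)

# Training data drawn under the training input density.
x_train = rng.normal(0.0, 1.0, size=5000)
y_train = np.sinc(x_train) + rng.normal(0.0, 0.1, size=x_train.shape)

w_train = importance(x_train)
beta_unweighted = weighted_linear_fit(x_train, y_train, np.ones_like(x_train))
beta_weighted = weighted_linear_fit(x_train, y_train, w_train)
```

Because the linear model is misspecified for this target, the unweighted fit is tuned to the training input region, while the weighted fit is tuned to the region where test inputs concentrate. Replacing the exact `importance` with an estimate computed from training and test input samples is the step the abstract identifies as the key task.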