Part of Advances in Neural Information Processing Systems 18 (NIPS 2005)
Miroslav Dudík, Steven Phillips, Robert E. Schapire
We study the problem of maximum entropy density estimation in the presence of known sample selection bias. We propose three bias correction approaches. The first one takes advantage of unbiased sufficient statistics which can be obtained from biased samples. The second one estimates the biased distribution and then factors the bias out. The third one approximates the second by only using samples from the sampling distribution. We provide guarantees for the first two approaches and evaluate the performance of all three approaches in synthetic experiments and on real data from species habitat modeling, where maxent has been successfully applied and where sample selection bias is a significant problem.
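The "factoring out" step of the second approach can be illustrated with a minimal sketch. It assumes a finite domain and a known, strictly positive sampling distribution, so that the biased distribution is proportional to the product of the target distribution and the sampling distribution. The function name `factor_out_bias` and the toy numbers are hypothetical and used only for illustration. The exact biased distribution stands in for a maxent estimate here, so this is not the authors' implementation.

```python
def factor_out_bias(biased_probs, sampling_probs):
    """Divide a biased distribution by the known sampling distribution
    and renormalize. Assumes a finite domain (illustrative sketch only)."""
    if len(biased_probs) != len(sampling_probs):
        raise ValueError("distributions must be defined on the same domain")
    if any(s <= 0 for s in sampling_probs):
        raise ValueError("sampling distribution must be strictly positive")
    # Undo the bias pointwise, then rescale so the result sums to 1.
    weights = [q / s for q, s in zip(biased_probs, sampling_probs)]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example (made-up numbers): a target distribution and a sampling bias.
true_pi = [0.5, 0.3, 0.2]
sampling = [0.2, 0.2, 0.6]

# Biased distribution: proportional to true_pi(x) * sampling(x).
unnormalized = [p * s for p, s in zip(true_pi, sampling)]
z = sum(unnormalized)
biased = [u / z for u in unnormalized]

# In practice `biased` would be an estimate fitted to biased samples;
# the exact biased distribution is used here, so the target is recovered.
recovered = factor_out_bias(biased, sampling)
```

With an estimated rather than exact biased distribution, the recovered distribution is only as accurate as that estimate, which is the setting the guarantees mentioned in the abstract address.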