Part of Advances in Neural Information Processing Systems 24 (NIPS 2011)
Agnostic Selective Classification
Yair Wiener, Ran El-Yaniv
For a learning problem whose associated excess loss class is $(\beta,B)$-Bernstein, we show that it is theoretically possible to match the classification performance of the best (unknown) hypothesis in the class, provided that we are free to abstain from prediction in some region of our choice. The (probabilistic) volume of this rejected region of the domain is shown to diminish at rate $O(B\theta(\sqrt{1/m})^{\beta})$, where $m$ is the training sample size and $\theta$ is Hanneke's disagreement coefficient. The strategy achieving this performance faces computational barriers because it requires empirical error minimization in an agnostic setting. Nevertheless, we heuristically approximate this strategy and develop a novel selective classification algorithm using constrained SVMs. We show empirically that the resulting algorithm consistently outperforms the traditional rejection mechanism based on distance from the decision boundary.
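For reference, the sketch below illustrates the traditional baseline against which the paper compares: abstain on every test point whose distance from a single SVM's decision boundary falls below a threshold, then measure coverage (fraction of points predicted on) and selective error (error on the accepted region). It assumes scikit-learn, and the threshold `tau` is an illustrative parameter chosen here; this is not the paper's constrained-SVM algorithm.

```python
# A minimal sketch (not the authors' method) of distance-from-boundary
# rejection: abstain when the SVM margin magnitude is below `tau`.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary classification data, split into train and test halves.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

clf = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)

# Signed distance (up to scaling) of each test point from the boundary.
margin = clf.decision_function(X_te)

tau = 0.5  # illustrative rejection threshold; |margin| < tau is rejected
accept = np.abs(margin) >= tau

coverage = accept.mean()  # fraction of test points we actually predict on
if accept.any():
    selective_error = (clf.predict(X_te[accept]) != y_te[accept]).mean()
else:
    selective_error = float("nan")  # everything was rejected

print(f"coverage = {coverage:.3f}, selective error = {selective_error:.3f}")
```

The paper's proposed mechanism differs from this baseline in that, roughly, the rejected region is constructed with constrained SVMs to heuristically approximate the disagreement region of low-empirical-error hypotheses, rather than being a fixed margin band around a single classifier's boundary.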