Near-Minimax Optimal Classification with Dyadic Classification Trees

Part of Advances in Neural Information Processing Systems 16 (NIPS 2003)


Authors

Clayton Scott, Robert Nowak

Abstract

This paper reports on a family of computationally practical classifiers that converge to the Bayes error at near-minimax optimal rates for a variety of distributions. The classifiers are based on dyadic classification trees (DCTs), which involve adaptively pruned partitions of the feature space. A key aspect of DCTs is their spatial adaptivity, which enables local (rather than global) fitting of the decision boundary. Our risk analysis involves a spatial decomposition of the usual concentration inequalities, leading to a spatially adaptive, data-dependent pruning criterion. For any distribution on (X, Y) whose Bayes decision boundary behaves locally like a Lipschitz smooth function, we show that the DCT error converges to the Bayes error at a rate within a logarithmic factor of the minimax optimal rate. We also study DCTs equipped with polynomial classification rules at each leaf, and show that as the smoothness of the boundary increases their errors converge to the Bayes error at a rate approaching n^{-1/2}, the parametric rate. We are not aware of any other practical classifiers that provide similar rate of convergence guarantees. Fast algorithms for tree pruning are discussed.
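The sketch below is only a rough illustration of the dyadic-partition-plus-pruning idea described in the abstract, not the authors' algorithm or analysis: it grows a recursive dyadic tree on the unit square (each cell halved at the midpoint of one coordinate, cycling through coordinates with depth) and then prunes bottom-up against a penalized empirical error. The constant per-leaf penalty and all function names here are illustrative assumptions; the paper's actual criterion is spatially adaptive and data dependent.

```python
import numpy as np

def grow(X, y, bounds, depth, max_depth):
    """Grow a dyadic tree on [0,1]^d: each cell is halved at the midpoint
    of one coordinate, cycling through coordinates with depth."""
    label = int(y.mean() >= 0.5) if len(y) else 0
    node = {"label": label, "errors": int(np.sum(y != label)),
            "split": None, "children": None}
    if depth == max_depth or len(y) == 0:
        return node
    dim = depth % X.shape[1]
    lo, hi = bounds[dim]
    mid = 0.5 * (lo + hi)
    go_left = X[:, dim] <= mid
    lb, rb = list(bounds), list(bounds)
    lb[dim], rb[dim] = (lo, mid), (mid, hi)
    node["split"] = (dim, mid)
    node["children"] = (grow(X[go_left], y[go_left], lb, depth + 1, max_depth),
                        grow(X[~go_left], y[~go_left], rb, depth + 1, max_depth))
    return node

def prune(node, penalty):
    """Bottom-up pruning: keep a split only if the penalized training error
    of its subtree beats collapsing the node into a single leaf.  A constant
    per-leaf `penalty` stands in for the paper's spatially adaptive,
    data-dependent criterion."""
    if node["children"] is None:
        return node["errors"] + penalty
    subtree_cost = sum(prune(child, penalty) for child in node["children"])
    leaf_cost = node["errors"] + penalty
    if leaf_cost <= subtree_cost:
        node["split"], node["children"] = None, None  # collapse to a leaf
        return leaf_cost
    return subtree_cost

def predict(node, x):
    """Route a point through the pruned dyadic partition to its leaf label."""
    while node["split"] is not None:
        dim, mid = node["split"]
        node = node["children"][0] if x[dim] <= mid else node["children"][1]
    return node["label"]

# Toy usage: a noisy linear boundary on the unit square.
rng = np.random.default_rng(0)
X = rng.random((2000, 2))
y = (X[:, 0] + 0.3 * rng.standard_normal(2000) > X[:, 1]).astype(int)
tree = grow(X, y, [(0.0, 1.0), (0.0, 1.0)], depth=0, max_depth=8)
prune(tree, penalty=5.0)
preds = np.array([predict(tree, x) for x in X])
print(f"training accuracy: {np.mean(preds == y):.3f}")
```

Because every split is at a fixed dyadic midpoint rather than an optimized threshold, the tree's local resolution comes entirely from how deeply each region is left unpruned, which is the spatial adaptivity the abstract refers to.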