Part of Advances in Neural Information Processing Systems 14 (NIPS 2001)
Guy Lebanon, John Lafferty
We derive an equivalence between AdaBoost and the dual of a convex optimization problem, showing that the only difference between mini- mizing the exponential loss used by AdaBoost and maximum likelihood for exponential models is that the latter requires the model to be normal- ized to form a conditional probability distribution over labels. In addi- tion to establishing a simple and easily understood connection between the two methods, this framework enables us to derive new regularization procedures for boosting that directly correspond to penalized maximum likelihood. Experiments on UCI datasets support our theoretical analy- sis and give additional insight into the relationship between boosting and logistic regression.