Gert Lanckriet, Laurent Ghaoui, Chiranjib Bhattacharyya, Michael Jordan
When constructing a classifier, the probability of correct classification of future data points should be maximized. In the current paper this desideratum is translated in a very direct way into an optimization problem, which is solved using methods from convex optimization. We also show how to exploit Mercer kernels in this setting to obtain nonlinear decision boundaries. A worst-case bound on the probability of misclassification of future data is obtained explicitly.