We investigate a new kernel-based classifier: the Kernel Fisher Discriminant (KFD). A mathematical programming formulation, based on the observation that KFD maximizes the average margin, permits an interesting modification of the original KFD algorithm, yielding the sparse KFD. We find that both KFD and the proposed sparse KFD can be understood in a unifying probabilistic context. Furthermore, we show connections to Support Vector Machines and Relevance Vector Machines. From this understanding, we are able to outline an interesting kernel-regression technique based upon the KFD algorithm. Simulations support the usefulness of our approach.