Part of Advances in Neural Information Processing Systems 17 (NIPS 2004)
Tao Xiong, Jieping Ye, Qi Li, Ravi Janardan, Vladimir Cherkassky
Linear Discriminant Analysis (LDA) is a well-known method for feature extraction and dimension reduction. It has been widely used in many applications such as face recognition. Recently, a novel LDA algorithm based on QR decomposition, namely LDA/QR, has been proposed; it is competitive in classification accuracy with other LDA algorithms, but has much lower time and space costs. However, LDA/QR is based on a linear projection, which may not be suitable for data with nonlinear structure. This paper first proposes an algorithm called KDA/QR, which extends the LDA/QR algorithm to deal with nonlinear data by using the kernel operator. Then an efficient approximation of KDA/QR, called AKDA/QR, is proposed. Experiments on face image data show that the classification accuracy of both KDA/QR and AKDA/QR is competitive with Generalized Discriminant Analysis (GDA), a general kernel discriminant analysis algorithm, while AKDA/QR has much lower time and space costs.
Linear Discriminant Analysis is a well-known method for dimension reduction. It has been widely used in many applications such as face recognition. Classical LDA aims to find an optimal transformation by simultaneously minimizing the within-class distance and maximizing the between-class distance, thus achieving maximum discrimination. The optimal transformation can be readily obtained from an eigen-decomposition of the scatter matrices.
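The eigen-decomposition route can be sketched as follows. This is a minimal NumPy illustration of the classical procedure (form the within- and between-class scatter matrices, then solve the generalized eigenproblem), not the implementation used in any of the cited algorithms; the function name `lda_fit` and the pseudo-inverse guard for a singular within-class scatter are our own choices.

```python
import numpy as np

def lda_fit(X, y):
    """Classical LDA sketch: find directions maximizing between-class
    scatter relative to within-class scatter.
    X: (n, d) data matrix; y: (n,) integer class labels."""
    classes = np.unique(y)
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for k in classes:
        Xk = X[y == k]
        mk = Xk.mean(axis=0)
        Sw += (Xk - mk).T @ (Xk - mk)
        diff = (mk - mean).reshape(-1, 1)
        Sb += len(Xk) * (diff @ diff.T)
    # Generalized eigenproblem Sb w = lambda Sw w; pinv guards against
    # a singular Sw (e.g. undersampled data such as face images).
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-evals.real)
    # At most c-1 discriminant directions carry nonzero eigenvalues.
    return evecs[:, order[:len(classes) - 1]].real
```

Projecting data through the returned matrix (`X @ W`) yields the reduced representation on which a simple classifier (e.g. nearest centroid) is then applied.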
Although LDA works well for linear problems, it may be less effective when severe nonlinearity is involved. To deal with this limitation, nonlinear extensions through kernel functions have been proposed. The main idea of kernel-based methods is to map the input data to a feature space through a nonlinear mapping, where the inner products in the feature space can be computed by a kernel function without knowing the nonlinear mapping explicitly. Kernel Principal Component Analysis (KPCA), Kernel Fisher Discriminant Analysis (KFDA), and Generalized Discriminant Analysis (GDA) are, respectively, kernel-based nonlinear extensions of the well-known PCA, FDA, and LDA methods.
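The kernel trick can be illustrated concretely with a small example of our own (not drawn from the paper): for 2-D input, the degree-2 polynomial kernel k(x, z) = (x·z + 1)² equals the inner product under an explicit 6-dimensional feature map, yet evaluating the kernel never constructs that map.

```python
import numpy as np

def poly2_features(x):
    """Explicit degree-2 polynomial feature map phi(x) for 2-D input,
    chosen so that <phi(x), phi(z)> = (x . z + 1)^2."""
    x1, x2 = x
    return np.array([1.0,
                     np.sqrt(2.0) * x1, np.sqrt(2.0) * x2,
                     x1 * x1, x2 * x2,
                     np.sqrt(2.0) * x1 * x2])

def poly2_kernel(x, z):
    """Same inner product, computed directly -- no explicit mapping."""
    return (np.dot(x, z) + 1.0) ** 2

x = np.array([1.0, 2.0])
z = np.array([0.5, -1.0])
assert np.isclose(poly2_features(x) @ poly2_features(z), poly2_kernel(x, z))
```

For richer kernels such as the Gaussian (RBF) kernel, the implicit feature space is infinite-dimensional, so the kernel evaluation is the only practical route.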
To our knowledge, there are few efficient algorithms for general kernel-based discriminant analysis -- most known algorithms effectively scale as O(n³), where n is the sample size. In [6, 8], S. Mika et al. made a first attempt to speed up KFDA through a greedy approximation technique. However, that algorithm was developed for the binary classification problem; for the multi-class problem, the authors suggested the one-against-the-rest scheme, considering all two-class problems.
Recently, an efficient variant of LDA, namely LDA/QR, was proposed in [11, 12]. The essence of LDA/QR is the use of a QR decomposition of a small matrix. The time complexity of LDA/QR is linear in both the size of the training data and the dimensionality of the data. Moreover, experiments in [11, 12] show that the classification accuracy of LDA/QR is competitive with other LDA algorithms.
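The two-stage idea behind LDA/QR can be sketched as follows. This is our own minimal reconstruction of the scheme described in [11, 12] -- a QR decomposition of the d×c class-centroid matrix, followed by a small c×c eigenproblem on the reduced scatter matrices -- and details such as centroid weighting may differ from the published algorithm.

```python
import numpy as np

def lda_qr(X, y):
    """Sketch of the two-stage LDA/QR idea.
    Stage 1: QR-decompose the (d x c) class-centroid matrix.
    Stage 2: solve a small (c x c) eigenproblem on the reduced
    within- and between-class scatter matrices."""
    classes = np.unique(y)
    c = len(classes)
    mean = X.mean(axis=0)
    # Stage 1: centroid matrix C is d x c; reduced QR costs O(d c^2).
    C = np.column_stack([X[y == k].mean(axis=0) for k in classes])
    Q, _ = np.linalg.qr(C)  # orthonormal basis for the centroid span
    # Stage 2: scatter matrices restricted to span(Q) are only c x c.
    Sb_r = np.zeros((c, c))
    Sw_r = np.zeros((c, c))
    for k in classes:
        Xk = X[y == k]
        mk = Xk.mean(axis=0)
        b = Q.T @ (mk - mean)
        Sb_r += len(Xk) * np.outer(b, b)
        W = Q.T @ (Xk - mk).T
        Sw_r += W @ W.T
    evals, V = np.linalg.eig(np.linalg.pinv(Sw_r) @ Sb_r)
    order = np.argsort(-evals.real)
    return Q @ V[:, order].real  # final d x c transformation
```

Because the expensive eigen-decomposition is confined to a c×c matrix, the overall cost stays linear in both n and d, which is the property the kernel extensions below aim to preserve.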
In this paper, we first propose an algorithm, namely KDA/QR, which is a nonlinear extension of LDA/QR. Since KDA/QR involves the whole kernel matrix, which is not scalable for large datasets, we also propose an approximation of KDA/QR, namely AKDA/QR. A distinct property of AKDA/QR is that it scales as O(ndc), where n is the size of the data, d is the dimension of the data, and c is the number of classes.
We apply the proposed algorithms to face image datasets and compare them with LDA/QR and Generalized Discriminant Analysis (GDA), a general method for kernel discriminant analysis. Experiments show that: (1) AKDA/QR is competitive with KDA/QR and GDA in classification; (2) both KDA/QR and AKDA/QR outperform LDA/QR in classification; and (3) AKDA/QR has much lower time and space costs than GDA.