Part of Advances in Neural Information Processing Systems 15 (NIPS 2002)
Jean-Philippe Vert, Minoru Kanehisa
We present an algorithm to extract features from high-dimensional gene expression profiles, based on the knowledge of a graph which links together genes known to participate in successive reactions in metabolic pathways. Motivated by the intuition that biologically relevant features are likely to exhibit smoothness with respect to the graph topology, the algorithm involves encoding the graph and the set of expression profiles into kernel functions, and performing a generalized form of canonical correlation analysis in the corresponding reproducing kernel Hilbert spaces. Function prediction experiments for the genes of the yeast S. cerevisiae validate this approach by showing a consistent increase in performance when a state-of-the-art classifier uses the vector of features instead of the original expression profile to predict the functional class of a gene.
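The pipeline summarized in the abstract (one kernel for the graph, one kernel for the expression profiles, then a generalized canonical correlation analysis between the two) can be illustrated with a minimal sketch. The abstract does not specify the kernels or the exact form of the correlation analysis, so everything concrete here is an assumption made for illustration: a diffusion kernel on the graph, a Gaussian kernel on the profiles, a ridge-style regularization parameter `delta`, the toy path graph, the random expression matrix, and all function names.

```python
import numpy as np

def center_kernel(K):
    """Center a kernel matrix in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def diffusion_kernel(adjacency, tau=1.0):
    """Assumed graph kernel: exp(-tau * L), with L the graph Laplacian."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    return (eigvecs * np.exp(-tau * eigvals)) @ eigvecs.T

def gaussian_kernel(X, sigma=1.0):
    """Assumed expression kernel: Gaussian RBF between profiles (rows of X)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def regularized_kernel_cca(K1, K2, delta=0.1, n_features=2):
    """Regularized kernel CCA (one possible generalized form, assumed here).

    Maximizes a' K1 K2 b subject to |(K1 + delta I) a| = |(K2 + delta I) b| = 1,
    solved through an SVD of (K1 + delta I)^-1 K1 K2 (K2 + delta I)^-1.
    """
    n = K1.shape[0]
    K1c, K2c = center_kernel(K1), center_kernel(K2)
    R1 = K1c + delta * np.eye(n)
    R2 = K2c + delta * np.eye(n)
    M = np.linalg.solve(R1, K1c) @ K2c @ np.linalg.inv(R2)
    U, s, Vt = np.linalg.svd(M)
    alpha = np.linalg.solve(R1, U[:, :n_features])
    beta = np.linalg.solve(R2, Vt[:n_features].T)
    # Feature values for each gene, one column per extracted feature.
    return K1c @ alpha, K2c @ beta, s[:n_features]

# Toy data (hypothetical): 6 genes on a path graph, 10 expression measurements.
rng = np.random.default_rng(0)
adjacency = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
expression = rng.normal(size=(6, 10))

K_graph = diffusion_kernel(adjacency, tau=1.0)
K_expr = gaussian_kernel(expression, sigma=3.0)
graph_features, expr_features, corr = regularized_kernel_cca(K_graph, K_expr)
```

In this sketch the columns of `expr_features` play the role of the extracted feature vector that a downstream classifier would use in place of the raw expression profile; `corr` holds the regularized correlations, which shrink toward zero as `delta` grows.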