Amir Globerson, Gal Chechik, Fernando Pereira, Naftali Tishby
Embedding algorithms search for low dimensional structure in complex data, but most algorithms only handle objects of a single type for which pairwise distances are specified. This paper describes a method for em- bedding objects of different types, such as images and text, into a single common Euclidean space based on their co-occurrence statistics. The joint distributions are modeled as exponentials of Euclidean distances in the low-dimensional embedding space, which links the problem to con- vex optimization over positive semidefinite matrices. The local struc- ture of our embedding corresponds to the statistical correlations via ran- dom walks in the Euclidean space. We quantify the performance of our method on two text datasets, and show that it consistently and signifi- cantly outperforms standard methods of statistical correspondence mod- eling, such as multidimensional scaling and correspondence analysis.