Nikhil Ghosh, Yuxin Chen, Yisong Yue
In this paper, we aim to learn a low-dimensional Euclidean representation from a set of constraints of the form “item j is closer to item i than item k”. Existing approaches for this “ordinal embedding” problem require expensive optimization procedures, which cannot scale to handle increasingly larger datasets. To address this issue, we propose a landmark-based strategy, which we call Landmark Ordinal Embedding (LOE). Our approach trades off statistical efficiency for computational efficiency by exploiting the low-dimensionality of the latent embedding. We derive bounds establishing the statistical consistency of LOE under the popular Bradley- Terry-Luce noise model. Through a rigorous analysis of the computational complexity, we show that LOE is significantly more efficient than conventional ordinal embedding approaches as the number of items grows. We validate these characterizations empirically on both synthetic and real datasets. We also present a practical approach that achieves the “best of both worlds”, by using LOE to warm-start existing methods that are more statistically efficient but computationally expensive.