{"title": "SerialRank: Spectral Ranking using Seriation", "book": "Advances in Neural Information Processing Systems", "page_first": 900, "page_last": 908, "abstract": "We describe a seriation algorithm for ranking a set of n items given pairwise comparisons between these items. Intuitively, the algorithm assigns similar rankings to items that compare similarly with all others. It does so by constructing a similarity matrix from pairwise comparisons, using seriation methods to reorder this matrix and construct a ranking. We first show that this spectral seriation algorithm recovers the true ranking when all pairwise comparisons are observed and consistent with a total order. We then show that ranking reconstruction is still exact even when some pairwise comparisons are corrupted or missing, and that seriation based spectral ranking is more robust to noise than other scoring methods. An additional benefit of the seriation formulation is that it allows us to solve semi-supervised ranking problems. Experiments on both synthetic and real datasets demonstrate that seriation based spectral ranking achieves competitive and in some cases superior performance compared to classical ranking methods.", "full_text": "SerialRank: Spectral Ranking using Seriation\n\nFajwel Fogel\n\nfogel@cmap.polytechnique.fr\n\nC.M.A.P., \u00b4Ecole Polytechnique,\n\nPalaiseau, France\n\nAlexandre d\u2019Aspremont\n\nCNRS & D.I., \u00b4Ecole Normale Sup\u00b4erieure\n\nParis, France\n\naspremon@ens.fr\n\nMilan Vojnovic\n\nMicrosoft Research,\n\nCambridge, UK\n\nmilanv@microsoft.com\n\nAbstract\n\nWe describe a seriation algorithm for ranking a set of n items given pairwise\ncomparisons between these items. Intuitively, the algorithm assigns similar rank-\nings to items that compare similarly with all others. It does so by constructing a\nsimilarity matrix from pairwise comparisons, using seriation methods to reorder\nthis matrix and construct a ranking. 
We \ufb01rst show that this spectral seriation al-\ngorithm recovers the true ranking when all pairwise comparisons are observed\nand consistent with a total order. We then show that ranking reconstruction is\nstill exact even when some pairwise comparisons are corrupted or missing, and\nthat seriation based spectral ranking is more robust to noise than other scoring\nmethods. An additional bene\ufb01t of the seriation formulation is that it allows us to\nsolve semi-supervised ranking problems. Experiments on both synthetic and real\ndatasets demonstrate that seriation based spectral ranking achieves competitive\nand in some cases superior performance compared to classical ranking methods.\n\n1\n\nIntroduction\n\nWe study the problem of ranking a set of n items given pairwise comparisons between these items.\nIn practice, the information about pairwise comparisons is usually incomplete, especially in the case\nof a large set of items, and the data may also be noisy, that is some pairwise comparisons could be\nincorrectly measured and incompatible with the existence of a total ordering.\nRanking is a classic problem but its formulations vary widely. For example, website ranking methods\nsuch as PageRank [Page et al., 1998] and HITS [Kleinberg, 1999] seek to rank web pages based on\nthe hyperlink structure of the web, where links do not necessarily express consistent preference\nrelationships (e.g. a can link to b and b can link c, and c can link to a). 
The setting we study here\ngoes back at least to [Kendall and Smith, 1940] and seeks to reconstruct a ranking between items\nfrom pairwise comparisons re\ufb02ecting a total ordering.\nIn this case, the directed graph of all pairwise comparisons, where every pair of vertices is connected\nby exactly one of two possible directed edges, is usually called a tournament graph in the theoretical\ncomputer science literature or a \u201cround robin\u201d in sports, where every player plays every other player\nonce and each preference marks victory or defeat. The motivation for this formulation often stems\nfrom the fact that in many applications, e.g. music, images, and movies, preferences are easier to\nexpress in relative terms (e.g. a is better than b) rather than absolute ones (e.g. a should be ranked\nfourth, and b seventh).\n\n1\n\n\fAssumptions about how the pairwise preference information is obtained also vary widely. A subset\nof preferences is measured adaptively in [Ailon, 2011; Jamieson and Nowak, 2011], while [Negah-\nban et al., 2012], for example, assume that preferences are observed iteratively, and [Freund et al.,\n2003] extract them at random. In other settings, the full preference matrix is observed, but is per-\nturbed by noise: in e.g. [Bradley and Terry, 1952; Luce, 1959; Herbrich et al., 2006], a parametric\nmodel is assumed over the set of permutations, which reformulates ranking as a maximum likelihood\nproblem.\nLoss function and algorithmic approaches vary as well. Kenyon-Mathieu and Schudy [2007], for\nexample, derive a PTAS for the minimum feedback arc set problem on tournaments, i.e. the problem\nof \ufb01nding a ranking that minimizes the number of upsets (a pair of players where the player ranked\nlower on the ranking beats the player ranked higher). In practice, the complexity of this method is\nrelatively high, and other authors [see e.g. 
Keener, 1993; Negahban et al., 2012] have been using\nspectral methods to produce more ef\ufb01cient algorithms (each pairwise comparison is understood as a\nlink pointing to the preferred item). Simple scoring methods such as the point difference rule [Huber,\n1963; Wauthier et al., 2013] produce ef\ufb01cient estimates at very low computational cost. Ranking\nhas also been approached as a prediction problem, i.e. learning to rank [Schapire and Singer, 1998],\nwith [Joachims, 2002] for example using support vector machines to learn a score function. Finally,\nin the Bradley-Terry-Luce framework, the maximum likelihood problem is usually solved using\n\ufb01xed point algorithms or EM-like majorization-minimization techniques [Hunter, 2004] for which\nno precise computational complexity bounds are known.\nHere, we show that the ranking problem is directly related to another classical ordering problem,\nnamely seriation: we are given a similarity matrix between a set of n items and assume that the items\ncan be ordered along a chain such that the similarity between items decreases with their distance\nwithin this chain (i.e. a total order exists). The seriation problem then seeks to reconstruct the\nunderlying linear ordering based on unsorted, possibly noisy, pairwise similarity information. Atkins\net al. [1998] produced a spectral algorithm that exactly solves the seriation problem in the noiseless\ncase, by showing that for similarity matrices computed from serial variables, the ordering of the\nsecond eigenvector of the Laplacian matrix (a.k.a. the Fiedler vector) matches that of the variables.\nIn practice, this means that spectral clustering exactly reconstructs the correct ordering provided\nitems are organized in a chain. Here, adapting these results to ranking produces a very ef\ufb01cient\npolynomial-time ranking algorithm with provable recovery and robustness guarantees. 
Furthermore,\nthe seriation formulation allows us to handle semi-supervised ranking problems. Fogel et al. [2013]\nshow that seriation is equivalent to the 2-SUM problem and study convex relaxations to seriation\nin a semi-supervised setting, where additional structural constraints are imposed on the solution.\nSeveral authors [Blum et al., 2000; Feige and Lee, 2007] have also focused on the directly related\nMinimum Linear Arrangement (MLA) problem, for which excellent approximation guarantees exist\nin the noisy case, albeit with very high polynomial complexity.\nThe main contributions of this paper can be summarized as follows. We link seriation and ranking by\nshowing how to construct a consistent similarity matrix based on consistent pairwise comparisons.\nWe then recover the true ranking by applying the spectral seriation algorithm in [Atkins et al., 1998]\nto this similarity matrix (we call this method SerialRank in what follows). In the noisy case, we\nthen show that spectral seriation can perfectly recover the true ranking even when some of the\npairwise comparisons are either corrupted or missing, provided that the pattern of errors is relatively\nunstructured. We show in particular that, in a regime where a high proportion of comparisons are\nobserved, some incorrectly, the spectral solution is more robust to noise than classical scoring-based\nmethods. Finally, we use the seriation results in [Fogel et al., 2013] to produce semi-supervised\nranking solutions.\nThe paper is organized as follows. In Section 2 we recall definitions related to seriation, and link\nranking and seriation by showing how to construct well-ordered similarity matrices from well-ranked\nitems. In Section 3 we apply the spectral algorithm of [Atkins et al., 1998] to reorder these similarity\nmatrices and reconstruct the true ranking in the noiseless case. 
In Section 4 we then show that this\nspectral solution remains exact in a noisy regime where a random subset of comparisons is corrupted.\nFinally, in Section 5 we illustrate our results on both synthetic and real datasets, and compare ranking\nperformance with classical maximum likelihood, spectral and scoring-based approaches. Auxiliary\ntechnical results are detailed in Appendix A.\n\n2 Seriation, Similarities & Ranking\n\nIn this section we first introduce the seriation problem, i.e. reordering items based on pairwise\nsimilarities. We then show how to write the problem of ranking given pairwise comparisons as a\nseriation problem.\n\n2.1 The Seriation Problem\n\nThe seriation problem seeks to reorder n items given a similarity matrix between these items, such\nthat the more similar two items are, the closer they should be. This is equivalent to supposing that\nitems can be placed on a chain where the similarity between two items decreases with the distance\nbetween these items in the chain. We formalize this below, following [Atkins et al., 1998].\n\nDefinition 2.1 We say that the matrix $A \\in \\mathbf{S}_n$ is an R-matrix (or Robinson matrix) if and only if it\nis symmetric and $A_{i,j} \\leq A_{i,j+1}$ and $A_{i+1,j} \\leq A_{i,j}$ in the lower triangle, where $1 \\leq j < i \\leq n$.\n\nAnother way to formulate R-matrix conditions is to impose $A_{ij} \\geq A_{kl}$ if $|i - j| \\leq |k - l|$ off-diagonal,\ni.e. the coefficients of A decrease as we move away from the diagonal. We also introduce\na definition for strict R-matrices A, whose rows/columns cannot be permuted without breaking the\nR-matrix monotonicity conditions. We call reverse identity permutation the permutation that puts\nrows and columns $\\{1, \\ldots, n\\}$ of a matrix A in reverse order $\\{n, n - 1, \\ldots, 1\\}$.\n\nDefinition 2.2 An R-matrix $A \\in \\mathbf{S}_n$ is called strict-R if and only if the identity and reverse identity\npermutations of A are the only permutations producing R-matrices.\n\nAny R-matrix with only strict R-constraints is a strict R-matrix. Following [Atkins et al., 1998], we\nwill say that A is pre-R if there is a permutation matrix $\\Pi$ such that $\\Pi A \\Pi^T$ is an R-matrix. Given\na pre-R matrix A, the seriation problem consists in finding a permutation $\\Pi$ such that $\\Pi A \\Pi^T$ is an\nR-matrix. Note that there might be several solutions to this problem. In particular, if a permutation\n$\\Pi$ is a solution, then the reverse permutation is also a solution. When only two permutations of A\nproduce R-matrices, A will be called pre-strict-R.\n\n2.2 Constructing Similarity Matrices from Pairwise Comparisons\n\nGiven an ordered input pairwise comparison matrix, we now show how to construct a similarity\nmatrix which is strict-R when all comparisons are given and consistent with the identity ranking\n(i.e. items are ranked in the increasing order of indices). This means that the similarity between\ntwo items decreases with the distance between their ranks. We will then be able to use the spectral\nseriation algorithm by [Atkins et al., 1998] described in Section 3 to recover the true ranking from a\ndisordered similarity matrix.\nWe first explain how to compute a pairwise similarity from binary comparisons between items by\ncounting the number of matching comparisons. Another formulation allows us to handle the\ngeneralized linear model.\n\n2.2.1 Similarities from Pairwise Comparisons\n\nSuppose we are given a matrix of pairwise comparisons $C \\in \\{-1, 0, 1\\}^{n \\times n}$ such that $C_{i,j} + C_{j,i} = 0$\nfor every $i \\neq j$ and\n\n$$C_{i,j} = \\begin{cases} 1 & \\text{if } i \\text{ is ranked higher than } j \\\\ 0 & \\text{if } i \\text{ and } j \\text{ are not compared or in a draw} \\\\ -1 & \\text{if } j \\text{ is ranked higher than } i \\end{cases} \\qquad (1)$$\n\nand, by convention, we define $C_{i,i} = 1$ for all $i \\in \\{1, \\ldots, n\\}$ ($C_{i,i}$ values have no effect in the\nranking method presented in algorithm SerialRank). We also define the pairwise similarity matrix\n$S^{\\text{match}}$ as\n\n$$S^{\\text{match}}_{i,j} = \\sum_{k=1}^n \\left( \\frac{1 + C_{i,k} C_{j,k}}{2} \\right). \\qquad (2)$$\n\nSince $C_{i,k} C_{j,k} = 1$ if $C_{i,k}$ and $C_{j,k}$ have the same sign, and $C_{i,k} C_{j,k} = -1$ if they have opposite\nsigns, $S^{\\text{match}}_{i,j}$ counts the number of matching comparisons between i and j with other reference\nitems k. If i or j is not compared with k, then $C_{i,k} C_{j,k} = 0$ and the term $(1 + C_{i,k} C_{j,k})/2$ has an\naverage effect on the similarity of 1/2. The intuition behind this construction is easy to understand\nin a tournament setting: players that beat the same players and are beaten by the same players should\nhave a similar ranking. We can write $S^{\\text{match}}$ in the following equivalent form\n\n$$S^{\\text{match}} = \\frac{1}{2} \\left( n \\mathbf{1} \\mathbf{1}^T + C C^T \\right). \\qquad (3)$$\n\nWithout loss of generality, we assume in the following propositions that items are ranked in increasing\norder of their indices (identity ranking). In the general case, we simply replace the strict-R\nproperty by the pre-strict-R property.\nThe next result shows that when all comparisons are given and consistent with the identity ranking,\nthen the similarity matrix $S^{\\text{match}}$ is a strict R-matrix.\n\nProposition 2.3 Given all pairwise comparisons $C_{i,j} \\in \\{-1, 0, 1\\}$ between items ranked according\nto the identity permutation (with no ties), the similarity matrix $S^{\\text{match}}$ constructed as given in (2) is\na strict R-matrix and\n\n$$S^{\\text{match}}_{ij} = n - (\\max\\{i, j\\} - \\min\\{i, j\\}) \\qquad (4)$$\n\nfor all $i, j = 1, \\ldots, n$.\n\n2.2.2 Similarities in the Generalized Linear Model\n\nSuppose that paired comparisons are generated according to a generalized linear model (GLM),\ni.e. we assume that the outcomes of paired comparisons are independent and for any pair of distinct\nitems, item i is observed to be preferred over item j with probability\n\n$$P_{i,j} = H(\\nu_i - \\nu_j) \\qquad (5)$$\n\nwhere $\\nu \\in \\mathbb{R}^n$ is a vector of strengths or skills parameters and $H : \\mathbb{R} \\to [0, 1]$ is a function that\nis increasing on $\\mathbb{R}$ and such that $H(-x) = 1 - H(x)$ for all $x \\in \\mathbb{R}$, and $\\lim_{x \\to -\\infty} H(x) = 0$\nand $\\lim_{x \\to \\infty} H(x) = 1$. A well known special instance of the generalized linear model is the\nBradley-Terry-Luce model for which $H(x) = 1/(1 + e^{-x})$, for $x \\in \\mathbb{R}$.\nLet $m_{i,j}$ be the number of times items i and j were compared, $C^s_{i,j} \\in \\{-1, 1\\}$ be the outcome of\ncomparison s and Q be the matrix of corresponding empirical probabilities, i.e. if $m_{i,j} > 0$ we have\n\n$$Q_{i,j} = \\frac{1}{m_{i,j}} \\sum_{s=1}^{m_{i,j}} \\frac{C^s_{i,j} + 1}{2}$$\n\nand $Q_{i,j} = 1/2$ in case $m_{i,j} = 0$. We then define the similarity matrix $S^{\\text{glm}}$ from the observations\nQ as\n\n$$S^{\\text{glm}}_{i,j} = \\sum_{k=1}^n \\left[ \\mathbf{1}_{\\{m_{i,k} m_{j,k} > 0\\}} \\left( 1 - \\frac{|Q_{i,k} - Q_{j,k}|}{2} \\right) + \\frac{\\mathbf{1}_{\\{m_{i,k} m_{j,k} = 0\\}}}{2} \\right]. \\qquad (6)$$\n\nSince the comparisons are independent we have that $Q_{i,j}$ converges to $P_{i,j}$ as $m_{i,j}$ goes to infinity\nand\n\n$$S^{\\text{glm}}_{i,j} \\to \\sum_{k=1}^n \\left( 1 - \\frac{|P_{i,k} - P_{j,k}|}{2} \\right).$$\n\nThe result below shows that this limit similarity matrix is a strict R-matrix when the variables are\nproperly ordered.\n\nProposition 2.4 If the items are ordered according to the order in decreasing values of the skill\nparameters, in the limit of large number of observations, the similarity matrix $S^{\\text{glm}}$ is a strict\nR-matrix.\n\nNotice that we recover the original definition of $S^{\\text{match}}$ in the case of binary probabilities, though\nit does not fit in the Generalized Linear Model. Note also that these definitions can be directly\nextended to the setting where multiple comparisons are available for each pair and aggregated in\ncomparisons that take fractional values (e.g. 
in a tournament setting where participants play several\ntimes against each other).\n\nAlgorithm 1 Using Seriation for Spectral Ranking (SerialRank)\nInput: A set of pairwise comparisons $C_{i,j} \\in \\{-1, 0, 1\\}$ or $[-1, 1]$.\n1: Compute a similarity matrix S as in \u00a72.2.\n2: Compute the Laplacian matrix\n\n$$L_S = \\operatorname{diag}(S \\mathbf{1}) - S \\qquad \\text{(SerialRank)}$$\n\n3: Compute the Fiedler vector of S.\nOutput: A ranking induced by sorting the Fiedler vector of S (choose either increasing or decreasing\norder to minimize the number of upsets).\n\n3 Spectral Algorithms\n\nWe first recall how the spectral clustering approach can be used to recover the true ordering in seriation\nproblems by computing an eigenvector, with computational complexity $O(n^2 \\log n)$ [Kuczynski\nand Wozniakowski, 1992]. We then apply this method to the ranking problem.\n\n3.1 Spectral Seriation Algorithm\n\nWe use the spectral computation method originally introduced in [Atkins et al., 1998] to solve the\nseriation problem based on the similarity matrices defined in the previous section. We first recall the\ndefinition of the Fiedler vector.\n\nDefinition 3.1 The Fiedler value of a symmetric, nonnegative and irreducible matrix A is the smallest\nnon-zero eigenvalue of its Laplacian matrix $L_A = \\operatorname{diag}(A \\mathbf{1}) - A$. The corresponding eigenvector\nis called the Fiedler vector and is the optimal solution to $\\min\\{y^T L_A y : y \\in \\mathbb{R}^n,\\ y^T \\mathbf{1} = 0,\\ \\|y\\|_2 = 1\\}$.\n\nThe main result from [Atkins et al., 1998], detailed below, shows how to reorder pre-R matrices in a\nnoise-free case.\n\nProposition 3.2 [Atkins et al., 1998, Th. 3.3] Let $A \\in \\mathbf{S}_n$ be an irreducible pre-R-matrix with a\nsimple Fiedler value and a Fiedler vector v with no repeated values. 
Let $\\Pi_1 \\in \\mathcal{P}$ (respectively, $\\Pi_2$)\nbe the permutation such that the permuted Fiedler vector $\\Pi_1 v$ is strictly increasing (decreasing).\nThen $\\Pi_1 A \\Pi_1^T$ and $\\Pi_2 A \\Pi_2^T$ are R-matrices, and no other permutations of A produce R-matrices.\n\n3.2 SerialRank: a Spectral Ranking Algorithm\n\nIn Section 2, we showed that similarities $S^{\\text{match}}$ and $S^{\\text{glm}}$ are pre-strict-R when all comparisons\nare available and consistent with an underlying ranking of items. We now use the spectral seriation\nmethod in [Atkins et al., 1998] to reorder these matrices and produce an output ranking. We call this\nalgorithm SerialRank and prove the following result.\n\nProposition 3.3 Given all pairwise comparisons for a set of totally ordered items and assuming\nthere are no ties between items, performing algorithm SerialRank, i.e. sorting the Fiedler vector of\nthe matrix $S^{\\text{match}}$ defined in (3), recovers the true ranking of items.\n\nSimilar results apply for $S^{\\text{glm}}$ when we are given enough comparisons in the Generalized Linear\nModel. This last result guarantees recovery of the true ranking of items in the noiseless case. In the\nnext section, we will study the impact of corrupted or missing comparisons on the inferred ranking\nof items.\n\n3.3 Hierarchical Ranking\n\nIn a large dataset, the goal may be to rank only a subset of top rank items. In this case, we can\nfirst perform spectral ranking (cheap) and then refine the ranking of the top set of items using either\nthe SerialRank algorithm on the top comparison submatrix, or another seriation algorithm such as\nthe convex relaxation in [Fogel et al., 2013]. This last method would also allow us to solve semi-\nsupervised ranking problems, given additional information on the structure of the solution.\n\n4 Robustness to Corrupted and Missing Comparisons\n\nIn this section we study the robustness of SerialRank using $S^{\\text{match}}$ with respect to noisy and missing\npairwise comparisons. 
We will see that noisy comparisons cause ranking ambiguities for the standard\npoint score method and that such ambiguities can be lifted by the spectral ranking algorithm.\nWe show in particular that the SerialRank algorithm recovers the exact ranking when the pattern of\nerrors is random and errors are not too numerous.\nWe define here the point score $w_i$ of an item i, also known as point-difference, or row-sum, as\n$w_i = \\sum_{k=1}^n C_{k,i}$, which corresponds to the number of wins minus the number of losses in a tournament\nsetting.\n\nProposition 4.1 Given all pairwise comparisons $C_{s,t} \\in \\{-1, 1\\}$ between items ranked according\nto their indices, suppose the signs of m comparisons indexed $(i_1, j_1), \\ldots, (i_m, j_m)$ are switched.\n\n1. For the case of one corrupted comparison, if $j_1 - i_1 > 2$ then the spectral ranking recovers\nthe true ranking whereas the standard point score method induces ties between the pairs of\nitems $(i_1, i_1 + 1)$ and $(j_1 - 1, j_1)$.\n\n2. For the general case of $m \\geq 1$ corrupted comparisons, suppose that the following condition\nholds true\n\n$$|i - j| > 2, \\quad \\text{for all } i, j \\in \\{i_1, \\ldots, i_m, j_1, \\ldots, j_m\\} \\text{ such that } i \\neq j, \\qquad (7)$$\n\nthen $S^{\\text{match}}$ is a strict R-matrix, and thus the spectral ranking recovers the true ranking\nwhereas the standard point score method induces ties between 2m pairs of items.\n\nFor the case of one corrupted comparison, note that the separation condition on the pair of items\n(i, j) is necessary. When the comparison $C_{i,j}$ between two adjacent items according to the true\nranking is corrupted, no ranking method can break the resulting tie. For the case of an arbitrary number\nof corrupted comparisons, condition (7) is a sufficient condition only.\nUsing similar arguments, we can also study conditions for recovering the true ranking in the case\nwith missing comparisons. These scenarios are actually slightly less restrictive than the noisy cases\nand are covered in the supplementary material. 
We now estimate the number of randomly corrupted\nentries that can be tolerated for perfect recovery of the true ranking.\n\nProposition 4.2 Given a comparison matrix for a set of n items with m corrupted comparisons\nselected uniformly at random from the set of all possible item pairs, algorithm SerialRank guarantees\nthat the probability of recovery p(n, m) satisfies $p(n, m) \\geq 1 - \\delta$, provided that $m = O(\\sqrt{\\delta n})$. In\nparticular, this implies that $p(n, m) = 1 - o(1)$ provided that $m = o(\\sqrt{n})$.\n\nFigure 1: The matrix of pairwise comparisons C (far left) when the rows are ordered according to\nthe true ranking. The corresponding similarity matrix $S^{\\text{match}}$ is a strict R-matrix (center left). The\nsame $S^{\\text{match}}$ similarity matrix with comparison (3,8) corrupted (center right). With one corrupted\ncomparison, $S^{\\text{match}}$ keeps enough strict R-constraints to recover the right permutation. In the noiseless\ncase, the difference between all coefficients is at least one and after introducing an error, the\ncoefficients inside the green rectangles still enforce strict R-constraints (far right).\n\n5 Numerical Experiments\n\nWe conducted numerical experiments using both synthetic and real datasets to compare the performance\nof SerialRank with several classical ranking methods.\nSynthetic Datasets The first synthetic dataset consists of a binary matrix of pairwise comparisons\nderived from a given ranking of n items with uniform, randomly distributed corrupted or missing\nentries. A second synthetic dataset consists of a full matrix of pairwise comparisons derived from\na given ranking of n items, with added uncertainty for items which are sufficiently close in the\ntrue ranking of items. Specifically, given a positive integer m, we let $C_{i,j} = 1$ if $i < j - m$,\n$C_{i,j} \\sim \\mathrm{Unif}[-1, 1]$ if $|i - j| \\leq m$, and $C_{i,j} = -1$ if $i > j + m$. 
In Figure 2, we measure the Kendall $\\tau$\ncorrelation coefficient between the true ranking and the retrieved ranking, when varying either the\npercentage of corrupted comparisons or the percentage of missing comparisons. Kendall\u2019s $\\tau$ counts\nthe number of agreeing pairs minus the number of disagreeing pairs between two rankings, scaled\nby the total number of pairs, so that it takes values between -1 and 1. Experiments were performed\nwith n = 100 and reported Kendall $\\tau$ values were averaged over 50 experiments, with standard\ndeviation less than 0.02 for points of interest (i.e. here with Kendall $\\tau > 0.8$).\n\nFigure 2: Kendall $\\tau$ (higher is better) for SerialRank (SR, full red line), row-sum (PS, [Wauthier\net al., 2013] dashed blue line), rank centrality (RC [Negahban et al., 2012] dashed green line), and\nmaximum likelihood (BTL [Bradley and Terry, 1952], dashed magenta line). In the first synthetic\ndataset, we vary the proportion of corrupted comparisons (top left), the proportion of observed comparisons\n(top right) and the proportion of observed comparisons, with 20% of comparisons being\ncorrupted (bottom left). We also vary the parameter m in the second synthetic dataset (bottom right).\n\nReal Datasets The first real dataset consists of pairwise comparisons derived from outcomes in\nthe TopCoder algorithm competitions. We collected data from 103 competitions among 2742 coders\nover a period of about one year. Pairwise comparisons are extracted from the ranking of each competition\nand then averaged for each pair. 
TopCoder maintains ratings for each participant, updated\nin an online scheme after each competition, which were also included in the benchmarks. To measure\nperformance in Figure 3, we compute the percentage of upsets (i.e. comparisons disagreeing\nwith the computed ranking), which is closely related to the Kendall $\\tau$ (by an affine transformation if\ncomparisons were coming from a consistent ranking). We refine this metric by considering only the\nparticipants appearing in the top k, for various values of k, i.e. computing\n\n$$l_k = \\frac{1}{|\\mathcal{C}_k|} \\sum_{(i,j) \\in \\mathcal{C}_k} \\mathbf{1}_{\\{r(i) > r(j)\\}} \\mathbf{1}_{\\{C_{i,j} < 0\\}}, \\qquad (8)$$\n\nwhere $\\mathcal{C}_k$ is the set of pairs (i, j) that are compared and such that i, j are both ranked in the top k, and r(i)\nis the rank of i. Up to scaling, this is the loss considered in [Kenyon-Mathieu and Schudy, 2007].\n\nFigure 3: Percentage of upsets (i.e. disagreeing comparisons, lower is better) defined in (8), for\nvarious values of k and ranking methods, on TopCoder (left) and football data (right).\n\nSemi-Supervised Ranking We illustrate here how, in a semi-supervised setting, one can interactively\nenforce some constraints on the retrieved ranking, using e.g. the semi-supervised seriation\nalgorithm in [Fogel et al., 2013]. We compute rankings of England Football Premier League teams\nfor season 2013-2014 (cf. figure 4 in Appendix for previous seasons). Comparisons are defined as\nthe averaged outcome (win, loss, or tie) of home and away games for each pair of teams. 
As shown\nin Table 1, the top half of the SerialRank ranking is very close to the official ranking calculated by\nsorting the sum of points for each team (3 points for a win, 1 point for a tie). However, there are\nsignificant variations in the bottom half, though the number of upsets is roughly the same as for\nthe official ranking. To test semi-supervised ranking, suppose for example that we are not satisfied\nwith the ranking of Aston Villa (last team when ranked by the spectral algorithm). We can then explicitly\nenforce that Aston Villa appears before Cardiff, as in the official ranking. In the ranking based on\nthe corresponding semi-supervised seriation problem, Aston Villa is not last anymore, though the\nnumber of disagreeing comparisons remains just as low (cf. Figure 3, right).\n\nTable 1: Ranking of teams in the England premier league season 2013-2014.\n\nOfficial | Row-sum | RC | BTL | SerialRank | Semi-Supervised\nMan City (86) | Man City | Liverpool | Man City | Man City | Man City\nLiverpool (84) | Liverpool | Arsenal | Liverpool | Chelsea | Chelsea\nChelsea (82) | Chelsea | Man City | Chelsea | Liverpool | Liverpool\nArsenal (79) | Arsenal | Chelsea | Arsenal | Arsenal | Everton\nEverton (72) | Everton | Everton | Everton | Everton | Arsenal\nTottenham (69) | Tottenham | Tottenham | Tottenham | Tottenham | Tottenham\nMan United (64) | Man United | Man United | Man United | Southampton | Man United\nSouthampton (56) | Southampton | Southampton | Southampton | Man United | Southampton\nStoke (50) | Stoke | Stoke | Stoke | Stoke | Newcastle\nNewcastle (49) | Newcastle | Newcastle | Newcastle | Swansea | Stoke\nCrystal Palace (45) | Crystal Palace | Swansea | Crystal Palace | Newcastle | West Brom\nSwansea (42) | Swansea | Crystal Palace | Swansea | West Brom | Swansea\nWest Ham (40) | West Brom | West Ham | West Brom | Hull | Crystal Palace\nAston Villa (38) | West Ham | Hull | West Ham | West Ham | Hull\nSunderland (38) | Aston Villa | Aston Villa | Aston Villa | Cardiff | West Ham\nHull (37) | Sunderland | West Brom | Sunderland | Crystal Palace | Fulham\nWest Brom (36) | Hull | Sunderland | Hull | Fulham | Norwich\nNorwich (33) | Norwich | Fulham | Norwich | Norwich | Sunderland\nFulham (32) | Fulham | Norwich | Fulham | Sunderland | Aston Villa\nCardiff (30) | Cardiff | Cardiff | Cardiff | Aston Villa | Cardiff\n\nAcknowledgments FF, AA and MV would like to acknowledge support from a European Research\nCouncil starting grant (project SIPA) and support from the MSR-INRIA joint centre.\n\nReferences\n\nAilon, N. [2011], Active learning ranking from pairwise preferences with almost optimal query\ncomplexity, in \u2018NIPS\u2019, pp. 810\u2013818.\n\nAtkins, J., Boman, E., Hendrickson, B. et al. [1998], \u2018A spectral algorithm for seriation and the\nconsecutive ones problem\u2019, SIAM J. Comput. 28(1), 297\u2013310.\n\nBlum, A., Konjevod, G., Ravi, R. and Vempala, S. [2000], \u2018Semidefinite relaxations for minimum\nbandwidth and other vertex ordering problems\u2019, Theoretical Computer Science 235(1), 25\u201342.\n\nBradley, R. A. and Terry, M. E. [1952], \u2018Rank analysis of incomplete block designs: I. the method\nof paired comparisons\u2019, Biometrika pp. 324\u2013345.\n\nFeige, U. and Lee, J. R. [2007], \u2018An improved approximation ratio for the minimum linear arrangement\nproblem\u2019, Information Processing Letters 101(1), 26\u201329.\n\nFogel, F., Jenatton, R., Bach, F. and d\u2019Aspremont, A. [2013], \u2018Convex relaxations for permutation\nproblems\u2019, NIPS 2013, arXiv:1306.4805.\n\nFreund, Y., Iyer, R., Schapire, R. E. and Singer, Y. [2003], \u2018An efficient boosting algorithm for\ncombining preferences\u2019, The Journal of machine learning research 4, 933\u2013969.\n\nHerbrich, R., Minka, T. and Graepel, T. [2006], TrueSkill\u2122: A Bayesian skill rating system, in\n\u2018Advances in Neural Information Processing Systems\u2019, pp. 569\u2013576.\n\nHuber, P. J. 
[1963], \u2018Pairwise comparison and ranking: optimum properties of the row sum proce-\n\ndure\u2019, The annals of mathematical statistics pp. 511\u2013520.\n\nHunter, D. R. [2004], \u2018MM algorithms for generalized bradley-terry models\u2019, Annals of Statistics\n\npp. 384\u2013406.\n\nJamieson, K. G. and Nowak, R. D. [2011], Active ranking using pairwise comparisons., in \u2018NIPS\u2019,\n\nVol. 24, pp. 2240\u20132248.\n\nJoachims, T. [2002], Optimizing search engines using clickthrough data, in \u2018Proceedings of the\neighth ACM SIGKDD international conference on Knowledge discovery and data mining\u2019, ACM,\npp. 133\u2013142.\n\nKeener, J. P. [1993], \u2018The perron-frobenius theorem and the ranking of football teams\u2019, SIAM review\n\n35(1), 80\u201393.\n\nKendall, M. G. and Smith, B. B. [1940], \u2018On the method of paired comparisons\u2019, Biometrika 31(3-\n\n4), 324\u2013345.\n\nKenyon-Mathieu, C. and Schudy, W. [2007], How to rank with few errors, in \u2018Proceedings of the\n\nthirty-ninth annual ACM symposium on Theory of computing\u2019, ACM, pp. 95\u2013103.\n\nKleinberg, J. [1999], \u2018Authoritative sources in a hyperlinked environment\u2019, Journal of the ACM\n\n46, 604\u2013632.\n\nKuczynski, J. and Wozniakowski, H. [1992], \u2018Estimating the largest eigenvalue by the power and\n\nLanczos algorithms with a random start\u2019, SIAM J. Matrix Anal. Appl 13(4), 1094\u20131122.\n\nLuce, R. [1959], Individual choice behavior, Wiley.\nNegahban, S., Oh, S. and Shah, D. [2012], Iterative ranking from pairwise comparisons., in \u2018NIPS\u2019,\n\npp. 2483\u20132491.\n\nPage, L., Brin, S., Motwani, R. and Winograd, T. [1998], \u2018The pagerank citation ranking: Bringing\n\norder to the web\u2019, Stanford CS Technical Report .\n\nSchapire, W. W. C. R. E. and Singer, Y. [1998], Learning to order things, in \u2018Advances in Neural\nInformation Processing Systems 10: Proceedings of the 1997 Conference\u2019, Vol. 10, MIT Press,\np. 
451.\n\nWauthier, F. L., Jordan, M. I. and Jojic, N. [2013], Ef\ufb01cient ranking from pairwise comparisons, in\n\n\u2018Proceedings of the 30th International Conference on Machine Learning (ICML)\u2019.\n\n9\n\n\f", "award": [], "sourceid": 574, "authors": [{"given_name": "Fajwel", "family_name": "Fogel", "institution": "\u00c9cole Polytechnique"}, {"given_name": "Alexandre", "family_name": "d'Aspremont", "institution": "CNRS - ENS"}, {"given_name": "Milan", "family_name": "Vojnovic", "institution": "Microsoft Research"}]}