{"title": "On Sample Complexity Upper and Lower Bounds for Exact Ranking from Noisy Comparisons", "book": "Advances in Neural Information Processing Systems", "page_first": 10014, "page_last": 10024, "abstract": "This paper studies the problem of finding the exact ranking from noisy comparisons. A noisy comparison over a set of $m$ items produces a noisy outcome about the most preferred item, and reveals some information about the ranking. By repeatedly and adaptively choosing items to compare, we want to fully rank the items with a certain confidence, and use as few comparisons as possible. Different from most previous works, in this paper, we have three main novelties: (i) compared to prior works, our upper bounds (algorithms) and lower bounds on the sample complexity (aka number of comparisons) require the minimal assumptions on the instances, and are not restricted to specific models; (ii) we give lower bounds and upper bounds on instances with \\textit{unequal} noise levels; and (iii) this paper aims at the \\textit{exact} ranking without knowledge on the instances, while most of the previous works either focus on approximate rankings or study exact ranking but require prior knowledge. We first derive lower bounds for pairwise ranking (i.e., compare two items each time), and then propose (nearly) \\textit{optimal} pairwise ranking algorithms. We further make extensions to listwise ranking (i.e., comparing multiple items each time). Numerical results also show our improvements against the state of the art.", "full_text": "On Sample Complexity Upper and Lower Bounds for\n\nExact Ranking from Noisy Comparisons\n\nDepartment of Computer Science & Engineering\n\nDepartment of Computer Science\n\nWenbo Ren\n\nThe Ohio State University\n\nren.453@osu.edu\n\nJia Liu\n\nIowa State University\njialiu@iastate.edu\n\nDepartment of Electrical & Computer Engineering and Computer Science & Engineering\n\nNess B. 
Shroff\n\nThe Ohio State University\n\nshroff.11@osu.edu\n\nAbstract\n\nThis paper studies the problem of \ufb01nding the exact ranking from noisy comparisons.\nA noisy comparison over a set of m items produces a noisy outcome about the most\npreferred item, and reveals some information about the ranking. By repeatedly\nand adaptively choosing items to compare, we want to fully rank the items with\na certain con\ufb01dence, and use as few comparisons as possible. Different from\nmost previous works, in this paper, we have three main novelties: (i) compared\nto prior works, our upper bounds (algorithms) and lower bounds on the sample\ncomplexity (aka number of comparisons) require the minimal assumptions on the\ninstances, and are not restricted to speci\ufb01c models; (ii) we give lower bounds and\nupper bounds on instances with unequal noise levels; and (iii) this paper aims at\nthe exact ranking without knowledge on the instances, while most of the previous\nworks either focus on approximate rankings or study exact ranking but require prior\nknowledge. We \ufb01rst derive lower bounds for pairwise ranking (i.e., compare two\nitems each time), and then propose (nearly) optimal pairwise ranking algorithms.\nWe further make extensions to listwise ranking (i.e., comparing multiple items each\ntime). Numerical results also show our improvements against the state of the art.\n\n1\n\nIntroduction\n\nBackground and motivation: Ranking from noisy comparisons has been a canonical problem in\nthe machine learning community, and has found applications in various areas such as social choices\n[8], web search [9], crowd sourcing [4], and recommendation systems [3]. The main goal of ranking\nproblems is to recover the full or partial rankings of a set of items from noisy comparisons. 
The items can refer to various things, such as products, movies, pages, and advertisements, and the comparisons refer to tests or queries about the items\u2019 strengths or the users\u2019 preferences. In this paper, we use the words \u201citem\u201d, \u201ccomparison\u201d, and \u201cpreference\u201d for simplicity. A comparison involves two (i.e., pairwise) or multiple (i.e., listwise) items, and returns a noisy result about the most preferred one, where \u201cnoisy\u201d means that the comparison outcome is random and the returned item may not be the most preferred one. A noisy comparison reveals some information about the ranking of the items. This information can be used to describe users\u2019 preferences, which helps applications such as recommendation, decision making, and advertising. One example is e-commerce: a user\u2019s click or purchase of a product (but not others) is based on a noisy (due to the lack of full information) comparison between several similar products, and one can rank the products based on the noisy outcomes of the clicks or the purchases to give better recommendations. \n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.\n\n
Due to these wide applications, in this paper we do not focus on specific applications and regard comparisons as black-box procedures.\nThis paper studies active (or adaptive) ranking, where the learner adaptively chooses items to compare based on previous comparison results, and returns a ranking when it has enough confidence. Previous works [4, 28] have shown that, compared to non-adaptive ranking, active ranking can significantly reduce the number of comparisons needed to achieve a similar confidence or accuracy. In some applications such as news apps, the servers are able to adaptively choose news to present to the users and collect feedback, by which they can learn the users\u2019 preferences in a shorter time than non-adaptive methods and may provide a better user experience.\nWe focus on the active full ranking problem, that is, to find the exact full ranking with a certain confidence level by adaptively choosing the items to compare, while using as few comparisons as possible. The comparisons can be either pairwise (i.e., comparing two items each time) or listwise (i.e., comparing more than two items each time). We are interested in the upper and lower bounds on the sample complexity (aka the number of comparisons needed). We are also interested in understanding whether using listwise comparisons can reduce the sample complexity.\nModels and problem statement: There are n items in total, indexed by 1, 2, 3, ..., n. Given a comparison over a set S, each item i \u2208 S has probability pi,S of being returned as the most preferred one (also referred to as i \u201cwinning\u201d this comparison), and when a tie happens, we randomly assign one item as the winner, which makes \u2211i\u2208S pi,S = 1 for all sets S \u2282 [n]. When |S| = 2, we say this comparison is pairwise, and when |S| > 2, we say it is listwise. In this paper, a comparison is said to be m-wise if it involves exactly m items (i.e., |S| = m). 
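As a concrete illustration of this comparison model, the black-box oracle can be simulated in a few lines (a minimal Python sketch; the instance and the name `noisy_compare` are ours, not from the paper):

```python
import random

def noisy_compare(p, S, rng=random):
    """Simulate one noisy comparison over the set S: item i is returned
    (i.e., "wins") with probability p_{i,S}; the p_{i,S} over S sum to 1."""
    key = frozenset(S)
    items = sorted(S)
    weights = [p[(i, key)] for i in items]
    # Draw one winner proportionally to the comparison probabilities.
    return rng.choices(items, weights=weights, k=1)[0]

# A pairwise instance with p_{1,2} = 0.8 (so item 1 is truly preferred).
p = {(1, frozenset({1, 2})): 0.8, (2, frozenset({1, 2})): 0.2}
rng = random.Random(0)
wins_of_1 = sum(noisy_compare(p, {1, 2}, rng) == 1 for _ in range(1000))
```

Repeating the query many times, item 1 wins roughly 80% of the comparisons, which is exactly the information an active ranking algorithm has to work with.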
For m = 2 and a two-element set S = {i, j}, to simplify notation, we define pi,j := pi,S and pj,i := pj,S.\nAssumptions. In this paper, we make the following assumptions. A1) Comparisons are independent across items, sets, and time. We note that the assumption of independence is common in this area (e.g., [10, 11, 12, 15, 16, 22, 31, 32, 34, 35]). A2) There is a unique permutation (r1, r2, ..., rn) of [n]\u00b9 such that r1 \u227b r2 \u227b \u00b7\u00b7\u00b7 \u227b rn, where i \u227b j denotes that i ranks higher than j (i.e., i is more preferred than j). We refer to this unique permutation as the true ranking or exact ranking, and our goal is to recover the true ranking. A3) For any set S and item i \u2208 S, if i ranks higher than all other items k in S, then pi,S > pk,S. For pairwise comparisons, A3 states that i \u227b j if and only if pi,j > 1/2. We note that for pairwise comparisons, A3 can be viewed as weak stochastic transitivity [33]. The three assumptions are necessary to make the exact ranking (i.e., finding the unique true ranking) problem meaningful, and thus, we say our assumptions are minimal. Beyond the above three assumptions, we do not assume any prior knowledge of the pi,S values. We note that any comparison model can be fully described by the comparison probabilities (pi,S : i \u2208 S, S \u2282 [n]).\nWe further define some notation. Two items i and j are said to be adjacent if, in the true ranking, there does not exist an item k such that i \u227b k \u227b j or j \u227b k \u227b i. For all items i and j in [n], define \u2206i,j := |pi,j \u2212 1/2|, \u2206i := minj\u2260i \u2206i,j, and \u02dc\u2206i := min{\u2206i,j : i and j are adjacent}. We adopt the notion of strong stochastic transitivity (SST) [11]: for all items i, j, and k satisfying i \u227b j \u227b k, it holds that pi,k \u2265 max{pi,j, pj,k}. Under the SST condition, we have \u2206i = \u02dc\u2206i for all items i. 
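The gap quantities just defined can be computed directly from a pairwise-probability table (a minimal Python sketch; the example instance is ours, not from the paper):

```python
def gaps(p, ranking):
    """Given p[i][j] = P(i beats j) and the true ranking (most preferred
    first), return Delta_i = min_{j != i} |p[i][j] - 1/2| and
    tilde-Delta_i = the same minimum restricted to adjacent items."""
    n = len(ranking)
    pos = {item: k for k, item in enumerate(ranking)}
    delta_i, tilde = {}, {}
    for i in ranking:
        ds = {j: abs(p[i][j] - 0.5) for j in ranking if j != i}
        delta_i[i] = min(ds.values())
        # adjacent items are the immediate neighbours in the true ranking
        adj = [ranking[k] for k in (pos[i] - 1, pos[i] + 1) if 0 <= k < n]
        tilde[i] = min(ds[j] for j in adj)
    return delta_i, tilde

# Example: 1 > 2 > 3, satisfying SST (p[i][k] dominates p[i][j] and p[j][k]).
p = {1: {2: 0.7, 3: 0.9}, 2: {1: 0.3, 3: 0.6}, 3: {1: 0.1, 2: 0.4}}
delta_i, tilde = gaps(p, [1, 2, 3])
```

On this SST instance the two gap notions coincide, matching the statement above that SST implies Delta_i = tilde-Delta_i for every item.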
We note that this paper is not restricted to the SST condition. Pairwise (listwise) ranking refers to ranking from pairwise (listwise) comparisons. In this paper, f \u2272 g means f = O(g), f \u2273 g means f = \u2126(g), and f \u224d g means f = \u0398(g). The meanings of O(\u00b7), o(\u00b7), \u2126(\u00b7), \u03c9(\u00b7), and \u0398(\u00b7) are standard in the sense of Bachmann-Landau notation with respect to (n, \u03b4\u207b\u00b9, \u03b5\u207b\u00b9, \u2206\u207b\u00b9, \u03b7\u207b\u00b9, (\u2206i,j\u207b\u00b9 : i \u2260 j)). For any a, b \u2208 R, define a \u2227 b := min{a, b} and a \u2228 b := max{a, b}.\nProblem (Exact ranking). Given \u03b4 \u2208 (0, 1/2) and n items, one wants to determine the true ranking with probability at least 1 \u2212 \u03b4 by adaptively choosing sets of items to compare.\nDefinition 1 (\u03b4-correct algorithms). An algorithm is said to be \u03b4-correct for a problem if, for any input instance of this problem, it returns a correct result in finite time with probability at least 1 \u2212 \u03b4.\nMain results: First, for \u03b4-correct pairwise ranking algorithms with no prior knowledge of the instances, we derive a sample-complexity lower bound\u00b2 of the form \u2126(\u2211i\u2208[n] \u2206i\u207b\u00b2 (log log \u2206i\u207b\u00b9 + log(n/\u03b4))), which is shown to be tight (up to constant factors) under SST and some mild conditions.\n\u00b9For any positive integer k, define [k] := {1, 2, ..., k} to simplify notation.\n\u00b2All logs in this paper, unless explicitly noted, are natural logs.\nSecond, for pairwise and listwise ranking under the multinomial logit (MNL) model, we derive a model-specific lower bound, which is tight (up to constant factors) under some mild conditions, and shows that in the worst case, the listwise lower bound is no lower than the pairwise one.\nThird, we propose a pairwise ranking algorithm that requires no prior information and minimal assumptions on the instances, and 
its sample-complexity upper bound matches the lower bounds proved in this paper under the SST condition and some mild conditions, implying that both the upper and lower bounds are optimal.\n\n2 Related works\n\nDating back to 1994, the authors of [14] studied noisy ranking under the strict constraint that pi,j \u2265 1/2 + \u2206 for any i \u227b j, where \u2206 > 0 is known a priori. They showed that any \u03b4-correct algorithm needs \u0398(n\u2206\u207b\u00b2 log(n/\u03b4)) comparisons for the worst instances. However, in some cases, it is impossible to either assume knowledge of \u2206 or require pi,j \u2265 1/2 + \u2206 for any i \u227b j. Also, their bounds only depend on the minimal gap \u2206 but not on the \u2206i,j\u2019s or \u2206i\u2019s, and hence are not tight in most cases. In contrast, our algorithms require no knowledge of the gaps (i.e., the \u2206i,j\u2019s), and we establish sample-complexity lower and upper bounds that are based on unequal gaps, which can be much tighter when the \u2206i\u2019s vary a lot.\nAnother line of research explores probably approximately correct (PAC) ranking (which aims at finding a permutation (r1, r2, ..., rn) of [n] such that pri,rj \u2265 1/2 \u2212 \u03b5 for all i < j, where \u03b5 > 0 is a given error tolerance) under various pairwise comparison models [10, 11, 12, 29, 31, 32, 35]. When \u03b5 > 0, the PAC ranking may not be unique. The authors of [10, 11, 12] proposed algorithms with an O(n\u03b5\u207b\u00b2 log(n/\u03b4)) upper bound for PAC ranking with tolerance \u03b5 > 0 under SST and the stochastic triangle inequality\u00b3 (STI). When \u03b5 goes to zero, the PAC ranking reduces to the true ranking. However, when \u03b5 > 0, we still need some prior knowledge of (pi,j : i, j \u2208 [n]) to get the true ranking, as we need to know a lower bound on the values of \u2206i,j to ensure that the PAC ranking equals the unique true ranking. When \u03b5 = 0, the algorithms in [10, 11, 12] do not work. 
Prior to these works, the authors of [35] also studied PAC ranking. In their work, with \u03b5 = 0, the unique true ranking can be found with O(n log n \u00b7 maxi\u2208[n]{\u2206i\u207b\u00b2 log(n\u03b4\u207b\u00b9\u2206i\u207b\u00b9)}) comparisons, which is higher than the lower bound and upper bound proved in this paper by at least a log factor.\nIn contrast, this paper is focused on recovering the unique true (exact) ranking, and there are three major motivations. First, in some applications, we prefer to find the exact order, especially in \u201cwinner-takes-all\u201d situations. For example, when predicting the winner of an election, we prefer to get the exact result but not the PAC one, as only a few votes can completely change the result. Second, analyzing the exact ranking can help us better understand the instance-wise upper and lower bounds for ranking problems, while the bounds for PAC ranking (e.g., in [10, 11, 12]) may only work for the worst cases. Third, exact ranking algorithms may better exploit the large gaps (e.g., the \u2206i\u2019s) to achieve lower sample complexities. In fact, when finding the PAC ranking, we can run the exact ranking algorithm and the PAC ranking algorithm in parallel, and return a ranking whenever one of them returns. By this, when \u03b5 is large, we benefit from the PAC upper bounds that depend on \u03b5\u207b\u00b2, and when \u03b5 is small, we benefit from the exact ranking bounds that depend on \u2206i\u207b\u00b2.\nThere are also other interesting active ranking works. The authors of [15, 16, 22, 34] studied active ranking under the Borda-score model, where the Borda-score of item i is defined as (1/(n\u22121)) \u2211j\u2260i pi,j. We note that the Borda-score model does not satisfy A2 and A3 and is not comparable with the model in this paper. There are also many works on best item(s) selection, including [1, 5, 7, 19, 26, 27, 32], which are less related to this paper.\n\n3 Lower bound analysis\n\n3.1 Generic lower bound for \u03b4-correct algorithms\n\nIn this subsection, we establish a sample-complexity lower bound for pairwise ranking. The lower bound is for \u03b4-correct algorithms, which have a performance guarantee for all input instances. There are algorithms that work faster than our lower bound but only return correct results with 1 \u2212 \u03b4 confidence for a restricted class of instances, which is discussed in Section A.1 of the supplementary material. Theorem 2 states the lower bound, and its full proof is provided in the supplementary material. Here we remind the reader that \u02dc\u2206i := min{\u2206i,j : i and j are adjacent}.\n\u00b3Stochastic triangle inequality means that for all items i, j, k with i \u227b j \u227b k, \u2206i,k \u2264 \u2206i,j + \u2206j,k.\nTheorem 2 (Lower bound for pairwise ranking). Given \u03b4 \u2208 (0, 1/12) and an instance I with n items, the number of comparisons used by a \u03b4-correct algorithm A with no prior knowledge about the gaps of I is lower bounded by\n\n\u2126( \u2211i\u2208[n] \u02dc\u2206i\u207b\u00b2 (log log \u02dc\u2206i\u207b\u00b9 + log(1/\u03b4)) + min{ \u2211i\u2208[n] \u02dc\u2206i\u207b\u00b2 log(1/xi) : \u2211i\u2208[n] xi \u2264 1 } ).   (1)\n\nIf \u03b4 \u2272 1/poly(n)\u2074, or maxi,j\u2208[n]{\u02dc\u2206i/\u02dc\u2206j} \u2272 n^(1/2\u2212p) for some constant p > 0, then the lower bound becomes\n\n\u2126( \u2211i\u2208[n] \u02dc\u2206i\u207b\u00b2 (log log \u02dc\u2206i\u207b\u00b9 + log(n/\u03b4)) ).   (2)\n\nRemark: (i) When the instance satisfies the SST condition (the algorithm does not need to know this information), the bound in Eq. 
(2) is tight (up to a constant factor) under the given condition, which will be shown in Theorem 12 later. (ii) The lower bound in Eq. (1) implies an n log n term in min{\u00b7}, which can be checked by the convexity of log(1/xi) and Jensen\u2019s inequality, which yields \u2211i\u2208[n] log(1/xi) \u2265 n log(n/\u2211i\u2208[n] xi) \u2265 n log n. (iii) The lower bound in (2) may not hold if the required conditions do not hold, which is discussed in Section A.2 of the supplementary material.\n\nProof sketch of Theorem 2. Due to space limitations, we outline the basic idea of the proof here and refer readers to the supplementary material for details. Our first step is to use the results in [13, 18, 25] to establish a lower bound for ranking two items. Then, it seems straightforward that the lower bound for ranking n items can be obtained by summing up the lower bounds for ranking {q1, q2}, {q2, q3}, ..., {qn\u22121, qn}, where q1 \u227b q2 \u227b \u00b7\u00b7\u00b7 \u227b qn is the true ranking. However, note that to rank qi and qj, there may be an algorithm that compares qi and qj with other items such as qk, and uses the comparison outcomes over {qi, qk} and {qj, qk} to determine the order of qi and qj. Since it is unclear to what degree comparing qi and qj with other items can help to rank qi and qj, the lower bound for ranking n items cannot simply be obtained by summing up the lower bounds for ranking two items. To overcome this challenge, our strategy is to construct two problems, P1 and P2, with decreasing influence of this type of comparison. Then, we prove that P1 reduces to exact ranking and that P2 reduces to P1. Third, we prove a lower bound on \u03b4-correct algorithms for solving P2, which yields a lower bound for exact ranking. Finally, we use this lower bound to get the desired lower bounds in Eq. (1) and Eq. 
(2).\n\n3.2 Model-specific lower bound\n\nIn Section 3.1, we provided a lower bound for \u03b4-correct algorithms that do not require any knowledge of the instances except Assumptions A1 to A3. However, in some applications, people may focus on a specific model, and hence the algorithm may have further knowledge about the instances, such as the model\u2019s restrictions. In that case, the lower bound in Theorem 2 may no longer be applicable\u2075.\nIn this paper, we derive a model-specific lower bound for the MNL model. The MNL model can be applied to both pairwise and listwise comparisons. For pairwise comparisons, the MNL model is mathematically equivalent to the Bradley-Terry-Luce (BTL) model [24] and the Plackett-Luce (PL) model [35]. There have been many prior works that focus on active ranking based on this model (e.g., [5, 6, 7, 15, 19, 27, 31, 35]).\nUnder the MNL model, each item holds a real number representing the users\u2019 preference for this item, where the larger the number, the more preferred the item. Specifically, each item i holds a parameter \u03b3i \u2208 R such that for any set S containing i, pi,S = exp(\u03b3i)/\u2211j\u2208S exp(\u03b3j). To simplify notation, we let \u03b8i = exp(\u03b3i); hence, pi,S = \u03b8i/\u2211j\u2208S \u03b8j. We name \u03b8i the preference score of item i. We define \u2206i,j := |pi,j \u2212 1/2| and \u2206i := minj\u2260i \u2206i,j, and we have \u02dc\u2206i = \u2206i, i.e., the MNL model satisfies the SST condition.\n\u2074poly(n) means a polynomial function of n, and \u03b4 \u2272 1/poly(n) means \u03b4 \u2272 n^(\u2212p) for some constant p > 0.\n\u2075For example, under a model with \u2206i,j = \u2206 for any i \u2260 j, where \u2206 > 0 is unknown, one may first estimate a lower bound on \u2206 and then run the algorithms in [14], yielding a sample complexity lower than Theorem 2.\n\nTheorem 3. 
[Lower bound for the MNL model] Let \u03b4 \u2208 (0, 1/12), and given a \u03b4-correct algorithm A with the knowledge that the input instances satisfy the MNL model, let NA be the number of comparisons conducted by A. Then E[NA] is lower bounded by Eq. (1) with a different hidden constant factor. When \u03b4 \u2272 1/poly(n) or maxi,j\u2208[n]{\u2206i/\u2206j} \u2272 n^(1/2\u2212p) for some constant p > 0, the sample complexity is lower bounded by Eq. (2) with a different hidden constant factor.\n\nProof sketch. We prove this theorem by Lemmas 4, 5, and 6, which could be of independent interest. Suppose that there are two coins with unknown head probabilities (the probability that a toss produces a head) \u03bb and \u00b5, respectively, and we want to find the more biased one (i.e., the one with the larger head probability). Lemma 4 states a lower bound on the number of heads or tails generated for finding the more biased coin, which holds even if \u03bb and \u00b5 go to 0. This is in contrast to the lower bounds on the number of tosses given by previous works [18, 21, 25], which go to infinity as \u03bb and \u00b5 go to 0.\nLemma 4 (Lower bound on number of heads). Let \u03bb + \u00b5 \u2264 1, \u2206 := |\u03bb/(\u03bb + \u00b5) \u2212 1/2|, and \u03b4 \u2208 (0, 1/2) be given. To find the more biased coin with probability 1 \u2212 \u03b4, any \u03b4-correct algorithm for this problem produces \u2126(\u2206\u207b\u00b2(log log \u2206\u207b\u00b9 + log \u03b4\u207b\u00b9)) heads in expectation.\n\nNow we consider n coins C1, C2, ..., Cn with mean rewards \u00b51, \u00b52, ..., \u00b5n, respectively, where for any i \u2208 [n], \u03b8i/\u00b5i = c for some constant c > 0. Define the gaps of the coins as \u2206\u1d9ci,j := |\u00b5i/(\u00b5i + \u00b5j) \u2212 1/2| and \u2206\u1d9ci := minj\u2260i \u2206\u1d9ci,j. We can check that for all i and j, \u2206\u1d9ci,j = \u2206i,j and \u2206i = \u02dc\u2206i = \u2206\u1d9ci.\nLemma 5 (Lower bound for arranging coins). For \u03b4 < 1/12, to arrange these coins in ascending order of head probabilities, the number of heads generated by any \u03b4-correct algorithm is lower bounded by Eq. (1) with a (possibly) different hidden constant factor.\n\nThe next lemma shows that any algorithm that solves a ranking problem under the MNL model can be transformed to solve the pure-exploration multi-armed bandit (PEMAB) problem with Bernoulli rewards (e.g., [18, 20, 30]). Previous works [1, 15, 16] have shown that certain types of pairwise ranking problems (e.g., Borda-score ranking) can also be transformed to PEMAB problems. In this paper, we make the reverse connection that bridges these two classes of problems, which may be of independent interest. We note that in our prior work [29], we proved a similar result.\nLemma 6 (Reducing PEMAB problems to ranking). If there is a \u03b4-correct algorithm that correctly ranks [n] with probability 1 \u2212 \u03b4 using M expected comparisons, then we can construct another \u03b4-correct algorithm that correctly arranges the coins C1, C2, ..., Cn in ascending order of head probabilities with probability 1 \u2212 \u03b4 and produces M heads in expectation.\n\nThe theorem follows from Lemmas 5 and 6. A full proof can be found in the supplementary material.\n\n3.3 Discussions on listwise ranking\n\nA listwise comparison compares m (m > 2) items and returns a noisy result about the most preferred item. It is an interesting question whether exact ranking from listwise comparisons requires fewer comparisons. The answer is \u201cit depends.\u201d When every comparison returns the most preferred item with high probability (w.h.p.)\u2076, then, by conducting m-wise comparisons, the number of comparisons needed for exact ranking is \u0398(n log_m n), i.e., there is a log m reduction, which is stated in Proposition 7. 
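For intuition on the log m reduction, the change of base is worth spelling out (our elaboration, not from the paper):

```latex
n \log_m n \;=\; \frac{n \log n}{\log m},
\qquad\text{e.g.,}\qquad
m = n^{c} \;(0 < c \le 1) \;\Rightarrow\; n \log_m n = \Theta(n/c).
```

With m = 2 this recovers the classical \Theta(n \log n) sorting complexity, so larger comparison sets buy exactly a factor of \log m when the noise is negligible.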
The proof can be found in the supplementary material.\nProposition 7 (Listwise ranking with negligible noise). If all comparisons are correct w.h.p., then to exactly rank n items w.h.p. using m-wise comparisons, \u0398(n log_m n) comparisons are needed.\nIn general, when the \u201cw.h.p. condition\u201d is violated, listwise ranking does not necessarily require fewer comparisons than pairwise ranking (in the order sense). Here, we give an example. For more general models, it remains an open problem to identify the theoretical limits, which is left for future studies.\nTheorem 8. Under the MNL model, given n items with preference scores \u03b81, \u03b82, ..., \u03b8n, \u2206i,j := |\u03b8i/(\u03b8i + \u03b8j) \u2212 1/2|, and \u02dc\u2206i = \u2206i := minj\u2260i \u2206i,j, to correctly rank these n items with probability 1 \u2212 \u03b4, even with m-wise comparisons for all m \u2208 {2, 3, ..., n}, the lower bound is the same as for pairwise ranking (i.e., Theorem 3) with (possibly) different hidden constant factors.\n\u2076In this paper, \u201cw.h.p.\u201d means with probability at least 1 \u2212 n^(\u2212p), where p > 0 is a sufficiently large constant.\nTheorem 8 gives a minimax lower bound for listwise ranking, which is the same as for pairwise ranking. The proof is given in the supplementary material. In [5], the authors have shown that for top-k item selection under the MNL model, listwise comparisons can reduce the number of comparisons needed compared with pairwise comparisons. However, for exact ranking, listwise comparisons cannot.\n\n4 Algorithm and the upper bound for pairwise ranking\n\nIn this section, we establish a (nearly) sample-complexity-optimal \u03b4-correct algorithm for exact ranking, where whether the word \u201cnearly\u201d can be deleted depends on the structure of the instances. The algorithm is based on the Binary Search algorithm proposed in [14], which has upper bound O(n\u2206min\u207b\u00b2 log(n/\u03b4)), where \u2206min := mini\u2260j \u2206i,j. 
Binary Search has two limitations: (i) it requires knowledge of \u2206min a priori to run, and (ii) it does not utilize the unequal noise levels.\nIn this paper, we propose a technique named attempting with error prevention, and we establish a corresponding insertion subroutine that attempts to insert an item i into a sorted list with a guessed \u2206i value, while preventing errors from happening if the guess is not well chosen. If the guess is small enough, this subroutine correctly inserts the item with a large probability, and if not, this subroutine will, with a large probability, not insert the item into a wrong position. By attempting to insert item i with diminishing guesses of \u2206i, this subroutine finally inserts item i correctly with a large confidence.\nTo implement the technique \u201cattempting with error prevention\u201d, we first construct a useful subroutine called Attempting-Comparison (ATC), which attempts to rank two items with \u03b5, a guess of \u2206i,j. Then, based on ATC, we establish Attempting-Insertion (ATI), which also adopts this technique.\n\nSubroutine 1 Attempting-Comparison(i, j, \u03b5, \u03b4) (ATC)\nInitialize: \u2200t, let bt = \u221a((1/(2t)) log(\u03c0\u00b2t\u00b2/(3\u03b4))); bmax \u2190 \u2308(1/(2\u03b5\u00b2)) log(2/\u03b4)\u2309; wi \u2190 0;\n1: for t \u2190 1 to bmax do\n2: Compare i and j once; update wi \u2190 wi + 1 if i wins; update \u02c6p\u1d57i \u2190 wi/t;\n3: if \u02c6p\u1d57i > 1/2 + bt then return i;\n4: if \u02c6p\u1d57i < 1/2 \u2212 bt then return j;\n5: end for\n6: return i if \u02c6p\u1d57i > 1/2; return j if \u02c6p\u1d57i < 1/2; and return a random item if \u02c6p\u1d57i = 1/2;\n\nLemma 9 (Theoretical performance of ATC). ATC terminates after at most bmax = O(\u03b5\u207b\u00b2 log(1/\u03b4)) comparisons and returns the more preferred item with probability at least 1/2. 
Further, if \u03b5 \u2264 \u2206i,j, then ATC returns the more preferred item with probability at least 1 \u2212 \u03b4.\n\nNext, to establish the insertion subroutine ATI, we introduce preference interval trees (PITs) [14]. A PIT is constructed from a sorted list of items. For a sorted list of items S with size l, without loss of generality, we assume that r1 \u227b r2 \u227b \u00b7\u00b7\u00b7 \u227b rl. We introduce two artificial items \u2212\u221e and +\u221e, where \u2212\u221e is such that pi,\u2212\u221e = 1 for any item i, and +\u221e is such that pi,+\u221e = 0 for any item i.\nPreference Interval Tree [14]. A preference interval tree constructed from the sorted list S satisfies the following conditions: (i) It is a binary tree with depth \u23081 + log2(|S| + 1)\u2309. (ii) Each node u holds an interval (u.left, u.right), where u.left, u.right \u2208 S \u222a {\u2212\u221e, +\u221e}, and if u is non-leaf, it holds an item u.mid satisfying u.right \u227b u.mid \u227b u.left. (iii) An item i is in the interval (j, k) if and only if k \u227b i \u227b j. (iv) The root node has interval (\u2212\u221e, +\u221e). From left to right, the leaf nodes have intervals (\u2212\u221e, rl), (rl, rl\u22121), (rl\u22121, rl\u22122), ..., (r2, r1), (r1, +\u221e). (v) Each non-leaf node u has two children u.lchild and u.rchild such that u.left = u.lchild.left, u.right = u.rchild.right, and u.mid = u.lchild.right = u.rchild.left.\nFigure 1: An example of a PIT, constructed from a sorted list with three items 3 \u227b 2 \u227b 1.\nBased on the notion of the PIT, we present the insertion subroutine ATI in Subroutine 2. ATI runs a random walk on the PIT to insert i into S. Let X be the point that moves on the tree. We say a leaf u0 is correct if item i belongs to (u0.left, u0.right). Define d(X) := the distance (i.e., the number of edges) between X and u0. 
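Before turning to the random walk itself, Subroutine 1 (ATC) can be sketched directly in Python (our sketch; `duel` is an assumed stand-in for one noisy comparison, and the constants follow the pseudocode above):

```python
import math
import random

def atc(duel, i, j, eps, delta, rng=random):
    """Attempting-Comparison: try to rank i and j with gap guess eps.

    duel(i, j) performs one noisy comparison and returns the winner.
    Stop early once the empirical mean leaves the confidence band b_t;
    otherwise stop after b_max comparisons and return the current leader.
    """
    b_max = math.ceil(1.0 / (2 * eps**2) * math.log(2 / delta))
    wins = 0
    for t in range(1, b_max + 1):
        if duel(i, j) == i:
            wins += 1
        p_hat = wins / t
        b_t = math.sqrt(math.log(math.pi**2 * t**2 / (3 * delta)) / (2 * t))
        if p_hat > 0.5 + b_t:
            return i
        if p_hat < 0.5 - b_t:
            return j
    if p_hat > 0.5:
        return i
    if p_hat < 0.5:
        return j
    return rng.choice([i, j])

# Toy instance: item 1 beats item 2 with probability 0.8, so Delta_{1,2} = 0.3.
rng = random.Random(0)
duel = lambda a, b: a if rng.random() < 0.8 else b
winner = atc(duel, 1, 2, eps=0.3, delta=0.05, rng=rng)
```

Because eps here is no larger than the true gap, Lemma 9 applies and the call returns item 1 with probability at least 0.95; with a guess that is too large, ATC may stop early, but it still favours the correct item with probability at least 1/2.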
At each round of the subroutine, if all comparisons give correct results, we say this round is correct; otherwise, we say it is incorrect. For each correct round, either d(X) is decreased by 1 or the counter of u0 is increased by 1. The subroutine inserts i into u0 if u0 is counted 1 + (5/16)tmax times. Thus, after tmax rounds, the subroutine correctly inserts i into S if the number of correct rounds is no less than (21/32)tmax + h/2, where h = \u23081 + log2(|S| + 1)\u2309 is the depth of the tree. If the guess satisfies \u03b5 \u2264 \u2206i, then each round is correct with probability at least q, making the subroutine correctly insert item i with probability at least 1 \u2212 \u03b4.\nFor all \u03b5 > 0, each round is incorrect with probability at most 1/2, and thus, by concentration inequalities, we can also show that with probability at least 1 \u2212 \u03b4, i will not be placed into any leaf node other than u0. That is, if \u03b5 > \u2206i, the subroutine either correctly inserts i or returns unsure with probability at least 1 \u2212 \u03b4. The choice of parameters guarantees the sample complexity. 
Lemma 10 states its theoretical performance, and the proof is relegated to the supplementary material.\n\nSubroutine 2 Attempting-Insertion(i, S, \u03b5, \u03b4) (ATI).\nInitialize: Let T be a PIT constructed from S; h \u2190 \u23081 + log2(1 + |S|)\u2309, the depth of T; for all leaf nodes u of T, initialize cu \u2190 0; set tmax \u2190 \u2308max{4h, (512/25) log(2/\u03b4)}\u2309 and q \u2190 15/16;\n1: X \u2190 the root node of T;\n2: for t \u2190 1 to tmax do\n3: if X is the root node then\n4: if ATC(i, X.mid, \u03b5, 1 \u2212 q) = i then X \u2190 X.rchild; #i.e., ATC returns i \u227b X.mid\n5: else X \u2190 X.lchild;\n6: else if X is a leaf node then\n7: if ATC(i, X.left, \u03b5, 1 \u2212 \u221aq) = i \u2227 ATC(i, X.right, \u03b5, 1 \u2212 \u221aq) = X.right then\n8: cX \u2190 cX + 1;\n9: if cX > bt := (1/2)t + \u221a((t/2) log(\u03c0\u00b2t\u00b2/(3\u03b4))) + 1 then\n10: Insert i into the corresponding interval of X and return inserted;\n11: else if cX > 0 then cX \u2190 cX \u2212 1;\n12: else X \u2190 X.parent;\n13: else\n14: if ATC(i, X.left, \u03b5, 1 \u2212 \u221bq) = X.left \u2228 ATC(i, X.right, \u03b5, 1 \u2212 \u221bq) = i then\n15: X \u2190 X.parent;\n16: else if ATC(i, X.mid, \u03b5, 1 \u2212 \u221bq) = i then X \u2190 X.rchild;\n17: else X \u2190 X.lchild;\n18: end for\n19: if there is a leaf node u with cu \u2265 1 + (5/16)tmax then\n20: Insert i into the corresponding interval of u and return inserted;\n21: else return unsure;\n\nLemma 10 (Theoretical performance of ATI). Let \u03b4 \u2208 (0, 1). ATI returns after O(\u03b5\u207b\u00b2 log(|S|/\u03b4)) comparisons and, with probability at least 1 \u2212 \u03b4, correctly inserts i or returns unsure. Further, if \u03b5 \u2264 \u2206i, it correctly inserts i with probability at least 1 \u2212 \u03b4.\nBy Lemma 10, we can see that the idea \u201cattempting with error prevention\u201d is successfully implemented. 
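To make the random walk concrete, here is a compressed, self-contained Python sketch of the ATI idea (ours, heavily simplified: ATC is replaced by a fixed-budget majority vote, the in-loop counter test against b_t is dropped in favour of the final threshold only, and intervals are tracked by positions in the sorted list):

```python
import random

INF = float("inf")

class Node:
    def __init__(self, lo, hi, parent=None):
        self.lo, self.hi = lo, hi          # leaf slots covered; slot k = interval (S[k-1], S[k])
        self.parent = parent
        self.lchild = self.rchild = None
        self.mid = None                    # boundary index: item S[mid] separates the children
        self.count = 0                     # leaf counter c_u

def build_pit(n_slots, parent=None, lo=0, hi=None):
    """Build a balanced binary tree over the n_slots insertion slots."""
    if hi is None:
        hi = n_slots - 1
    node = Node(lo, hi, parent)
    if lo < hi:
        m = (lo + hi) // 2
        node.mid = m                       # S[m] lies between slots m and m+1
        node.lchild = build_pit(n_slots, node, lo, m)
        node.rchild = build_pit(n_slots, node, m + 1, hi)
    return node

def ati(item, S, beats, t_max, votes=15):
    """Random walk on the PIT of sorted list S (ascending preference).
    beats(a, b) is one noisy comparison returning True if a wins.
    Returns the slot where `item` belongs, or None ("unsure")."""
    def endpoint(k):                       # boundary item, with -inf/+inf sentinels
        return -INF if k < 0 else (INF if k >= len(S) else S[k])
    def wins(a, b):                        # majority of `votes` duels: does a beat b?
        if a == -INF or b == INF:
            return False
        if b == -INF or a == INF:
            return True
        return sum(beats(a, b) for _ in range(votes)) * 2 > votes
    root = build_pit(len(S) + 1)
    leaves = {}
    X = root
    for _ in range(t_max):
        if X.lo == X.hi:                   # at a leaf slot k = (S[k-1], S[k])
            k = X.lo
            if wins(item, endpoint(k - 1)) and wins(endpoint(k), item):
                X.count += 1
                leaves[k] = X
            elif X.count > 0:
                X.count -= 1
            else:
                X = X.parent or X
        elif X.parent is not None and (wins(endpoint(X.lo - 1), item)
                                       or wins(item, endpoint(X.hi))):
            X = X.parent                   # item lies outside X's interval
        elif wins(item, S[X.mid]):
            X = X.rchild
        else:
            X = X.lchild
    best = max(leaves, key=lambda k: leaves[k].count, default=None)
    if best is not None and leaves[best].count >= 1 + (5 / 16) * t_max:
        return best
    return None                            # unsure

# Toy instance: true ascending order 1 < 2 < 3 < 4; insert item 3 into [1, 2, 4].
rng = random.Random(1)
beats = lambda a, b: (rng.random() < 0.9) == (a > b)   # each duel correct w.p. 0.9
slot = ati(3, [1, 2, 4], beats, t_max=40)
```

The error-prevention behaviour is visible here: a wrong leaf rarely accumulates counts (each incorrect round tends to decrement it or push the walk back up), so when the comparisons are too noisy for the budget the sketch returns None ("unsure") rather than committing to a wrong position.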
Thus, by repeatedly attempting to insert an item with a diminishing guess ε and proper confidences for the attempts, one can finally correctly insert i with probability 1 − δ. We use this idea to establish the insertion subroutine Iterative-Attempting-Insertion (IAI, Subroutine 3), and then use it to establish the ranking algorithm Iterative-Insertion-Ranking (IIR, Algorithm 4). Their theoretical performances are stated in Lemma 11 and Theorem 12, respectively, and their proofs are given in the supplementary material.

Subroutine 3 Iterative-Attempting-Insertion (IAI).
Input parameters: (i, S, δ);
Initialize: For all τ ∈ Z+, set ετ = 2^(−(τ+1)) and δτ = 6δ/(π²τ²); t ← 0; Flag ← unsure;
1: repeat t ← t + 1;
2:   Flag ← ATI(i, S, εt, δt);
3: until Flag = inserted

Algorithm 4 Iterative-Insertion-Ranking (IIR).
Input: S = [n], and confidence δ > 0;
1: Ans ← the list containing only S[1];
2: for t ← 2 to |S| do
3:   IAI(S[t], Ans, δ/(n − 1));
4: end for
5: return Ans;

Lemma 11 (Theoretical Performance of IAI). With probability at least 1 − δ, IAI correctly inserts i into S, and conducts at most O(∆i⁻²(log log ∆i⁻¹ + log(|S|/δ))) comparisons.

Theorem 12 (Theoretical Performance of IIR). With probability at least 1 − δ, IIR returns the exact ranking of [n] and conducts at most O(∑_{i∈[n]} ∆i⁻²(log log ∆i⁻¹ + log(n/δ))) comparisons.

Remark: We can see that the upper bounds of IIR depend on the values of (∆i, i ∈ [n]) while the lower bounds given in Theorem 2 depend on the values of (∆̃i, i ∈ [n]).
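The two routines just introduced can be sketched compactly. The schedules ετ = 2^(−(τ+1)) and δτ = 6δ/(π²τ²) are those of Subroutine 3 (the δτ sum to exactly δ); `toy_ati` is a hypothetical stand-in for ATI, used only to exercise the retry loop.

```python
import bisect
import math

def iai(item, ranked, delta, ati):
    """Iterative-Attempting-Insertion: retry the attempt with a
    geometrically shrinking guess eps_t and per-attempt confidence
    delta_t = 6*delta/(pi^2 t^2); since sum_t 1/t^2 = pi^2/6, the
    failure probabilities sum to delta.  `ati` returns an insertion
    position or None ("unsure")."""
    t = 0
    while True:
        t += 1
        eps_t = 2.0 ** -(t + 1)
        delta_t = 6.0 * delta / (math.pi ** 2 * t ** 2)
        pos = ati(item, ranked, eps_t, delta_t)
        if pos is not None:
            ranked.insert(pos, item)
            return

def iir(items, delta, ati):
    """Iterative-Insertion-Ranking: insertion sort on top of IAI.  Each
    of the n - 1 insertions runs with confidence 1 - delta/(n - 1), so
    a union bound gives overall confidence 1 - delta."""
    ranked = [items[0]]
    for x in items[1:]:
        iai(x, ranked, delta / (len(items) - 1), ati)
    return ranked

# Hypothetical stand-in for ATI: answers correctly (and noiselessly)
# once the guess eps is small enough, and returns unsure otherwise.
def toy_ati(item, ranked, eps, conf_delta):
    return bisect.bisect_left(ranked, item) if eps <= 0.1 else None
```

For example, `iir([3, 1, 2, 5, 4], 0.05, toy_ati)` returns `[1, 2, 3, 4, 5]`, with each insertion succeeding on its third attempt (ε₃ = 1/16 ≤ 0.1).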
Without SST, it is possible that ∆̃i < ∆i, but if SST holds, then our algorithm is optimal up to a constant factor given δ ≲ 1/poly(n), or max_{i,j∈[n]} ∆̃i/∆̃j ≲ O(n^(1/2−p)) for some constant p > 0. According to [10, 11, 12], ranking without the SST condition can be much harder than ranking with SST, and it remains an open problem whether our upper bound is tight when the SST condition does not hold.

5 Numerical results

In this section, we provide numerical results to demonstrate the efficacy of our proposed IIR algorithm. We compare IIR with: (i) Active-Ranking (AR) [15], which focuses on the Borda-Score model and is not directly comparable to our algorithm. We use it as an example to show that although the Borda-Ranking may coincide with the exact ranking, for finding the exact ranking, the performance of Borda-Score algorithms is not always as good as it is for finding the Borda-Ranking (see footnote 7); (ii) PLPAC-AMPR [35], an algorithm for PAC ranking under the MNL model. By setting the parameter ε = 0, it can find the exact ranking with O((n log n) max_{i∈[n]} ∆i⁻² log(n∆i⁻¹δ⁻¹)) comparisons, higher than our algorithm by at least a log factor; (iii) UCB + Binary Search of [14]. In the Binary Search algorithm of [14], a subroutine that ranks two items with a constant confidence is required. In [14], the value of ∆min = min_{i∈[n]} ∆i is assumed to be known a priori, and the subroutine simply compares the two items Θ(∆min⁻²) times and returns the item that wins more. In this paper, the value of ∆min is not known a priori, and here, we use UCB algorithms such as LUCB [23] to play the role of the required subroutine. The UCB algorithms that we use include Hoeffding-LUCB [17, 23], KL-LUCB [2, 23], and lil'UCB [18].
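As an illustration of how a UCB-style subroutine can rank two items without knowing ∆min, here is a minimal sketch built on an anytime Hoeffding confidence interval; it is our illustrative stand-in, not the exact Hoeffding-LUCB, KL-LUCB, or lil'UCB procedures used in the experiments.

```python
import math

def rank_two(compare, delta, max_rounds=1_000_000):
    """Compare two items repeatedly until an anytime confidence
    interval around the empirical win rate excludes 1/2.  `compare()`
    returns True iff the first item wins one noisy comparison.  No
    knowledge of the gap is needed: the stopping time adapts to it."""
    wins = 0
    for t in range(1, max_rounds + 1):
        wins += compare()
        p_hat = wins / t
        # Hoeffding radius valid for all t simultaneously: the
        # per-round failure probabilities 6*delta/(pi^2 t^2) sum to delta.
        radius = math.sqrt(math.log(math.pi ** 2 * t ** 2 / (3 * delta)) / (2 * t))
        if p_hat - radius > 0.5:
            return "first"
        if p_hat + radius < 0.5:
            return "second"
    return None  # budget exhausted without a decision
```

On a pair with gap ∆, this test stops after roughly O(∆⁻² log(1/(∆δ))) comparisons, which is why no prior knowledge of ∆min is required, in contrast to the fixed Θ(∆min⁻²)-sample subroutine of [14].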
For Hoeffding-LUCB and KL-LUCB, we choose γ = 2. For lil'UCB, we choose ε = 0.01, β = 1, and λ = ((2 + β)/β)² (see footnote 8).

Instances. The experiments are conducted on three different types of instances. To simplify notation, we use r1 ≻ r2 ≻ ··· ≻ rn to denote the true ranking, and let ∆ = 0.1. (i) Type-Homo: for any ri ≻ rj, p_{ri,rj} = 1/2 + ∆. (ii) Type-MNL: the preference score of ri (i.e., θ_{ri}) is generated by taking an independent sample from Uniform([0.9 · 1.5^(n−i), 1.1 · 1.5^(n−i)]). By this, for any i, ∆i is around 0.1. (iii) Type-Random: for any ri ≻ rj, p_{ri,rj} is generated by taking an independent sample from Uniform([0.5 + 0.8∆, 0.5 + 1.5∆]). By this, for any i, ∆i is around 0.1. We let the ∆i's be close to 0.1 in order to decrease the influence of the ∆i's on the sample complexities and to show how the sample complexities of the algorithms grow with n.

The numerical results for these three types are presented in Figure 2 (a)-(c), respectively. For all simulations, we input δ = 0.01. Every point of every figure is averaged over 100 independent trials. In every figure, for the same n-value, the algorithms are tested on an identical input instance.

Figure 2: Comparisons between IIR and existing methods on (a) Type-Homo, (b) Type-MNL, and (c) Type-Random instances.

From Figure 2, we can see that our algorithm significantly outperforms the existing algorithms. We can also see that the sample complexity of IIR scales as n log n, which is consistent with our theoretical results. There are some insights about the practical performance of IIR. First, in Lines 3 and 4 of ATC and Lines 9 and 10 of ATI, we use LUCB-like [23] designs that allow the algorithms to return before completing all required iterations, which does not improve the theoretical upper bound but does improve the practical performance. Second, in the theoretical analysis, we only show that ATI correctly inserts an item i with high probability when the input satisfies ε ≤ ∆i, but the algorithm may return before ε becomes that small, making the practical performance better than what the theoretical upper bound suggests.

Footnote 7: For instance, when p_{ri,rj} = 1/2 + ∆ for all i < j, the Borda-Score of item ri is (1/(n−1)) ∑_{j≠i} p_{ri,rj} = 1/2 + ((n + 1 − 2i)/(n − 1))∆, and ∆_{ri} = Θ(1/n). Thus, by [15], the sample complexity of AR is at least Ω(n³ log n).

Footnote 8: We do not choose the combination (ε = 0, β = 1, and λ = 1 + 10/n), which has better practical performance, because this combination does not have a theoretical guarantee, making the comparison in some sense unfair.

6 Conclusion

In this paper, we investigated the theoretical limits of exact ranking with minimal assumptions. We did not assume any prior knowledge of the comparison probabilities or gaps, and derived lower and upper bounds for instances with unequal noise levels. We also derived model-specific pairwise and listwise lower bounds for the MNL model, which further show that in the worst case, listwise ranking is no more efficient than pairwise ranking in terms of sample complexity. The Iterative-Insertion-Ranking (IIR) algorithm proposed in this paper indicates that our lower bounds are tight under strong stochastic transitivity (SST) and some mild conditions.
Numerical results also suggest that our ranking algorithm outperforms existing works in the literature.

Acknowledgments

This work has been supported in part by NSF grants ECCS-1818791, CCF-1758736, CNS-1758757, CNS-1446582, CNS-1901057; ONR grant N00014-17-1-2417; AFRL grant FA8750-18-1-0107; and by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (2017-0-00692, Transport-aware Streaming Technique Enabling Ultra Low-Latency AR/VR Services).

References

[1] Agarwal, A., Agarwal, S., Assadi, S., and Khanna, S. (2017). Learning with limited rounds of adaptivity: Coin tossing, multi-armed bandits, and ranking from pairwise comparisons. In Conference on Learning Theory.

[2] Arratia, R. and Gordon, L. (1989). Tutorial on large deviations for the binomial distribution. Bulletin of Mathematical Biology.

[3] Baltrunas, L., Makcinskas, T., and Ricci, F. (2010). Group recommendations with rank aggregation and collaborative filtering. In Proceedings of the Fourth ACM Conference on Recommender Systems. ACM.

[4] Chen, X., Bennett, P. N., Collins-Thompson, K., and Horvitz, E. (2013). Pairwise ranking aggregation in a crowdsourced setting. In ACM Conference on Web Search and Data Mining. ACM.

[5] Chen, X., Li, Y., and Mao, J. (2018). A nearly instance optimal algorithm for top-k ranking under the multinomial logit model. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM.

[6] Chen, Y., Fan, J., Ma, C., and Wang, K. (2017). Spectral method and regularized MLE are both optimal for top-k ranking. stat.

[7] Chen, Y. and Suh, C. (2015). Spectral MLE: Top-k rank aggregation from pairwise comparisons. In International Conference on Machine Learning.

[8] Conitzer, V. and Sandholm, T. (2005). Communication complexity of common voting rules. In Proceedings of the 6th ACM Conference on Electronic Commerce. ACM.

[9] Dwork, C., Kumar, R., Naor, M., and Sivakumar, D. (2001). Rank aggregation methods for the web. In Proceedings of the 10th International Conference on World Wide Web. ACM.

[10] Falahatgar, M., Hao, Y., Orlitsky, A., Pichapati, V., and Ravindrakumar, V. (2017a). Maxing and ranking with few assumptions. In Advances in Neural Information Processing Systems.

[11] Falahatgar, M., Jain, A., Orlitsky, A., Pichapati, V., and Ravindrakumar, V. (2018). The limits of maxing, ranking, and preference learning. In Proceedings of the 35th International Conference on Machine Learning. PMLR.

[12] Falahatgar, M., Orlitsky, A., Pichapati, V., and Suresh, A. T. (2017b). Maximum selection and ranking under noisy comparisons. In International Conference on Machine Learning.

[13] Farrell, R. H. (1964). Asymptotic behavior of expected sample size in certain one sided tests. The Annals of Mathematical Statistics.

[14] Feige, U., Raghavan, P., Peleg, D., and Upfal, E. (1994). Computing with noisy information. SIAM Journal on Computing.

[15] Heckel, R., Shah, N. B., Ramchandran, K., and Wainwright, M. J. (2016). Active ranking from pairwise comparisons and when parametric assumptions don't help. arXiv preprint arXiv:1606.08842.

[16] Heckel, R., Simchowitz, M., Ramchandran, K., and Wainwright, M. J. (2018). Approximate ranking from pairwise comparisons. In International Conference on Artificial Intelligence and Statistics.

[17] Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association.

[18] Jamieson, K., Malloy, M., Nowak, R., and Bubeck, S. (2014). lil'UCB: An optimal exploration algorithm for multi-armed bandits. In Conference on Learning Theory.

[19] Jang, M., Kim, S., Suh, C., and Oh, S. (2017). Optimal sample complexity of m-wise data for top-k ranking. In Advances in Neural Information Processing Systems.

[20] Kalyanakrishnan, S. and Stone, P. (2010). Efficient selection of multiple bandit arms: Theory and practice. In International Conference on Machine Learning.

[21] Kalyanakrishnan, S., Tewari, A., Auer, P., and Stone, P. (2012). PAC subset selection in stochastic multi-armed bandits. In ICML.

[22] Katariya, S., Jain, L., Sengupta, N., Evans, J., and Nowak, R. (2018). Adaptive sampling for coarse ranking. In International Conference on Artificial Intelligence and Statistics.

[23] Kaufmann, E. and Kalyanakrishnan, S. (2013). Information complexity in bandit subset selection. In Conference on Learning Theory.

[24] Luce, R. D. (2012). Individual choice behavior: A theoretical analysis. Courier Corporation.

[25] Mannor, S. and Tsitsiklis, J. N. (2004). The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research.

[26] Mohajer, S. and Suh, C. (2016). Active top-k ranking from noisy comparisons. In Communication, Control, and Computing (Allerton), 2016 54th Annual Allerton Conference on. IEEE.

[27] Negahban, S., Oh, S., and Shah, D. (2016). Rank centrality: Ranking from pairwise comparisons. Operations Research.

[28] Pfeiffer, T., Gao, X. A., Mao, A., Chen, Y., and Rand, D. G. (2012). Adaptive polling for information aggregation. In Twenty-Sixth AAAI Conference on Artificial Intelligence.

[29] Ren, W., Liu, J., and Shroff, N. B. (2018). PAC ranking from pairwise and listwise queries: Lower bounds and upper bounds. arXiv preprint arXiv:1806.02970.

[30] Ren, W., Liu, J., and Shroff, N. B. (2019). Exploring k out of top ρ fraction of arms in stochastic bandits. In The 22nd International Conference on Artificial Intelligence and Statistics.

[31] Saha, A. and Gopalan, A. (2019a). Active ranking with subset-wise preferences. In Proceedings of Machine Learning Research. PMLR.

[32] Saha, A. and Gopalan, A. (2019b). From PAC to instance-optimal sample complexity in the Plackett-Luce model. arXiv preprint arXiv:1903.00558.

[33] Shah, N., Balakrishnan, S., Guntuboyina, A., and Wainwright, M. (2016). Stochastically transitive models for pairwise comparisons: Statistical and computational issues. In International Conference on Machine Learning.

[34] Shah, N. B. and Wainwright, M. J. (2017). Simple, robust and optimal ranking from pairwise comparisons. Journal of Machine Learning Research.

[35] Szörényi, B., Busa-Fekete, R., Paul, A., and Hüllermeier, E. (2015). Online rank elicitation for Plackett-Luce: A dueling bandits approach. In Advances in Neural Information Processing Systems.