{"title": "Computational and Statistical Tradeoffs in Learning to Rank", "book": "Advances in Neural Information Processing Systems", "page_first": 739, "page_last": 747, "abstract": "For massive and heterogeneous modern data sets, it is of fundamental interest to provide guarantees on the accuracy of estimation when computational resources are limited. In the application of learning to rank, we provide a hierarchy of rank-breaking mechanisms ordered by the complexity in thus generated sketch of the data. This allows the number of data points collected to be gracefully traded off against computational resources available, while guaranteeing the desired level of accuracy. Theoretical guarantees on the proposed generalized rank-breaking implicitly provide such trade-offs, which can be explicitly characterized under certain canonical scenarios on the structure of the data.", "full_text": "Computational and Statistical Tradeoffs in Learning to Rank\n\nAshish Khetan and Sewoong Oh\nDepartment of ISE, University of Illinois at Urbana-Champaign\nEmail: {khetan2,swoh}@illinois.edu\n\nAbstract\n\nFor massive and heterogeneous modern data sets, it is of fundamental interest to provide guarantees on the accuracy of estimation when computational resources are limited. In the application of learning to rank, we provide a hierarchy of rank-breaking mechanisms ordered by the complexity in thus generated sketch of the data. This allows the number of data points collected to be gracefully traded off against computational resources available, while guaranteeing the desired level of accuracy. Theoretical guarantees on the proposed generalized rank-breaking implicitly provide such trade-offs, which can be explicitly characterized under certain canonical scenarios on the structure of the data.\n\n1 Introduction\n\nIn classical statistical inference, we are typically interested in characterizing how more data points improve the accuracy, with little restrictions or considerations on computational aspects of solving the inference problem. However, with massive growths of the amount of data available and also the complexity and heterogeneity of the collected data, computational
resources, such as time and memory, are major bottlenecks in many modern applications. As a solution, recent advances in [7, 23, 8, 1, 16] introduce hierarchies of algorithmic solutions, ordered by the respective computational complexity, for several fundamental machine learning applications. Guided by sharp analyses on the sample complexity, these approaches provide theoretically sound guidelines that allow the analyst the flexibility to fall back to simpler algorithms to enjoy the full merit of the improved run-time.\n\nInspired by these advances, we study the time-data tradeoff in learning to rank. In many applications such as election, policy making, polling, and recommendation systems, we want to aggregate individual preferences to produce a global ranking that best represents the collective social preference. Learning to rank is a rank aggregation approach, which assumes that the data comes from a parametric family of choice models, and learns the parameters that determine the global ranking. Traditionally, each revealed preference is assumed to have one of the following three structures. Pairwise comparison, where one item is preferred over another, is common in sports and chess matches. Best-out-of-$\\kappa$ comparison, where one is chosen among a set of $\\kappa$ alternatives, is common in historical purchase data. $\\kappa$-way comparison, where we observe a linear ordering of a set of $\\kappa$ candidates, is used in some elections and surveys.\n\nFor such traditional preferences, efficient schemes for learning to rank have been proposed, e.g. [12, 9]. However, modern data sets are unstructured and heterogeneous. This can lead to a significant increase in the computational complexity, requiring exponential run-time in the size of the problem in the worst case [15].\n\nTo alleviate this computational challenge, we propose a hierarchy of estimators which we call generalized rank-breaking, ordered in increasing computational complexity and achieving increasing accuracy. The key idea is to break down the heterogeneous revealed preferences into simpler pieces of ordinal relations, and apply an estimator tailored for those simple structures treating each piece as independent. Several aspects of rank-breaking make this problem interesting and challenging. A priori, it is not clear which choices of the simple ordinal relations are rich enough to be statistically efficient and yet lead to tractable estimators. Even if we identify which ordinal relations to extract, the ignored correlations among those pieces can lead to an inconsistent estimate, unless we choose carefully which pieces to include and which to omit in the estimation. We further want sharp analysis on the sample complexity, which reveals how computational and statistical efficiencies trade off. We would like to address all these challenges in providing generalized rank-breaking methods.\n\n30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.\n\nProblem formulation. We study the problem of aggregating ordinal data based on users\u2019 preferences that are expressed in the form of partially ordered sets (posets). A poset is a collection of ordinal relations among items. For example, consider a poset $\\{(i_6 \\prec \\{i_5, i_4\\}), (i_5 \\prec i_3), (\\{i_3, i_4\\} \\prec \\{i_1, i_2\\})\\}$ over items $\\{i_1, \\ldots, i_6\\}$, where $(i_6 \\prec \\{i_5, i_4\\})$ indicates that items $i_5$ and $i_4$ are both preferred over item $i_6$. Such a relation is extracted from, for example, the user giving a 2-star rating to $i_5$ and $i_4$ and a 1-star rating to $i_6$. Assuming that the revealed preference is consistent, a poset can be represented as a directed acyclic graph (DAG) $G_j$ as below.\n\n[Figure 1: An example of $G_j$ for user $j$\u2019s consistent poset, and two rank-breaking hyperedges extracted from it: $e_1 = (\\{i_6, i_5, i_4, i_3\\} \\prec \\{i_2, i_1\\})$ and $e_2 = (\\{i_6\\} \\prec \\{i_5, i_4, i_3\\})$.]\n\nWe assume that each user $j$ is presented with a subset of items $S_j$, and independently provides her ordinal preference in the form of a poset, where the ordering is drawn from the Plackett-Luce (PL) model. The PL model is a popular choice model from operations research and psychology, used to model how people make choices under uncertainty. It is a special case of random utility models, where each item $i$ is parametrized by a latent true utility $\\theta_i \\in \\mathbb{R}$. When offered $S_j$, the user samples the perceived utility $U_i$ for each item independently according to $U_i = \\theta_i + Z_i$, where the $Z_i$\u2019s are i.i.d. noise. In particular, the PL model assumes the $Z_i$\u2019s follow the standard Gumbel distribution. Although the statistical and computational tradeoff has been studied under Mallows models [6] or stochastically transitive models [22], the techniques we develop are different and have the potential to generalize to a more general class of random utility models. The observed poset is a partial observation of the ordering according to these perceived utilities. The particular choice of the Gumbel distribution has several merits, largely stemming from the fact that the Gumbel distribution has a log-concave pdf and is inherently memoryless. In our analyses, we use the log-concavity to show that our proposed algorithm is a concave maximization (Remark 2.1), and the memoryless property forms the basis of our rank-breaking idea. Precisely, the PL model is statistically equivalent to the following procedure. Consider a ranking as a mapping from a rank to an item, i.e. $\\sigma_j : [|S_j|] \\to S_j$. It can be shown that the PL model is generated by first independently assigning each item $i \\in S_j$ an unobserved value $Y_i$, exponentially distributed with mean $e^{-\\theta_i}$, and the resulting ranking $\\sigma_j$ is inversely ordered in the $Y_i$\u2019s so that $Y_{\\sigma_j(1)} \\le Y_{\\sigma_j(2)} \\le \\cdots \\le Y_{\\sigma_j(|S_j|)}$. This inherits the memoryless property of exponential variables, such that $P(Y_1 < Y_2 < Y_3) = P(Y_1 < \\{Y_2, Y_3\\})\\, P(Y_2 < Y_3)$, leading to a simple interpretation of the PL model as sequential choices: $P(i_3 \\prec i_2 \\prec i_1) = P(\\{i_3, i_2\\} \\prec i_1)\\, P(i_3 \\prec i_2) = \\frac{e^{\\theta_{i_1}}}{e^{\\theta_{i_1}} + e^{\\theta_{i_2}} + e^{\\theta_{i_3}}} \\times \\frac{e^{\\theta_{i_2}}}{e^{\\theta_{i_2}} + e^{\\theta_{i_3}}}$. In general, we have $P[\\sigma_j] = \\prod_{i=1}^{|S_j|-1} \\frac{e^{\\theta^*_{\\sigma_j(i)}}}{\\sum_{i'=i}^{|S_j|} e^{\\theta^*_{\\sigma_j(i')}}}$. We assume that the true utility $\\theta^* \\in \\Omega_b$ where $\\Omega_b = \\{\\theta \\in \\mathbb{R}^d \\mid \\sum_{i \\in [d]} \\theta_i = 0, |\\theta_i| \\le b \\text{ for all } i \\in [d]\\}$. Notice that centering of $\\theta$ ensures its uniqueness as the PL model is invariant under shifting of $\\theta$. The bound $b$ on $\\theta_i$ is written explicitly to capture the dependence in our main results.\n\nWe denote a set of $n$ users by $[n] = \\{1, \\ldots, n\\}$ and the set of $d$ items by $[d]$. Let $G_j$ denote the DAG representation of the poset provided by the user $j$ over $S_j \\subseteq [d]$ according to the PL model with weights $\\theta^*$. The maximum likelihood estimate (MLE) maximizes, for each $j$, the sum of the probabilities of all possible rankings that are consistent with the observed $G_j$: $$\\hat{\\theta} \\in \\arg\\max_{\\theta \\in \\Omega_b} \\Big\\{ \\sum_{j=1}^{n} \\log \\Big( \\sum_{\\sigma \\in G_j} P_\\theta[\\sigma] \\Big) \\Big\\}, \\qquad (1)$$ where we slightly abuse the notation $G_j$ to denote the set of all rankings $\\sigma$ that are consistent with the observation. When $G_j$ has a traditional structure as explained earlier in this section, then the optimization is a simple multinomial logit regression, that can be solved efficiently with off-the-shelf convex optimization tools [12]. For general posets, it can be shown that the above optimization is a concave maximization, using similar techniques as Remark 2.1. However, the summation over rankings in $G_j$ can involve a number of terms super-exponential in the size $|S_j|$, in the worst case. This renders the MLE intractable and impractical.\n\nPairwise rank-breaking. A common remedy to this computational blow-up is to use rank-breaking. Rank-breaking traditionally refers to pairwise rank-breaking, where a bag of all the pairwise comparisons is extracted from observations $\\{G_j\\}_{j \\in [n]}$ and is applied to estimators that are tailored for pairwise comparisons, treating each paired outcome as independent. This is one of the motivations behind the algorithmic advances in learning from pairwise comparisons [19, 21, 17].\n\nIt is computationally efficient to apply the maximum likelihood estimator assuming independent pairwise comparisons, which takes $O(d^2)$ operations to evaluate. However, this computational gain comes at the cost of statistical efficiency. It is known from [4] that if we include all paired comparisons, then the resulting estimate can be statistically inconsistent due to the ignored correlations among the paired orderings, even with infinite samples. In the example from Figure 1, there are 12 paired relations: $(i_6 \\prec i_5), (i_6 \\prec i_4), (i_6 \\prec i_3), \\ldots, (i_3 \\prec i_1), (i_4 \\prec i_1)$. In order to get a consistent estimate, [4] provides a rule for choosing which pairs to include, and [15] provides an estimator that optimizes how to weigh each of those chosen pairs to get the best finite sample complexity bound. However, such a consistent pairwise rank-breaking results in throwing away many of the ordered relations, resulting in significant loss in accuracy. For example, none of the pairwise orderings can be used from $G_j$ in the example, without making the estimator inconsistent [3]. Whether we include all paired comparisons or only a subset of consistent ones, there is a significant loss in accuracy as illustrated in Figure 2. For the precise condition for consistent rank-breaking we refer to [3, 4, 15].\n\nThe state-of-the-art approaches operate on either one of the two extreme points on the computational and statistical trade-off. The MLE in (1) requires $O(\\sum_{j \\in [n]} |S_j|!)$ summations to just evaluate the objective function, in the worst case. On the other hand, the pairwise rank-breaking requires only $O(d^2)$ summations, but suffers from significant loss in the sample complexity. Ideally, we would like to give the analyst the flexibility to choose a target computational complexity she is willing to tolerate, and provide an algorithm that achieves the optimal trade-off at any operating point.\n\nContribution. We introduce a novel generalized rank-breaking that bridges the gap between MLE and pairwise rank-breaking. Our approach allows the user the freedom to choose the level of computational resources to be used, and provides an estimator tailored for the desired complexity. We prove that the proposed estimator is tractable and consistent, and provide an upper bound on the error rate in the finite sample regime. The analysis explicitly characterizes the dependence on the topology of the data. This in turn provides a guideline for designing surveys and experiments in practice, in order to maximize the sample efficiency. We provide numerical experiments confirming the theoretical guarantees.\n\n2 Generalized rank-breaking\n\nGiven $G_j$\u2019s representing the users\u2019 preferences, generalized rank-breaking extracts a set of ordered relations and applies an estimator treating each ordered relation as independent. Concretely, for each $G_j$, we first extract a maximal ordered partition $P_j$ of $S_j$ that is consistent with $G_j$. An ordered partition is a partition with a linear ordering among the subsets, e.g. $P_j = (\\{i_6\\} \\prec \\{i_5, i_4, i_3\\} \\prec \\{i_2, i_1\\})$ for $G_j$ from Figure 1. This is maximal, since we cannot further partition any of the subsets without creating artificial ordered relations that are not present in the original $G_j$.\n\nThe extracted ordered partition is represented by a directed hypergraph $G_j(S_j, E_j)$, which we call a rank-breaking graph. Each edge $e = (B(e), T(e)) \\in E_j$ is a directed hyperedge from a
subset of nodes $B(e) \\subseteq S_j$ to another subset $T(e) \\subseteq S_j$. The number of edges in $E_j$ is $|P_j| - 1$ where $|P_j|$ is the number of subsets in the partition. For each subset in $P_j$ except for the least preferred subset, there is a corresponding edge whose top-set $T(e)$ is the subset, and the bottom-set $B(e)$ is the set of all items less preferred than $T(e)$. In Figure 1, for $E_j = \\{e_1, e_2\\}$ we show $e_1 = (B(e_1), T(e_1)) = (\\{i_6, i_5, i_4, i_3\\}, \\{i_2, i_1\\})$ and $e_2 = (B(e_2), T(e_2)) = (\\{i_6\\}, \\{i_5, i_4, i_3\\})$ extracted from $G_j$.\n\nDenote the probability that $T(e)$ is preferred over $B(e)$ when $T(e) \\cup B(e)$ is offered as $$P_\\theta(e) = P_\\theta\\big(B(e) \\prec T(e)\\big) = \\sum_{\\sigma \\in \\Lambda_{T(e)}} \\frac{\\exp\\big(\\sum_{c=1}^{|T(e)|} \\theta_{\\sigma(c)}\\big)}{\\prod_{u=1}^{|T(e)|} \\big(\\sum_{c'=u}^{|T(e)|} \\exp(\\theta_{\\sigma(c')}) + \\sum_{i \\in B(e)} \\exp(\\theta_i)\\big)} \\qquad (2)$$ which follows from the definition of the PL model, where $\\Lambda_{T(e)}$ is the set of all rankings over $T(e)$. The computational complexity of evaluating this probability is dominated by the size of the top-set $|T(e)|$, as it involves $|T(e)|!$ summations. We let the analyst choose the order $M \\in \\mathbb{Z}^+$ depending on how much computational resource is available, and only include those edges with $|T(e)| \\le M$ in the following step. We apply the MLE for comparisons over paired subsets, assuming all rank-breaking graphs are independently drawn. Precisely, we propose the order-$M$ rank-breaking estimate, which is the solution that maximizes the log-likelihood under the independence assumption: $$\\hat{\\theta} \\in \\arg\\max_{\\theta \\in \\Omega_b} L_{RB}(\\theta), \\quad \\text{where} \\quad L_{RB}(\\theta) = \\sum_{j \\in [n]} \\sum_{e \\in E_j : |T(e)| \\le M} \\log P_\\theta(e). \\qquad (3)$$ In the special case when $M = 1$, this can be transformed into the traditional pairwise rank-breaking, where (i) this is a concave maximization; (ii) the estimate is (asymptotically) unbiased and consistent [3, 4]; and (iii) the finite sample complexity has been analyzed [15]. Although this order-1 rank-breaking provides a significant gain in computational efficiency, the information contained in higher-order edges is unused, resulting in a significant loss in sample efficiency. We provide the analyst the freedom to choose the computational complexity he/she is willing to tolerate. However, for general $M$, it has not been known if the optimization in (3) is tractable and/or if the solution is consistent. Since $P_\\theta(B(e) \\prec T(e))$ as explicitly written in (2) is a sum of log-concave functions, it is not clear if the sum is also log-concave. Due to the ignored dependency in the formulation (3), it is not clear if the resulting estimate is consistent. We first establish that it is a concave maximization in Remark 2.1, then prove consistency in Remark 2.2, and provide a sharp analysis of the performance in the finite sample regime, characterizing the trade-off between computation and sample size in Section 4. We use the Random Utility Model (RUM) interpretation of the PL model to prove concavity. We refer to Appendix A in the supplementary material for a proof.\n\nRemark 2.1. $L_{RB}(\\theta)$ is concave in $\\theta \\in \\mathbb{R}^d$.\n\nFor consistency, we consider a simple but canonical scenario for sampling ordered relations. However, we study a general sampling scenario when we analyze the order-$M$ estimator in the finite sample regime in Section 4. Following is the canonical sampling scenario. There is a set of $\\tilde{\\ell}$ integers $(\\tilde{m}_1, \\ldots, \\tilde{m}_{\\tilde{\\ell}})$ whose sum is strictly less than $d$. A new arriving user is presented with all $d$ items and is asked to provide her top $\\tilde{m}_1$ items as an unordered set, and then the next $\\tilde{m}_2$ items, and so on. This is sampling from the PL model and observing an ordered partition with $(\\tilde{\\ell} + 1)$ subsets of sizes $\\tilde{m}_a$\u2019s, and the last subset includes all remaining items. We apply the generalized rank-breaking to get rank-breaking graphs $\\{G_j\\}$ with $\\tilde{\\ell}$ edges each, and the order-$M$ estimate is computed. We show that this is consistent, i.e. asymptotically unbiased in the limit of the number of users $n$. A proof is provided in the supplementary material.\n\nRemark 2.2. Under the PL model and the above sampling scenario, the order-$M$ rank-breaking estimate $\\hat{\\theta}$ in (3) is consistent for all choices of $M \\ge \\min_{a \\in [\\tilde{\\ell}]} \\tilde{m}_a$.\n\nFigure 2 (left) illustrates the trade-off between run-time and sample size necessary to achieve a fixed accuracy: $\\mathrm{MSE} \\le 0.3\\, d^2 \\times 10^{-6}$. In the middle panel, we show the accuracy-sample tradeoff for increasing computation $M$ on the same data. We fix $d = 256$, $\\tilde{\\ell} = 5$, $\\tilde{m}_a = a$ for $a \\in \\{1, 2, 3, 4, 5\\}$, and sample posets from the canonical scenario, except that each user
is presented $\\kappa = 32$ random items. The PL weights are chosen i.i.d. $U[-2, 2]$. On the right panel, we let $\\tilde{m}_a = 3$ for all $a \\in [\\tilde{\\ell}]$ and vary $\\tilde{\\ell}$. We compare GRB with $M = 3$ to PRB, and an oracle estimator who knows the exact ordering among those top three items and runs MLE.\n\n[Figure 2: The time-data trade-off for fixed accuracy (left) and accuracy improvement for increased computation $M$ (middle). Generalized Rank-Breaking (GRB) achieves the oracle lower bound and significantly improves upon Pairwise Rank-Breaking (PRB) (right). Panels: left, time (s) versus sample size $n$ for $M = 1, \\ldots, 5$; middle, $C\\|\\hat{\\theta} - \\theta^*\\|_2^2$ versus sample size $n$ for inconsistent PRB and GRB of order $M = 1, 2, 3, 4$; right, $C\\|\\hat{\\theta} - \\theta^*\\|_2^2$ versus number of edges $|E_j|$ for inconsistent PRB, GRB of order $M = 3$, the oracle lower bound, and the CR lower bound.]\n\nNotations. Given rank-breaking graphs $\\{G_j(S_j, E_j)\\}_{j \\in [n]}$ extracted from the posets $\\{G_j\\}$, we first define the order-$M$ rank-breaking graphs $\\{G_j^{(M)}(S_j, E_j^{(M)})\\}$, where $E_j^{(M)}$ is a subset of $E_j$ that includes only those edges $e_j \\in E_j$ with $|T(e_j)| \\le M$. This represents those edges that are included in the estimation for a choice of $M$. For finite sample analysis, the following quantities capture how the error depends on the topology of the data collected. Let $\\kappa_j \\equiv |S_j|$ and $\\ell_j \\equiv |E_j^{(M)}|$. We index each edge $e_j$ in $E_j^{(M)}$ by $a \\in [\\ell_j]$ and define $m_{j,a} \\equiv |T(e_{j,a})|$ for the $a$-th edge of the $j$-th rank-breaking graph and $r_{j,a} \\equiv |T(e_{j,a})| + |B(e_{j,a})|$. Note that we write a tilde over $m_{j,a}$ and $\\ell_j$ when $M$ is equal to $|S_j|$, so that no edge is excluded. That is, $\\tilde{\\ell}_j$ is the number of edges in $E_j$ and $\\tilde{m}_{j,a}$ is the size of the top-sets in those edges. We let $p_j \\equiv \\sum_{a \\in [\\ell_j]} m_{j,a}$ denote the effective sample size for the observation $G_j^{(M)}$, such that the total effective sample size is $\\sum_{j \\in [n]} p_j$. Notice that although we do not explicitly write the dependence on $M$, all of the above quantities implicitly depend on the choice of $M$.\n\n3 Comparison graph\n\nThe analysis of the optimization in (3) shows that, with high probability, $L_{RB}(\\theta)$ is strictly concave with $\\lambda_2(H(\\theta)) \\le -C_b \\gamma_1 \\gamma_2 \\gamma_3 \\lambda_2(L) < 0$ for all $\\theta \\in \\Omega_b$ (Lemma C.3), and the gradient is also bounded with $\\|\\nabla L_{RB}(\\theta^*)\\| \\le C'_b \\gamma_2^{-1/2} (\\sum_j p_j \\log d)^{1/2}$ (Lemma C.2). The quantities $\\gamma_1$, $\\gamma_2$, $\\gamma_3$, and $\\lambda_2(L)$, to be defined shortly, represent the topology of the data. This leads to Theorem 4.1: $$\\|\\hat{\\theta} - \\theta^*\\|_2 \\le \\frac{2 \\|\\nabla L_{RB}(\\theta^*)\\|}{-\\lambda_2(H(\\theta))} \\le C''_b \\frac{\\sqrt{\\sum_j p_j \\log d}}{\\gamma_1 \\gamma_2^{3/2} \\gamma_3 \\lambda_2(L)}, \\qquad (4)$$ where $C_b$, $C'_b$, and $C''_b$ are constants that only depend on $b$, and $\\lambda_2(H(\\theta))$ is the second largest eigenvalue of the negative semidefinite Hessian matrix $H(\\theta)$ of $L_{RB}(\\theta)$. Recall that $\\theta^\\top \\mathbf{1} = 0$ since we restrict our search to $\\Omega_b$. Hence, the error depends on $\\lambda_2(H(\\theta))$ instead of $\\lambda_1(H(\\theta))$, whose corresponding eigenvector is the all-ones vector. We define a comparison graph $H([d], E)$ as a weighted undirected graph with weights $A_{ii'} = \\sum_{j \\in [n] : i, i' \\in S_j} p_j / (\\kappa_j (\\kappa_j - 1))$. The corresponding graph Laplacian is defined as: $$L \\equiv \\sum_{j=1}^{n} \\frac{p_j}{\\kappa_j (\\kappa_j - 1)} \\sum_{i < i' \\in S_j} (e_i - e_{i'})(e_i - e_{i'})^\\top. \\qquad (5)$$ It is immediate that $\\lambda_1(L) = 0$ with $\\mathbf{1}$ as the eigenvector. The remaining $d - 1$ eigenvalues sum to $\\mathrm{Tr}(L) = \\sum_j p_j$. The rescaled $\\lambda_2(L)$ and $\\lambda_d(L)$ capture the dependency on the topology: $$\\alpha \\equiv \\frac{\\lambda_2(L)(d - 1)}{\\mathrm{Tr}(L)}, \\qquad \\beta \\equiv \\frac{\\mathrm{Tr}(L)}{\\lambda_d(L)(d - 1)}. \\qquad (6)$$ In an ideal case where the graph is well connected, the spectral gap of the Laplacian is large. This ensures all eigenvalues are of the same order and $\\alpha = \\beta = \\Theta(1)$, resulting in a smaller error rate. The concavity of $L_{RB}(\\theta)$ also depends on the following quantities. We discuss the role of the topology in Section 4. Note that the quantities defined in this section implicitly depend on the choice of $M$, which controls the necessary computational power, via the definition of the rank-breaking $\\{G_{j,a}\\}$. We define the following quantities that control our upper bound. $\\gamma_1$ incorporates asymmetry in probabilities of items being ranked at different positions depending upon their weight $\\theta^*_i$. It is 1 for $b = 0$, that is when all the items have the same weight, and decreases exponentially with increase in $b$. $\\gamma_2$ controls the range of the size of the top-set with respect to the size of the bottom-set for which the error decays with the rate of 1/(size of
the top-set). The dependence on $\\gamma_3$ and $\\nu$ is due to weakness in the analysis, and ensures that the Hessian matrix is strictly negative definite. $$\\gamma_1 \\equiv \\min_{j,a} \\Big\\{ \\Big( \\frac{r_{j,a} - m_{j,a}}{\\kappa_j} \\Big)^{2 e^{2b} - 2} \\Big\\}, \\qquad \\gamma_2 \\equiv \\min_{j,a} \\Big\\{ \\Big( \\frac{r_{j,a} - m_{j,a}}{r_{j,a}} \\Big)^{2} \\Big\\}, \\quad \\text{and} \\qquad (7)$$ $$\\gamma_3 \\equiv 1 - \\max_{j,a} \\Big\\{ \\frac{4 e^{16b}}{\\gamma_1} \\frac{m_{j,a}^2 r_{j,a}^2 \\kappa_j^2}{(r_{j,a} - m_{j,a})^5} \\Big\\}, \\qquad \\nu \\equiv \\max_{j,a} \\Big\\{ \\frac{m_{j,a} \\kappa_j^2}{(r_{j,a} - m_{j,a})^2} \\Big\\}. \\qquad (8)$$\n\n4 Main Results\n\nWe present main theoretical analyses and numerical simulations confirming the theoretical predictions.\n\n4.1 Upper bound on the achievable error\n\nWe provide an upper bound on the error for the order-$M$ rank-breaking approach, showing the explicit dependence on the topology of the data. We assume each user provides a partial ranking according to his/her ordered partitions. Precisely, we assume that the set of offerings $S_j$, the number of subsets $(\\tilde{\\ell}_j + 1)$, and their respective sizes $(\\tilde{m}_{j,1}, \\ldots, \\tilde{m}_{j,\\tilde{\\ell}_j})$ are predetermined. Each user randomly draws a ranking of items from the PL model, and provides the partial ranking of the form $(\\{i_6\\} \\prec \\{i_5, i_4, i_3\\} \\prec \\{i_2, i_1\\})$ in the example in Figure 1. For a choice of $M$, the order-$M$ rank-breaking graph is extracted from this data. The following theorem provides an upper bound on the achieved error, and a proof is provided in the supplementary material.\n\nTheorem 4.1. Suppose there are $n$ users, $d$ items parametrized by $\\theta^* \\in \\Omega_b$, and each user $j \\in [n]$ is presented with a set of offerings $S_j \\subseteq [d]$ and provides a partial ordering under the PL model. For a choice of $M \\in \\mathbb{Z}^+$, if $\\gamma_3 > 0$ and the effective sample size $\\sum_{j=1}^{n} p_j$ is large enough such that $$\\sum_{j=1}^{n} p_j \\ge \\frac{2^{14} e^{20b} \\nu^2}{(\\alpha \\gamma_1 \\gamma_2 \\gamma_3)^2 \\beta} \\frac{p_{\\max}}{\\kappa_{\\min}} d \\log d, \\qquad (9)$$ where $b \\equiv \\max_i |\\theta^*_i|$ is the dynamic range, $p_{\\max} = \\max_{j \\in [n]} p_j$, $\\kappa_{\\min} = \\min_{j \\in [n]} \\kappa_j$, $\\alpha$ is the (rescaled) spectral gap, $\\beta$ is the (rescaled) spectral radius in (6), and $\\gamma_1$, $\\gamma_2$, $\\gamma_3$, and $\\nu$ are defined in (7) and (8), then the generalized rank-breaking estimator in (3) achieves $$\\frac{1}{\\sqrt{d}} \\|\\hat{\\theta} - \\theta^*\\| \\le \\frac{40 e^{7b}}{\\alpha \\gamma_1 \\gamma_2^{3/2} \\gamma_3} \\sqrt{\\frac{d \\log d}{\\sum_{j=1}^{n} \\sum_{a=1}^{\\ell_j} m_{j,a}}}, \\qquad (10)$$ with probability at least $1 - 3 e^3 d^{-3}$. Moreover, for $M \\le 3$ the above bound holds with $\\gamma_3$ replaced by one, giving a tighter result.\n\nNote that the dependence on the choice of $M$ is not explicit in the bound, but rather is implicit in the construction of the comparison graph and the number of effective samples $N = \\sum_j \\sum_{a \\in [\\ell_j]} m_{j,a}$. In an ideal case, $b = O(1)$ and $m_{j,a} = O(r_{j,a}^{1/2})$ for all $(j, a)$ such that $\\gamma_1$, $\\gamma_2$ are finite. Further, if the spectral gap is large such that $\\alpha > 0$ and $\\beta > 0$, then Equation (10) implies that we need the effective sample size to scale as $O(d \\log d)$, which is only a logarithmic factor larger than the number of parameters. In this ideal case, there exist universal constants $C_1$, $C_2$ such that if $m_{j,a} < C_1 \\sqrt{r_{j,a}}$ and $r_{j,a} > C_2 \\kappa_j$ for all $\\{j, a\\}$, then the condition $\\gamma_3 > 0$ is met. Further, when $r_{j,a} = O(\\kappa_{j,a})$, $\\max \\kappa_{j,a} / \\kappa_{j',a'} = O(1)$, and $\\max p_{j,a} / p_{j',a'} = O(1)$, then the condition on the effective sample size is met with $\\sum_j p_j = O(d \\log d)$. We believe that the dependence on $\\gamma_3$ is a weakness of our analysis and there is no dependence as long as $m_{j,a} < r_{j,a}$.\n\n4.2 Lower bound on computationally unbounded estimators\n\nRecall that $\\tilde{\\ell}_j \\equiv |E_j|$, $\\tilde{m}_{j,a} = |T(e_a)|$ and $\\tilde{r}_{j,a} = |T(e_a) \\cup B(e_a)|$ when $M = |S_j|$. We prove a fundamental lower bound on the achievable error rate that holds for any unbiased estimator even with no restrictions on the computational complexity. For each $(j, a)$, define $\\eta_{j,a}$ as $$\\eta_{j,a} = \\sum_{u=0}^{\\tilde{m}_{j,a} - 1} \\Big( \\frac{1}{\\tilde{r}_{j,a} - u} + \\frac{u (\\tilde{m}_{j,a} - u)}{\\tilde{m}_{j,a} (\\tilde{r}_{j,a} - u)^2} \\Big) + \\sum_{u < u' \\in [\\tilde{m}_{j,a} - 1]} \\frac{2u}{\\tilde{m}_{j,a} (\\tilde{r}_{j,a} - u)} \\frac{\\tilde{m}_{j,a} - u'}{\\tilde{r}_{j,a} - u'} \\qquad (11)$$ $$= \\tilde{m}_{j,a}^2 / (3 \\tilde{r}_{j,a}) + O(\\tilde{m}_{j,a}^3 / \\tilde{r}_{j,a}^2). \\qquad (12)$$\n\nTheorem 4.2. Let $U$ denote the set of all unbiased estimators of $\\theta^*$ that are centered such that $\\hat{\\theta}^\\top \\mathbf{1} = 0$, and let $\\mu = \\max_{j \\in [n], a \\in [\\tilde{\\ell}_j]} \\{\\tilde{m}_{j,a} - \\eta_{j,a}\\}$. For all $b > 0$, $$\\inf_{\\hat{\\theta} \\in U} \\sup_{\\theta^* \\in \\Omega_b} E[\\|\\hat{\\theta} - \\theta^*\\|^2] \\ge \\max \\Big\\{ \\frac{(d - 1)^2}{\\sum_{j=1}^{n} \\sum_{a=1}^{\\tilde{\\ell}_j} (\\tilde{m}_{j,a} - \\eta_{j,a})}, \\; \\frac{1}{\\mu} \\sum_{i=2}^{d} \\frac{1}{\\lambda_i(L)} \\Big\\}. \\qquad (13)$$\n\nThe proof relies on the Cramer-Rao bound and is provided in the supplementary material. Since the $\\eta_{j,a}$\u2019s are non-negative, the mean squared error is lower bounded by $(d - 1)^2 / N$, where $N = \\sum_j \\sum_{a \\in [\\tilde{\\ell}_j]} \\tilde{m}_{j,a}$ is the effective sample size. Comparing it to the upper bound in (10), this is tight up to a logarithmic factor when (a) the topology of the data is well-behaved such that all respective quantities are finite; and (b) there is no limit on the computational power and $M$ can be made as large as we need. The bound in Eq. (13) further gives a tighter lower bound, capturing the dependency on the $\\eta_{j,a}$\u2019s and $\\lambda_i(L)$\u2019s. Considering the first term, $\\eta_{j,a}$ is larger when $\\tilde{m}_{j,a}$ is close to $\\tilde{r}_{j,a}$, giving a tighter bound. The second term in (13) implies we get a tighter bound when $\\lambda_2(L)$ is smaller.\n\n[Figure 3: Accuracy degrades as $(\\kappa - m)$ gets small and as the dynamic range $b$ gets large. Panels: left and middle, $C\\|\\hat{\\theta} - \\theta^*\\|_2^2$ versus size of top-set $m$ for inconsistent PRB, GRB of order $M = m$, the oracle lower bound, and the CR lower bound; right, error versus set-size $\\kappa$ for $b = 2, 1, 0.5, 0.2$ and the CR lower bound.]\n\nIn Figure 3 left and middle panel, we compare performance of our algorithm with pairwise breaking, the Cramer-Rao lower bound and the oracle MLE lower bound. We fix $d = 512$, $n = 10^5$, $\\theta^*$ chosen i.i.d. uniformly over $[-2, 2]$. Oracle MLE knows the relative ordering of items in all the top-sets $T(e)$ and hence is strictly better than the GRB. We fix $\\tilde{\\ell} = \\ell = 1$, that is $r = \\kappa$, and vary $m$. In the left panel, we fix $\\kappa = 32$ and in the middle panel, we fix $\\kappa = 16$. Perhaps surprisingly, GRB matches the oracle MLE, which means the relative ordering of the top-$m$ items among themselves is statistically insignificant when $m$ is sufficiently small in comparison to $\\kappa$. For $\\kappa = 16$, as $m$ gets large, the error starts to increase as predicted by our analysis. The reason is that the quantities $\\gamma_1$ and $\\gamma_2$ get smaller as $m$ increases, and the upper bound increases consequently. In the right panel, we fix $m = 4$. When $\\kappa$ is small, $\\gamma_2$ is small, and hence the error is large; when $b$ is large, $\\gamma_1$ is exponentially small, and hence the error is significantly large. This is different from learning Mallows models, where peaked distributions
are easier to learn [2], and is related to the fact that we are not only interested in recovering the (ordinal) ranking but also the (cardinal) weight.\n\n4.3 Computational and statistical tradeoff\n\nFor estimators with limited computational power, however, the above lower bound fails to capture the dependency on the allowed computational power. Understanding such fundamental trade-offs is a challenging problem, which has been studied only in a few special cases, e.g. the planted clique problem [10, 18]. This is outside the scope of this paper, and we instead investigate the trade-off achieved by the proposed rank-breaking approach. When we are limited on computational power, Theorem 4.1 implicitly captures this dependence when order-$M$ rank-breaking is used. The dependence is captured indirectly via the resulting rank-breaking $\\{G_{j,a}\\}_{j \\in [n], a \\in [\\ell_j]}$ and the topology of it. We make this trade-off explicit by considering a simple but canonical example. Suppose $\\theta^* \\in \\Omega_b$ with $b = O(1)$. Each user gives an i.i.d. partial ranking, where all items are offered and the partial ranking is based on an ordered partition with $\\tilde{\\ell}_j = \\lfloor \\sqrt{2c}\\, d^{1/4} \\rfloor$ subsets. The top subset has size $\\tilde{m}_{j,1} = 1$, and the $a$-th subset has size $\\tilde{m}_{j,a} = a$, up to $a < \\tilde{\\ell}_j$, in order to ensure that they sum to at most $c \\sqrt{d}$ for a sufficiently small positive constant $c$ and the condition $\\gamma_3 > 0$ is satisfied. The last subset includes all the remaining items in the bottom, ensuring $\\tilde{m}_{j,\\tilde{\\ell}_j} \\ge d/2$ and $\\gamma_1$, $\\gamma_2$ and $\\nu$ are all finite.\n\nComputation. For a choice of $M$ such that $M \\le \\ell_j - 1$, we consider the computational complexity in evaluating the gradient of $L_{RB}$, which scales as $T_M = \\sum_{j \\in [n]} \\sum_{a \\in [M]} (m_{j,a}!)\\, r_{j,a} = O(M! \\times dn)$. Note that we find the MLE by solving a convex optimization problem using first order methods, and detailed analysis of the convergence rate and the complexity of solving general convex optimizations is outside the scope of this paper.\n\nSample. Under the canonical setting, for $M \\le \\ell_j - 1$, every user is offered all $d$ items with $p_j = M(M+1)/2$, so (5) gives $L = \\frac{n M (M+1)}{2 d (d-1)} \\big(d I - \\mathbf{1}\\mathbf{1}^\\top\\big)$. This complete graph has the largest possible spectral gap, and hence $\\alpha > 0$ and $\\beta > 0$. Since the effective sample size is $\\sum_{j,a} \\tilde{m}_{j,a} I\\{\\tilde{m}_{j,a} \\le M\\} = n M (M+1)/2$, it follows from Theorem 4.1 that the (rescaled) root mean squared error is $O(\\sqrt{(d \\log d)/(n M^2)})$. In order to achieve a target error rate of $\\varepsilon$, we need to choose $M = \\Omega((1/\\varepsilon) \\sqrt{(d \\log d)/n})$. The resulting trade-off between run-time and sample size to achieve root mean squared error $\\varepsilon$ is $T(n) \\propto (\\lceil (1/\\varepsilon) \\sqrt{(d \\log d)/n} \\rceil)!\\, dn$. We show a numerical experiment under this canonical setting in Figure 2 (left) with $d = 256$ and $M \\in \\{1, 2, 3, 4, 5\\}$, illustrating the trade-off in practice.\n\n4.4 Real-world data sets\n\nOn sushi preferences [14] and the jester data set [11], we improve over pairwise breaking and achieve the same performance as the oracle MLE. Full rankings over $\\kappa = 10$ types of sushi, randomly chosen from $d = 100$ types of sushi, are provided by $n = 5000$ individuals. As the ground truth $\\theta^*$, we use the ML estimate of PL weights over the entire data. In Figure 4, left panel, for each $m \\in \\{3, 4, 5, 6, 7\\}$, we remove the known ordering among the top-$m$ and bottom-$(10 - m)$ sushi in each set, and run our estimator with one breaking edge between the top-$m$ and bottom-$(10 - m)$ items. We compare our algorithm with inconsistent pairwise breaking (using the optimal choice of parameters from [15]) and the oracle MLE. For $m \\le 6$, the proposed rank-breaking performs as well as an oracle who knows the hidden ranking among the top $m$ items. The jester data set consists of continuous ratings between $-10$ and $+10$ of 100 jokes on sets of size $\\kappa$, $36 \\le \\kappa \\le 100$, by 24,983 users. We convert ratings into full rankings. The ground truth $\\theta^*$ is computed similarly. For $m \\in \\{2, 3, 4, 5\\}$, we convert each full ranking into a poset that has $\\ell = \\lfloor \\kappa / m \\rfloor$ partitions of size $m$, by removing the known relative ordering from each partition. Figure 4 compares the three algorithms using all samples (middle panel), and by varying the sample size (right panel) for fixed $m = 4$. All figures are averaged over 50 instances.
\n\n[Figure 4: Generalized rank-breaking improves over pairwise RB and is close to oracle MLE. Panels: left and middle, $C\\|\\hat{\\theta} - \\theta^*\\|_2^2$ versus size of top-set(s) $m$ for inconsistent PRB, GRB of order $M = m$, and the oracle lower bound; right, error versus sample size $n$ for inconsistent PRB, GRB of order $M = 4$, and the oracle lower bound.]\n\nAcknowledgements\n\nThis work is supported by NSF SaTC award CNS-1527754, and NSF CISE award CCF-1553452.\n\nReferences\n\n[1] A. Agarwal, P. L. Bartlett, and J. C. Duchi. Oracle inequalities for computationally adaptive model selection. arXiv preprint arXiv:1208.0129, 2012.\n[2] A. Ali and M. Meil\u0103. Experiments with Kemeny ranking: What works when? Mathematical Social Sciences, 64(1):28\u201340, 2012.\n[3] H. Azari Soufiani, W. Chen, D. C. Parkes, and L. Xia. Generalized method-of-moments for rank aggregation. In Advances in Neural Information Processing Systems 26, pages 2706\u20132714, 2013.\n[4] H. Azari Soufiani, D. Parkes, and L. Xia. Computing parametric ranking models via rank-breaking. In Proceedings of The 31st International Conference on Machine Learning, pages 360\u2013368, 2014.\n[5] H. Azari Soufiani, D. C. Parkes, and L. Xia. Random utility theory for social choice. In NIPS, pages 126\u2013134, 2012.\n[6] N. Betzler, R. Bredereck, and R. Niedermeier. Theoretical and empirical evaluation of data reduction for exact Kemeny rank aggregation. Autonomous Agents and Multi-Agent Systems, 28(5):721\u2013748, 2014.\n[7] O. Bousquet and L. Bottou. The tradeoffs of large scale learning. In Advances in Neural Information Processing Systems, pages 161\u2013168, 2008.\n[8] V. Chandrasekaran and M. I. Jordan. Computational and statistical tradeoffs via convex relaxation. Proceedings of the National Academy of Sciences, 110(13):E1181\u2013E1190, 2013.\n[9] Y. Chen and C. Suh. Spectral MLE: Top-k rank aggregation from pairwise comparisons. arXiv:1504.07218, 2015.\n[10] Y. Deshpande and A. Montanari. Improved sum-of-squares lower bounds for hidden clique and hidden submatrix problems. arXiv preprint arXiv:1502.06590, 2015.\n[11] K. Goldberg, T. Roeder, D. Gupta, and C. Perkins. Eigentaste: A constant time collaborative filtering algorithm. Information Retrieval, 4(2):133\u2013151, 2001.\n[12] B. Hajek, S. Oh, and J. Xu. Minimax-optimal inference from partial rankings. In Advances in Neural Information Processing Systems 27, pages 1475\u20131483, 2014.\n[13] T. P. Hayes. A large-deviation inequality for vector-valued martingales. Combinatorics, Probability and Computing, 2005.\n[14] T. Kamishima. Nantonac collaborative filtering: recommendation based on order responses. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 583\u2013588. ACM, 2003.\n[15] A. Khetan and S. Oh. Data-driven rank breaking for efficient rank aggregation. In International Conference on Machine Learning, 2016.\n[16] M. Lucic, M. I. Ohannessian, A. Karbasi, and A. Krause. Tradeoffs for space, time, data and risk in unsupervised learning. In AISTATS, 2015.\n[17] L. Maystre and M. Grossglauser. Fast and accurate inference of Plackett-Luce models. In Advances in Neural Information Processing Systems 28 (NIPS 2015), 2015.\n[18] R. Meka, A. Potechin, and A. Wigderson. Sum-of-squares lower bounds for planted clique. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, pages 87\u201396. ACM, 2015.\n[19] S. Negahban, S. Oh, and D. Shah. Rank centrality: Ranking from pair-wise comparisons. preprint arXiv:1209.1688, 2014.\n[20] A. Pr\u00e9kopa. Logarithmic concave measures and related topics. In Stochastic Programming, 1980.\n[21] N. B. Shah, S. Balakrishnan, J. Bradley, A. Parekh, K. Ramchandran, and M. J. Wainwright. Estimation from pairwise comparisons: Sharp minimax bounds with topology dependence. arXiv:1505.01462, 2015.\n[22] N. B. Shah, S. Balakrishnan, A. Guntuboyina, and M. J. Wainwright. Stochastically transitive models for pairwise comparisons: Statistical and computational issues. arXiv preprint arXiv:1510.05610, 2015.\n[23] S. Shalev-Shwartz and N. Srebro. SVM optimization: inverse dependence on training set size. In Proceedings of the 25th International Conference on Machine Learning, pages 928\u2013935. ACM, 2008.", "award": [], "sourceid": 436, "authors": [{"given_name": "Ashish", "family_name": "Khetan", "institution": "University of Illinois Urbana-"}, {"given_name": "Sewoong", "family_name": "Oh", "institution": "UIUC"}]}