{"title": "Efficient Pure Exploration in Adaptive Round model", "book": "Advances in Neural Information Processing Systems", "page_first": 6609, "page_last": 6618, "abstract": "In the adaptive setting, many multi-armed bandit applications allow the learner to adaptively draw samples and adjust sampling strategy in rounds. In many real applications, not only the query complexity but also the round complexity need to be optimized. In this paper, we study both PAC and exact top-$k$ arm identification problems and design efficient algorithms considering both round complexity and query complexity. For PAC problem, we achieve optimal query complexity and use only $O(\\log_{\\frac{k}{\\delta}}^*(n))$ rounds, which matches the lower bound of round complexity, while most of existing works need $\\Theta(\\log \\frac{n}{k})$ rounds. For exact top-$k$ arm identification, we improve the round complexity factor from $\\log n$ to $\\log_{\\frac{1}{\\delta}}^*(n)$, and achieve near optimal query complexity. In experiments, our algorithms conduct far fewer rounds, and outperform state of the art by orders of magnitude with respect to query cost.", "full_text": "Ef\ufb01cient Pure Exploration in Adaptive Round model\n\nTianyuan Jin\u2020, Jieming Shi\u2021, Xiaokui Xiao\u2021, Enhong Chen\u2020\u2217\n\n\u2020School of Computer Science and Technology, University of Science and Technology of China\n\n\u2021School of Computing, National University of Singapore\n\n\u2020 jty123@mail.ustc.edu.cn, cheneh@ustc.edu.cn, \u2021{shijm, xkxiao}@nus.edu.sg\n\nAbstract\n\nIn the adaptive setting, many multi-armed bandit applications allow the learner\nto adaptively draw samples and adjust sampling strategy in rounds.\nIn many\nreal applications, not only the query complexity but also the round complexity\nneed to be optimized. 
In this paper, we study both PAC and exact top-k arm\nidenti\ufb01cation problems and design ef\ufb01cient algorithms considering both round\ncomplexity and query complexity. For the PAC problem, we achieve optimal query\ncomplexity and use only $O(\\log_{\\frac{k}{\\delta}}^*(n))$ rounds, which matches the lower bound\nof round complexity, while most existing works need $\\Theta(\\log \\frac{n}{k})$ rounds. For\nexact top-k arm identi\ufb01cation, we improve the round complexity factor from $\\log n$\nto $\\log_{\\frac{1}{\\delta}}^*(n)$, and achieve near optimal query complexity. In experiments, our\nalgorithms conduct far fewer rounds, and outperform state of the art by orders of\nmagnitude with respect to query cost.\n\n1 Introduction\n\nMulti-armed bandit (MAB) problems are classic decision problems with numerous applications such\nas medical trials [1], online advertisement [2], and crowdsourcing [3]. These problems typically\nconsider a bandit with a set of arms, each of which has an unknown reward distribution with an\nunknown mean, and the objective is either to (i) identify the top-k arms with the maximum reward\nmeans or (ii) maximize the expected total reward under some constraints on the costs of arm pulling.\n\nThis paper studies the problem of top-k arm identi\ufb01cation in the adaptive setting, which allows the\nlearner to draw samples from the arms adaptively in rounds to estimate their means, and to adjust the\nsampling strategy for the i-th round based on the observations from the \ufb01rst i \u2212 1 rounds. Following\nprevious work [4], we assume that in each round, the learner is allowed to query an arbitrary number\nof arms an arbitrary number of times, but the query results are only revealed at the end of\nthe round. We aim to minimize the number of rounds performed while achieving the best possible\nquery complexity. In addition, our proposed algorithms exhibit superior practical performance due to\ntheir small constant factors. 
Existing top-k algorithms mainly focus on query complexity and most of\nthem are not ef\ufb01cient due to their large constants [5, 6, 3, 7, 8] or inferior query complexities [9, 10].\nThe adaptive round setting of MAB has many real applications, as described below.\n\nMedical trials. In medical trials [11], to identify the best drug for a disease, one can conduct tests in\nrounds, such that each round involves testing multiple candidate drugs on multiple clinical subjects\n(e.g., mice) simultaneously. However, after each round of testing, there is typically a waiting time\n(e.g., days) before the effects of drugs become observable to guide the design of the next round of\ntesting. It is important to minimize not only the total number of tests on clinical subjects (i.e., query\ncomplexity) but also the number of rounds, to identify the best drug within the shortest time frame.\n\n\u2217Corresponding author\n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.\n\nOnline advertisement. In online advertisement [12], an advertiser may push ads to the users of\ncandidate websites, so as to identify the top-k websites that have the highest click-through rates and\nmatch some clients\u2019 advertising requirements. The pushing of ads could be conducted in rounds, and\neach round may involve multiple ads and multiple users. However, in each round, it takes time to\nobserve users\u2019 responses to the ads, and to decide which websites are unpromising and should be\npruned in the next round. In this application, there is usually a tight time frame to offer a solution to\nthe clients, so as to ensure the timeliness of the ads.\n\nCrowdsourcing. Workers on crowdsourcing platforms often vary signi\ufb01cantly in terms of the answer\nquality. 
As an effective strategy to identify the most reliable workers for a speci\ufb01c task, one may test\neach worker with a sequence of questions with ground truths, and then select workers based on the\naccuracy of their answers. Note that for such tests, workers need some time to answer the questions,\nand need to be rewarded upon the completion of the questions. To minimize the time and monetary\ncosts, it is crucial to have an algorithm that identi\ufb01es the most reliable workers while minimizing the number\nof tests (i.e., query complexity) within a limited number of rounds, which is where our proposals \ufb01t.\n\n1.1 Problem Formulation\n\nUnder the standard setting of stochastic multi-armed bandit selection, there is a set S of n arms,\nsuch that each arm i is associated with an unknown reward distribution $D_i$ supported on [0, 1] with\nunknown mean $\\theta_i$. Let $i^*$ be the arm with the ith largest mean. We aim to identify the k arms with the\nlargest means by pulling (i.e., sampling from) the arms in rounds. In each round, we can pull any\nnumber of arms any number of times, such that (i) each pull of an arm i returns a reward that is an\ni.i.d. sample from $D_i$, and (ii) the reward is only revealed at the end of the round.\n\nFor PAC subset selection, we study two problems: (i) Problem 1 (PAC-top-k): PAC Top-k Arm\nSelection with Adaptive Rounds, and (ii) Problem 2 (RL-top-k): Top-k Arm with a Round Limit R.\nIn both problems, the goal is to identify a set V \u2286 S of k arms, such that for all i \u2208 [1, k], the ith\nlargest arm in V has mean larger than $\\theta_{i^*} - \\epsilon$ with probability at least 1 \u2212 \u03b4, where \u03b5 and \u03b4 are given\nconstants. 
Speci\ufb01cally, for PAC-top-k, we aim to minimize the number of rounds performed, while\nachieving the best possible query complexity; for RL-top-k, we impose an upper limit R on the number\nof rounds that can be performed, and aim to minimize the query complexity within R rounds.\n\nFor exact top-k arm identi\ufb01cation, denoted as Problem 3 (exact-top-k), we aim to minimize the\nnumber of rounds required as well as the query cost, for identifying the top-k arms with the largest\nmeans. We assume $\\theta_{k^*} > \\theta_{(k+1)^*}$, in order to ensure the uniqueness of the solution.\n\n1.2 State of the Art\n\nTo the best of our knowledge, Agarwal et al.\u2019s work [4] is the only one that studies the top-k arms\nproblem while taking into account the round complexity. In particular, [4] studies the identi\ufb01cation\nof exact top-k arms with adaptive rounds, and presents a method that takes $\\Delta_k$ as input and returns\nthe exact top-k arms with at least 1 \u2212 \u03b4 probability, with query complexity $O(\\frac{n}{\\Delta_k^2} \\cdot \\log \\frac{k}{\\delta})$ and round\ncomplexity $\\log^*(n)$ (see footnote 2), where $\\Delta_k$ denotes the difference between the means of the kth and (k + 1)th\nlargest arms, and $\\log^*(n)$ denotes the iterated logarithm of n, i.e.,\n\n$$\\log^*(n) = \\begin{cases} 1 + \\log^*(\\log n), & \\text{if } n > 1 \\\\ 0, & \\text{otherwise} \\end{cases} \\qquad (1)$$\n\nIn other words, $\\log^*(n)$ equals the number of times that we need to apply the logarithm function on\nn before the result is no more than 1. Furthermore, [4] also studies the problem where the round limit\nR is given. Their algorithm identi\ufb01es the exact top-k arms with at least 1 \u2212 \u03b4 probability, with a query\ncomplexity of $O(\\frac{n}{\\Delta_k^2} \\cdot (\\log \\frac{k}{\\delta} + \\mathrm{ilog}^{(R)}(n)))$, where $\\mathrm{ilog}^{(R)}(n)$ is the result of iteratively applying\nthe logarithm function on n for R times, i.e.,\n\n$$\\mathrm{ilog}^{(r)}(x) = \\begin{cases} \\mathrm{ilog}^{(r-1)}(\\log(x)), & \\text{if } x > 1 \\\\ 1, & \\text{if } x \\le 1. \\end{cases} \\qquad (2)$$\n\nWith respect to the lower bound, Agarwal et al. show that a round complexity of $\\log^*(n)$ is near\noptimal, since for constants k and \u03b4, any algorithm with $O(\\frac{n}{\\Delta_k^2})$ query complexity requires at least\n$\\log^*(n) - \\log^*(\\Theta(\\log^*(n)))$ rounds. Besides, Agarwal et al. prove that identifying the exact top-k\narms with at least 3/4 probability using R rounds must use $\\Omega(\\frac{n}{\\Delta_k^2 R^4} \\cdot \\mathrm{ilog}^{(r)}(\\frac{n}{k}))$ samples.\n\nFootnote 2: All logarithms (e.g., $\\log_b^*(n)$) in this paper are to base b.\n\nSetting | Algorithm | Number of Rounds | Query Complexity\nk = 1 | [5] | $\\Theta(\\log n)$ | $O(\\frac{n}{\\epsilon^2} \\cdot \\log \\frac{1}{\\delta})$\nAll k \u2208 [n] | [6, 16, 14] | $\\Theta(\\log \\frac{n}{k})$ | $O(\\frac{n}{\\epsilon^2} \\cdot \\log \\frac{k}{\\delta})$\nAll k \u2208 [n] | This paper (Algorithm 1) | $2 \\log_{\\frac{k}{\\delta}}^*(n)$ | $O(\\frac{n}{\\epsilon^2} \\cdot \\log \\frac{k}{\\delta})$\n\nTable 1: Summary of algorithms for Problem 1: Top-k arms with adaptive rounds.\n\nSetting | Algorithm | Bound | Query Complexity\nAll k \u2208 [n] | [4], assuming $\\Delta_k$ is known | exact top-k | $O(\\frac{n}{\\Delta_k^2} \\cdot (\\log \\frac{k}{\\delta} + \\mathrm{ilog}^{(R)}(n)))$\nAll k \u2208 [n] | This paper (Algorithm 2) | (\u03b5, \u03b4) | $O(\\frac{n}{\\epsilon^2} \\cdot (\\log \\frac{k}{\\delta} + \\mathrm{ilog}_{\\frac{k}{\\delta}}^{(R)}(n)))$\n\nTable 2: Summary of algorithms for Problem 2: Top-k arms with a round limit R.\n\nAgarwal et al.\u2019s algorithm suffers from a major de\ufb01ciency: it requires $\\Delta_k$ to be known in advance,\nwhich is unrealistic in most practical applications as the mean of each arm is unknown. In addition,\nthe algorithm cannot be extended to address PAC-top-k and RL-top-k by replacing $\\Delta_k$ with \u03b5, since\nthe algorithm strongly relies on the assumption that there are exactly k arms whose means are larger\nthan $\\theta_{k^*} - \\Delta_k$, where $k^*$ is the arm with the kth largest mean. (This assumption does not hold in\ngeneral if we replace $\\Delta_k$ with any \u03b5 > $\\Delta_k$.) 
Further, the algorithm cannot be used to get instance-dependent\nquery complexity (where the query complexity depends not only on $\\Delta_k$ but also\non $\\{\\theta_i\\}_{i=1}^n$), since all Exponential-Gap-Elimination algorithms [8, 13, 14, 15] need a PAC algorithm\nas a subroutine.\n\nThere also exist a number of techniques [16, 6, 13, 10, 14, 8] for both PAC and exact top-k arm\nidenti\ufb01cation problems that optimize the query complexity, without considering the round complexity.\nThe query complexity achieved by these techniques is near optimal. However, all of them incur a log n\nfactor on round complexity, signi\ufb01cantly worse than the round complexity of [4].\n\n1.3 Our Results\n\nIn this paper, we present three algorithms for the top-k arm selection problems in the adaptive round\nmodel. Our results are summarized below.\n\nTheorem 1. There is an algorithm that computes \u03b5-top-k arms with probability at least 1 \u2212 \u03b4, pulls\nthe arms at most $O(\\frac{n}{\\epsilon^2} \\cdot \\log \\frac{k}{\\delta})$ times and runs in at most $2 \\log_{\\frac{k}{\\delta}}^*(n)$ expected rounds.\n\nTheorem 2. There is an algorithm that computes \u03b5-top-k arms with probability at least 1 \u2212 \u03b4, pulls\nthe arms at most $O(\\frac{n}{\\epsilon^2} \\cdot (\\mathrm{ilog}_{\\frac{k}{\\delta}}^{(R)}(n) + \\log \\frac{k}{\\delta}))$ times and runs within R rounds.\n\nSince (i) the solution in [4] is proved to be near-optimal, and (ii) the problems studied in [4] are\nspecial cases of PAC-top-k and RL-top-k with \u03b5 \u2190 $\\Delta_k$, the round complexity of our algorithm for\nPAC-top-k and the query complexity of our algorithm for RL-top-k are near-optimal.\n\nCompared with the solution in [4], our algorithms do not require any prior knowledge of $\\Delta_k$, and\nallow us to choose an error parameter \u03b5 \u2208 (0, 1) to strike a trade-off between the accuracy and\nef\ufb01ciency of the algorithm, which is much more practical. 
Further, our PAC version can be used to\nget instance-dependent query complexity while [4] cannot.\n\nTheorem 3. There is an algorithm that computes exact top-k arms with probability at least 1 \u2212 \u03b4,\npulls the arms at most $O(\\sum_{i=1}^n \\Delta_i^{-2} \\cdot \\log \\frac{k \\cdot \\log \\Delta_i^{-1}}{\\delta})$ times and runs in $O(\\log_{\\frac{1}{\\delta}}^* n \\cdot \\log \\Delta_k^{-1})$ rounds.\n\nCompared with the previous exact top-k arm algorithms [14, 8, 13], we improve the factor on round\ncomplexity from log n to $\\log_{\\frac{1}{\\delta}}^*(n)$, while achieving the same query complexity. Tables 1, 2 and 3\nsummarize our results and those of the state-of-the-art methods.\n\n2 PAC Subset Selection\n\nWe present our algorithms for the PAC top-k arms selection problems, i.e., PAC-top-k and RL-top-k.\n\nk = 1\n\nAll k \u2208 [n]\n\nAlgorithm\n\n[8]\n\n[17]\n\n[10]\n\n[13]\n\nThis paper\n\nO\u0003\u2211n\nO\u0003\u2211n\n\nRound Complexity\n\ni\n\ni\n\n\u03b4\n\nlog \u2206\u22121\n\nk )\nO(log n \u22c5 log \u2206\u22121\ni=1 \u2206\u22122\n\n\u22c5 log\ni=1 \u2206\u22121\n\u22c5 log \u2211n\ni=1 \u2206\u22122\nk )\nO(log n \u22c5 log \u2206\u22121\nk )\nO(log\u2217\nn \u22c5 log \u2206\u22121\n\n1\n\u03b4\n\n\u03b4\n\ni\n\ni\n\nO\u0003\u2211n\nO\u0003\u2211n\n\u0003\n\u0003 O\u0003\u2211n\nO\u0003\u2211n\nO\u0003\u2211n\n\n\u0003\n\u0003\n\u0003\n\u0003\n\u0003\n\nQuery Complexity\ni=1 \u2206\u22122\n\n\u22c5 log\n\ni\n\n\u03b4\n\nlog \u2206\u22121\n\ni\n\ni=1 \u2206\u22122\n\ni\n\ni=1 \u2206\u22122\n\ni\n\nlog \u2206\u22121\n\ni\n\n\u03b4\n\ni=1 \u2206\u22121\n\ni\n\n\u22c5 log\n\u22c5 log \u2211n\n\ni=1 \u2206\u22122\n\ni\n\ni=1 \u2206\u22122\n\ni\n\n\u22c5 log\n\n\u22c5 log\n\nk\u22c5log \u2206\u22121\n\nk\u22c5log \u2206\u22121\n\n\u03b4\n\n\u03b4\n\n\u03b4\n\ni\n\ni\n\nTable 3: Summary of algorithms for Problem 3: Exact top-k arm identi\ufb01cation. 
(For i \u2264 k, $\\Delta_i$\ndenotes the difference between the means of the ith and (k + 1)th arms. For i > k, $\\Delta_i$ denotes the\ndifference between the means of the kth and ith arms.)\n\n2.1 Top-k \u03b4-Elimination\n\nk-\u03b4E (Algorithm 1) can identify the top-k arms for PAC-top-k, with query complexity $O(\\frac{n}{\\epsilon^2} \\log \\frac{k}{\\delta})$\nand at most $2 \\log_{\\frac{k}{\\delta}}^*(n)$ expected rounds. Compared with Median Elimination based top-k algorithms,\ne.g., [16, 6], which only eliminate half of the candidates in each round, k-\u03b4E can eliminate at least\n$100(1 - \\frac{\\delta}{k})$ percent of candidate arms every other round, which is far better. We go through the\nalgorithm \ufb01rst and then explain why. Note that in each while iteration (Line 5-17), k-\u03b4E performs\ntwo separate rounds of pulling (Line 5 and Line 8), since the pulls at Line 8 are dependent on\nthe empirical results obtained at Line 5. This corresponds to the factor of 2 in our round complexity\n$2 \\log_{\\frac{k}{\\delta}}^*(n)$. Without ambiguity, r denotes iterations in Algorithm 1, but rounds in Algorithm 2.\n\nAlgorithm 1 takes as input S, Q, k, \u03b5, \u03b4, where S is the set of all the arms and $Q = \\frac{c}{\\epsilon^2}$ (c is a constant\nfactor determined in Lemma 1). An empty set S\u2032 (Line 3) is initialized for the storage of the arms and\ntheir empirical means obtained later in the algorithm. In each iteration, we pull every arm in $S_r$\n$Q_r$ times, and sort the arms by their empirical means (Line 5). At Line 7-8, we double test the empirical\nmean of each arm in $S^r_{k'}$ (in order to keep the estimation unbiased) and keep it in S\u2032. Then we update\n$S_r$ to $S_{r+1}$ by only keeping the arms with empirical means $\\frac{3}{4}\\epsilon$ greater than the kth largest mean in\nS\u2032, and also excluding the arms in $S^r_{k'}$ (Line 10). From Line 11 to 15, we update $\\beta_r$ and $\\delta_r$, which\ncan make $Q_r$ exponentially decrease in the next iteration. 
This is critical to keep the total number of pulls\nlinear in n. The whole process continues until $S_r$ is empty, and then the top-k arms in S\u2032 are returned.\n\nMedian Elimination (ME) methods can only allow $\\epsilon_r$ regret in each iteration ($\\sum_r \\epsilon_r \\le \\epsilon$), in order to\nguarantee the \u03b5 error bound even when the best arm is mistakenly eliminated. On the other hand, k-\u03b4E\nallows \u03b5 loss in each iteration with the help of S\u2032 and double test, which allows us to perform fewer\npulls and eliminate more than half of the arms per iteration. During an iteration r, ME methods need\nto sample $O(\\frac{1}{\\epsilon_r^2} \\log \\frac{k}{\\delta_r})$ times per arm, much more than $O(\\frac{1}{\\epsilon^2} \\log \\frac{k}{\\delta_r})$. It is\neven worse when r increases (i.e., $\\epsilon_r$ decreases), leading to the large constant factors in ME methods.\nSpeci\ufb01cally, in Algorithm 1, S\u2032 stores randomly chosen arms that are eliminated. It holds that the\nith largest (i \u2264 k) arm stored in S\u2032 is at most \u03b5 smaller than the ith eliminated arm (see the proof details of\nLemma 1). If k-\u03b4E has eliminated the ith largest arm, then with high probability the ith largest arm\nstored in S\u2032 must be an \u03b5-approximation of the ith largest arm. Hence, k-\u03b4E allows \u03b5 loss per iteration.\n\nMoreover, k-\u03b4E uses a more aggressive indicator to eliminate arms, compared to the median indicator\nused in Median Elimination based algorithms. We use as our indicator the kth largest empirical mean\nof the randomly chosen top arms stored in S\u2032, plus $\\frac{3}{4}\\epsilon$ (Line 10 of Algorithm 1). However, if such\nan indicator is used directly without double test, it may be positively biased. Then all the\n\u03b5-top-k arms might be eliminated by the indicator, which leads to wrong results. To deal with this,\nwe use the double-test strategy to re-sample another $Q_r$ times at Line 8 before using the indicator at Line\n10 in Algorithm 1, to keep the indicator unbiased. 
Further, the $\\frac{3}{4}\\epsilon$ increment is added to the indicator to\neliminate more arms safely, as proved in Lemma 1. k-\u03b4E runs in $2 \\log_{\\frac{k}{\\delta}}^*(n)$ expected rounds.\n\nCompared to [4], our Algorithm 1 and Algorithm 2 are fundamentally different. We assume no prior\nknowledge of the arms, e.g., $\\Delta_k$. Given $\\Delta_k$, Agarwal et al.\u2019s algorithm can compute an optimal\nindicator to eliminate the arms de\ufb01nitely not in top-k. Our indicator (Line 10) is set with the help of\nS\u2032 and double test, which gives our algorithm near-optimal round complexity $2 \\log_{\\frac{k}{\\delta}}^*(n)$.\n\nAlgorithm 1 Top-k \u03b4-Elimination (k-\u03b4E)\n1: Input: S, Q, k, \u03b5 and \u03b4.\n2: Initialize $r \\leftarrow 1$, $\\beta_1 \\leftarrow 1$, $\\delta_1 \\leftarrow \\delta/4$, $S_1 \\leftarrow S$.\n3: Initialize $S' \\leftarrow \\emptyset$.\n4: while $S_r \\ne \\emptyset$ do\n5:   Sample each arm $i \\in S_r$ for $Q_r \\leftarrow \\beta_r \\cdot Q \\cdot \\log(k/\\delta_r)$ times; sort them decreasingly by empirical means $\\hat{\\theta}_i$;\n6:   $k' \\leftarrow \\min\\{k, |S_r|\\}$;\n7:   Uniformly sample $k'$ arms from the top-$[\\lceil (\\delta_r/k)^{\\beta_r} \\cdot |S_r|/2 \\rceil + k' - 1]$ sorted arms as set $S^r_{k'}$;\n8:   For each arm $i \\in S^r_{k'}$, double test by re-sampling it $Q_r$ times and insert its new empirical mean into $S'$;\n9:   Get the k-th largest mean in $S'$ as $S'(k)$;\n10:  Set $S_{r+1} \\leftarrow \\{i \\in S_r : \\hat{\\theta}_i \\ge S'(k) + 3\\epsilon/4\\}$ and $S_{r+1} \\leftarrow S_{r+1} \\setminus S^r_{k'}$;\n11:  if $|S_{r+1}| \\le \\frac{2\\delta}{k} |S_r|$ then\n12:    $\\beta_{r+1} \\leftarrow \\beta_r \\frac{|S_r|}{2|S_{r+1}|}$;\n13:  else\n14:    $\\beta_{r+1} \\leftarrow \\beta_r \\frac{|S_r|}{|S_{r+1}|}$;\n15:  end if\n16:  $\\delta_{r+1} \\leftarrow \\delta/(2 \\cdot 2^r)$;\n17:  $r \\leftarrow r + 1$;\n18: end while\n19: Return: Top-k arms in $S'$.\n\n2.2 Bounding the Regret, Query Complexity, and Round Number of k-\u03b4E\n\nWe bound the regret in k-\u03b4E and give its query and round complexity. The proofs are in Appendix B.\n\nLemma 1. Given an n-arm set S, parameter \u03b5 \u2208 (0, 1), and \u03b4 \u2208 (0, 1/4), it suf\ufb01ces to run Algorithm 1\nwith $Q \\ge \\frac{32}{\\epsilon^2}$ in order to obtain a k-sized subset V \u2286 S, such that with probability at least 1 \u2212 \u03b4,\nthe ith largest arm in V has mean larger than $\\theta_{i^*} - \\epsilon$, for all i \u2208 [1, k]. Additionally, if we change\nthe parameter $\\frac{3}{4}\\epsilon$ (Line 10 in Algorithm 1) to $\\epsilon_1$, where $\\epsilon_1 \\in (0, \\epsilon)$, then by setting $Q \\ge \\frac{2}{(\\epsilon - \\epsilon_1)^2}$,\nAlgorithm 1 still works with the (\u03b5, \u03b4) guarantee.\n\nLemma 1 provides the (\u03b5, \u03b4) guarantee of algorithm k-\u03b4E. Lemma 2 shows that, w.h.p., $S_{r+1}$ is\n$(\\delta/k)^{-\\beta_r}$ times smaller than $S_r$, which is used in Lemma 3 to bound the round complexity.\n\nLemma 2. If $Q \\ge \\frac{57}{\\epsilon^2}$ and \u03b4 \u2208 (0, 1/4), then at iteration r, with probability at least $1 - 2\\delta_r$, $|S_{r+1}| \\le \\lceil 2 \\cdot (\\delta_r/k)^{\\beta_r} |S_r| \\rceil - 1$.\n\nLemma 3. For $Q \\ge 57/\\epsilon^2$ and \u03b4 \u2208 (0, 1/4), with probability at least 1 \u2212 \u03b4, the number of rounds R\u2032\nused in k-\u03b4E satis\ufb01es: $R' \\le 2 \\log_{\\frac{k}{\\delta}}^*(n)$, and $E[R'] \\le 2(1 + 2\\delta) \\log_{\\frac{k}{\\delta}}^*(n)$.\n\nNext, we provide the query complexity of k-\u03b4E in Lemma 4. Kalyanakrishnan et al. [10, Theorem\n8] present a lower bound of $\\Omega(\\frac{n}{\\epsilon^2} \\log \\frac{k}{\\delta})$ for the PAC version (the Explore-k metric, see Section 5).\nHence, up to a small constant factor, our query complexity is optimal.\n\nLemma 4. Let N be the number of arms pulled by Algorithm 1. For $Q \\ge \\frac{57}{\\epsilon^2}$, with probability at\nleast 1 \u2212 \u03b4, $N \\le 7n \\cdot Q \\cdot \\log(4k/\\delta)$; and $E[N] \\le 7(n + 1) \\cdot Q \\cdot \\log(4k/\\delta)$.\n\nCombining Lemma 1, 3, and 4, Theorem 1 follows.\n\nRemark 1. In previous work [16], as the theoretical analysis is rather pessimistic due to the extensive\nusage of the union bound, the constants needed to achieve the regret bound are far from tight. The constant can\nbe even up to $10^5$ (i.e., the bound is $N \\ge \\frac{10^5}{\\epsilon^2} \\log \\frac{1}{\\delta}$ in previous works).\nIn k-\u03b4E, (i) our constants (Lemma 4) are much smaller; (ii) our constant factor is adjustable according\nto Lemma 1 with the \u03b5 regret bound still guaranteed (for instance, the $\\frac{3}{4}\\epsilon$ factor in Line 10 of\nAlgorithm 1 can be changed to $\\frac{1}{2}\\epsilon$, and then setting $Q \\ge \\frac{8}{\\epsilon^2}$ still guarantees the PAC bound); (iii) our\nalgorithm can stop as soon as it is con\ufb01dent that it has found the correct arms, reducing the practical query cost.\n\n2.3 Top-k Arm Selection with a Round Limit\n\nIn this section, we propose k-\u03b4ER (Algorithm 2) to solve Problem 2, top-k arm selection with a\nround limit R. Our proposal can report the correct result within R rounds, with almost optimal query\ncomplexity. 
\n\nAlgorithm 2 Top-k \u03b4-Elimination with Limited Rounds (k-\u03b4ER)\n1: Input: S, R, k, Q, \u03b5 and \u03b4.\n2: Initialize $r \\leftarrow 1$, $\\delta_1 \\leftarrow \\delta/4$, $\\beta_1 \\leftarrow 1 + \\mathrm{ilog}_{\\frac{k}{\\delta}}^{(R)}(n)$, $S_1 \\leftarrow S$, $S' \\leftarrow \\emptyset$.\n3: for $r \\le R - 1$ do\n4:   Sample each arm in $S_r$ by $Q_r \\leftarrow \\beta_r \\cdot Q \\cdot \\log(k/\\delta_r)$ times, and sort decreasingly by their empirical $\\hat{\\theta}_i$;\n5:   $k' \\leftarrow \\min\\{k, |S_r|\\}$.\n6:   Uniformly sample $k'$ arms from the top-$[\\lceil (\\delta_r/k)^{\\beta_r} \\cdot |S_r|/2 \\rceil + k' - 1]$ sorted arms as set $S^r_{k'}$;\n7:   Add $S^r_{k'}$ into $S'$;\n8:   Let $S_{r+1}$ be the set containing all the top-$[\\lceil 2 \\cdot (\\delta_r/k)^{\\beta_r} |S_r| \\rceil + k' - 1]$ sorted arms in $S_r$;\n9:   $S_{r+1} \\leftarrow S_{r+1} \\setminus S^r_{k'}$;\n10:  $\\beta_{r+1} \\leftarrow \\beta_r \\frac{|S_r|}{2|S_{r+1}|}$;\n11:  $\\delta_{r+1} \\leftarrow \\delta/(2 \\cdot 2^r)$;\n12:  $r \\leftarrow r + 1$;\n13: end for\n14: Return: US($S'$, $S_R$, Q, $\\beta_R$, \u03b4, k).\n\nAlgorithm 3 Uniformly Sampling (US)\n1: Input: $S'$, $S_R$, Q, $\\beta_R$, \u03b4, k.\n2: Sample each arm $i \\in S_R$ by $Q \\cdot \\beta_R \\cdot \\log \\frac{2k \\cdot 2^R}{\\delta}$ times and sort decreasingly by their empirical $\\hat{\\theta}_i$;\n3: Let $S^R_k$ be the set of all the top-$\\min\\{k, |S_R|\\}$ arms;\n4: Sample each arm $i \\in S'$ by $Q \\log \\frac{4|S'|}{\\delta}$ times, and let $\\hat{\\theta}_i$ be its empirical mean;\n5: Return: Top-k arms in $S^R_k \\cup S'$.\n\nCompared to k-\u03b4E, k-\u03b4ER only requires one round (Line 4) per iteration, rather than\ntwo rounds per iteration. According to Lemma 2, in each iteration, we can bound the total number of\narms in $S_{r+1}$ without double test. 
Thus, rather than performing the double test immediately (Line 8,\nAlgorithm 1), k-\u03b4ER delays the double tests of all the iterations until the \ufb01nal round (Line 14,\nAlgorithm 2), and conducts them all in this round, using uniform sampling (Algorithm 3).\n\nAlgorithm 2 shows the pseudo-code of k-\u03b4ER. It takes as input one more parameter, R, the round\nlimit. S\u2032 stores all the arms delayed for double test in the \ufb01rst R \u2212 1 rounds (Line 7). At Line\n14, Algorithm 3 is called to sample all the arms in both S\u2032 and $S_R$ in one round, and then the\ntop-k arms are reported. Note that in Algorithm 3, the samples in Line 2 and 4 can be submitted\nsimultaneously, so this only costs one round. Compared to Median Elimination algorithms, k-\u03b4ER has\nsimilar advantages as k-\u03b4E analyzed in Section 2.1. With the help of S\u2032 and double test, k-\u03b4ER can\neliminate more arms in each round, while still providing the (\u03b5, \u03b4) guarantee, as follows.\n\nLemma 5. Given an n-arm set S, parameters k, \u03b5 \u2208 (0, 1), \u03b4 \u2208 (0, 1/4), and $1 \\le R \\le \\log_{\\frac{k}{\\delta}}^*(n)$, it\nsuf\ufb01ces to run Algorithm 2 with $Q \\ge \\frac{57}{\\epsilon^2}$ in order to obtain a k-sized subset V \u2286 S, such that with\nprobability at least 1 \u2212 \u03b4, the ith largest arm in V has mean larger than $\\theta_{i^*} - \\epsilon$, for all i \u2208 [1, k].\n\nDetails of the proof are in Appendix B. Lemma 6 bounds the query complexity of k-\u03b4ER. When\n$R \\ge \\log_{\\frac{k}{\\delta}}^*(n)$, our algorithm can achieve the optimal query complexity using just $\\log_{\\frac{k}{\\delta}}^*(n)$ rounds.\n\nLemma 6. If $Q \\ge \\frac{57}{\\epsilon^2}$, with target number of rounds $1 \\le R \\le \\log_{\\frac{k}{\\delta}}^*(n)$, Algorithm 2 uses\n$O(\\frac{n}{\\epsilon^2} (\\mathrm{ilog}_{\\frac{k}{\\delta}}^{(R)}(n) + \\log(k/\\delta)))$ samples.\n\nCombining Lemma 5 and 6, Theorem 2 follows.\n\n3 Exact Top-k Arm Identi\ufb01cation\n\nHere we solve exact-top-k to identify the exact top-k arms. Our algorithm uses the Exponential-Gap-Elimination\nalgorithm (e.g., [13, 14, 8]) as a framework, and uses Algorithm 1 as a component.\nSpeci\ufb01cally, we replace the Median Elimination Algorithm used in [13, 14, 8] by Algorithm 1 and\nthen prove that the new algorithm satis\ufb01es Theorem 3. Here, we use [13] as an example. The algorithm in [13]\nhas three subroutines, called PAC-Best-k, EstMean-Large, and EstMean-Small. We replace all of these\nsubroutines by Algorithm 1, to get our algorithm for exact-top-k, denoted as Algorithm 4. In [13],\nit is proved that the query complexity is no worse than $O(\\sum_{i=1}^n \\Delta_i^{-2} \\cdot \\log \\frac{k \\cdot \\log \\Delta_i^{-1}}{\\delta})$. Combining\nLemma 7 and 8, we get Theorem 3. The proofs are in Appendix B.\n\nLemma 7 ([13], Theorem 1.2). Algorithm 4 returns the correct answer with probability at least 1 \u2212 \u03b4\nand takes $O(\\sum_{i=1}^n \\Delta_i^{-2} \\cdot \\log \\frac{k \\cdot \\log \\Delta_i^{-1}}{\\delta})$ samples.\n\nLemma 8. Algorithm 4 runs in $O(\\log_{\\frac{k}{\\delta}}^*(n) \\cdot \\log \\Delta_k^{-1})$ rounds.\n\nFigure 1: Query cost of PAC best arm selection. Panels: (a) Uniform Dataset, (b) Normal Dataset, (c) Segment Dataset. Vertical axis: Cost; horizontal axis ticks: 0.01 to 0.09; curves: \u03b4E, ME, \u03b4ER.\n\nFigure 2: Query cost of PAC top-k arm selection. Panels: (a) Uniform Dataset, (b) Normal Dataset, (c) Segment Dataset. Vertical axis: Cost; horizontal axis ticks: 0.01 to 0.09; curves: k-\u03b4E, ME-AS, k-\u03b4ER.\n\n4 Experiments\n\n4.1 Experimental Results for PAC Top-k Identi\ufb01cation\n\nFor PAC top-k arms, we compare k-\u03b4E and k-\u03b4ER with the median elimination method ME-AS [16].\nWhen k = 1, we denote k-\u03b4E and k-\u03b4ER as \u03b4E and \u03b4ER respectively, and compare them with the best\narm algorithm ME [5]. We do not experimentally compare to [4] since there is no prior knowledge\nof $\\Delta_k$ in this paper. Note that ME-AS is designed for relative error. To make a fair comparison,\ngiven the absolute error bound \u03b5, we transform it to $\\epsilon/\\theta_1$, where $\\theta_1$ is the largest mean in the given\nbandit. $\\epsilon/\\theta_1$ is used as the equivalent relative error bound in ME-AS. As proved in Lemma 1, without\ncompromising correctness, we can adjust the elimination indicator in k-\u03b4E (Line 10 in Algorithm 1).\nWe change $\\frac{3}{4}\\epsilon$ to $\\frac{1}{2}\\epsilon$ and set Q to be $\\frac{8}{\\epsilon^2}$ in our implementation, to gain even better performance.\nWithout loss of generality, we test our algorithms and competitors on arms following independent\nBernoulli distributions with various means. We set the number of total arms to be n = 2000. We test\nthe methods on three synthetic datasets, as follows:\n\n\u2022 Uniform: $\\theta_i \\sim \\mathrm{Unif}[0, 1]$. The means of the arms, $\\theta_i$, are uniformly distributed in [0, 1].\n\u2022 Normal: $\\theta_i \\sim TN(0.5, 0.2)$. 
Each $\\theta_i$ is generated from a truncated normal distribution with\nmean 0.5, standard deviation 0.2 and support [0, 1].\n\u2022 Segment: $\\theta_i = 0.5$ for i = 1, \u22ef, k and $\\theta_i = 0.4$ for i = k + 1, \u22ef, n.\n\nDefault parameter values are set as: \u03b4 = 0.1, and R = 2. For each setting, the results are averaged over\n100 repeated runs. As shown later, ME-AS can be very costly and takes too long to obtain its\naverage performance over 100 runs, so we terminate it when time is up and report the average\nobtained. We vary \u03b5 from 0.01 to 0.1, while keeping other parameters unchanged.\n\nSetting | Algorithm | Uniform | Normal | Segment\nk = 1 | ME | 11 | 11 | 11\nk = 1 | \u03b4E | 2.2 | 3.4 | 3.9\nk = 1 | \u03b4ER | 2 | 2 | 2\nk = 20 | ME-AS | 6 | 6 | 6\nk = 20 | k-\u03b4E | 2.1 | 3.0 | 3.8\nk = 20 | k-\u03b4ER | 2 | 2 | 2\n\nTable 4: Number of rounds performed.\n\nDataset | Algorithm | Rounds | Query Cost\nNormal | EG-\u03b4E | 21 | 1.4 \u00d7 10^8\nNormal | [8] | 36 | 6.7 \u00d7 10^9\nNormal | [17] | 0.9 \u00d7 10^8 | 0.9 \u00d7 10^8\nUniform | EG-\u03b4E | 27 | 2.8 \u00d7 10^9\nUniform | [8] | 59 | 1.2 \u00d7 10^11\nUniform | [17] | 2.4 \u00d7 10^9 | 2.4 \u00d7 10^9\nSegment | EG-\u03b4E | 6 | 5.6 \u00d7 10^7\nSegment | [8] | 24 | 1.3 \u00d7 10^10\nSegment | [17] | 2.2 \u00d7 10^8 | 2.2 \u00d7 10^8\n\nTable 5: Exact top-k arms: rounds and query cost.\n\nFor the best arm selection (k = 1), Figure 1 reports the query cost (i.e., total number of pulls) for\n\u03b4E, \u03b4ER, and ME on the three datasets. Both \u03b4E and \u03b4ER outperform ME signi\ufb01cantly for all the \u03b5\nvalues on the three datasets. \u03b4E is about 100 times faster than ME, while \u03b4ER is about 10 times faster\nthan ME. \u03b4E is faster than \u03b4ER since \u03b4ER has a hard round limit R = 2, while \u03b4E does not have such a\nconstraint. The \ufb01rst row of Table 4 shows the number of rounds actually used by each method. 
\u03b4ER strictly uses only two rounds, as limited by R, and \u03b4E needs slightly more rounds, while ME requires 11 rounds, several times more. For the top-k arm selection (k = 20), Figure 2 reports the query cost for k-\u03b4E, k-\u03b4ER, and ME-AS when varying \u03b5. k-\u03b4ER is about 100 times faster than ME-AS, and k-\u03b4E is about 1000 times faster. The second row of Table 4 shows the number of rounds used per method. Our methods use far fewer rounds. In summary, our k-\u03b4E and k-\u03b4ER outperform ME and ME-AS by a large margin.\n\n4.2 Experimental Results for Exact Top-k Arm Identification\n\nWe evaluate our algorithm for exact top-k arm identification. Our algorithm uses [8] as its framework, since [14, 13] focus only on theory and have large constants. We call our algorithm EG-\u03b4E (Exponential-Gap + \u03b4E), and compare it with the elimination-based [8] and UCB-based [17] algorithms. The default parameter is \u03b4 = 0.1. We set the parameters of [17] following their experimental setting. Other experimental settings are the same as for the PAC top-k problem.\n\nTable 5 reports the query cost and number of rounds for the different methods. Compared with [8], EG-\u03b4E uses fewer rounds and is up to 250 times faster with respect to query cost. Compared with [17], EG-\u03b4E uses significantly fewer rounds while keeping the query cost on the same order.\n\n5 Related Work\n\nInstance-independent arm selection. Top-k arm selection was first studied in the instance-independent setting. All such existing works [5, 18, 6, 16, 3, 14] are designed for worst-case query complexity, and need \u0398(log(n/k)) rounds, which is inferior to ours. Median Elimination [5] finds the best arm (when k = 1) with query complexity $O\\left(\\frac{n}{\\epsilon^2} \\log \\frac{1}{\\delta}\\right)$ under the PAC bound, matching the lower bound in [18]. 
Our top-k algorithms can easily handle best arm selection by simply setting k = 1. We use the same top-k arm definition as [16], which requires that, with high probability, the selected ith-top arm has mean greater than \u03b8i\u2217 \u2212 \u03b5, for all i \u2208 [1, k], where \u03b8i\u2217 is the ith largest mean in the whole bandit. The Explore-k metric is studied in [6]: with high probability, all the k selected arms have mean greater than \u03b8k\u2217 \u2212 \u03b5, where \u03b8k\u2217 is the kth largest mean in the whole bandit. Our metric is tighter than the Explore-k metric, and thus our algorithms also apply to the Explore-k problem. Another metric was considered in [3], where the identified k arms can have at most k\u03b5 regret in total. [14] studies the multi-armed bandit problem under matroid constraints. All these works are elimination based.\n\nInstance-dependent arm selection. The query complexity of instance-optimal algorithms (e.g., [4, 7, 13, 15, 8, 10, 9]) is closely tied to the bandit instance and is better than the worst-case complexity for \u2018easy\u2019 bandit instances. Some of them [7, 13, 15, 8] are elimination-based, and use instance-independent algorithms like [6] and [5] as a sub-procedure to eliminate arms. Due to the use of instance-independent algorithms, in the worst case, each iteration of these instance-dependent algorithms needs log n rounds. Thus the total round complexity is $O(\\log n \\cdot \\log \\Delta_k^{-1})$.\n\nAnother instance-dependent approach is based on upper or lower confidence bounds (UCB or LUCB), e.g., [10, 9]. With respect to query complexity, UCB methods require a log n factor, while the lower bound has only a log k factor. For round complexity, UCB methods need a huge number of rounds, since their round complexity is proportional to their query complexity due to their fully adaptive nature.\n\nVariant settings on limited rounds. 
Under the delayed feedback setting, the reward of pulling an arm in round \u03c4 is delayed and only shown in a later round \u03c4 + t [19, 20]. Our methods can simulate this setting when taking an appropriately high value of t. Most of the existing works focus on regret minimization rather than top-k arms. Some works [11, 21, 22] investigate the batched arm problem. [11] only considers regret minimization. [21] only allows an arm to be pulled once per round; the number of rounds required is \u2126(log n). In [22], within a round, there are limits on both the total number of pulls and the number of pulls per arm. Its number of rounds in the worst case is at least \u2126(log n).\n\n6 Conclusion\n\nWe study the problems of top-k arm selection in the adaptive round model, and propose algorithms that achieve near-optimal query complexity and match the lower bound of round complexity. In practice, our algorithms outperform existing methods in terms of query cost and round complexity.\n\nAcknowledgement\n\nThis research was supported by the National Natural Science Foundation of China (No. U1605251) and by the National University of Singapore under SUG grant R-252-000-686-133.\n\nReferences\n\n[1] William R Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25(3/4):285\u2013294, 1933.\n\n[2] Dimitris Bertsimas and Adam J Mersereau. A learning approach for interactive marketing to a customer segment. Operations Research, 55(6):1120\u20131135, 2007.\n\n[3] Yuan Zhou, Xi Chen, and Jian Li. Optimal PAC multiple arm identification with applications to crowdsourcing. In International Conference on Machine Learning, pages 217\u2013225, 2014.\n\n[4] Arpit Agarwal, Shivani Agarwal, Sepehr Assadi, and Sanjeev Khanna. 
Learning with limited rounds of adaptivity: Coin tossing, multi-armed bandits, and ranking from pairwise comparisons. In Conference on Learning Theory, pages 39\u201375, 2017.\n\n[5] Eyal Even-Dar, Shie Mannor, and Yishay Mansour. PAC bounds for multi-armed bandit and Markov decision processes. In International Conference on Computational Learning Theory, pages 255\u2013270. Springer, 2002.\n\n[6] Shivaram Kalyanakrishnan and Peter Stone. Efficient selection of multiple bandit arms: theory and practice. In ICML, pages 511\u2013518, 2010.\n\n[7] Jiecao Chen, Xi Chen, Qin Zhang, and Yuan Zhou. Adaptive multiple-arm identification. In International Conference on Machine Learning, pages 722\u2013730, 2017.\n\n[8] Zohar Karnin, Tomer Koren, and Oren Somekh. Almost optimal exploration in multi-armed bandits. In International Conference on Machine Learning, pages 1238\u20131246, 2013.\n\n[9] Shouyuan Chen, Tian Lin, Irwin King, Michael R Lyu, and Wei Chen. Combinatorial pure exploration of multi-armed bandits. In Advances in Neural Information Processing Systems, pages 379\u2013387, 2014.\n\n[10] Shivaram Kalyanakrishnan, Ambuj Tewari, Peter Auer, and Peter Stone. PAC subset selection in stochastic multi-armed bandits. In ICML, volume 12, pages 655\u2013662, 2012.\n\n[11] Vianney Perchet, Philippe Rigollet, Sylvain Chassang, Erik Snowberg, et al. Batched bandit problems. The Annals of Statistics, 44(2):660\u2013681, 2016.\n\n[12] Eric M Schwartz, Eric T Bradlow, and Peter S Fader. Customer acquisition via display advertising using multi-armed bandit experiments. Marketing Science, 36(4):500\u2013522, 2017.\n\n[13] Lijie Chen, Jian Li, and Mingda Qiao. Nearly instance optimal sample complexity bounds for top-k arm selection. In Artificial Intelligence and Statistics, pages 101\u2013110, 2017.\n\n[14] Lijie Chen, Anupam Gupta, and Jian Li. Pure exploration of multi-armed bandit under matroid constraints. 
In Conference on Learning Theory, pages 647\u2013669, 2016.\n\n[15] Lijie Chen, Jian Li, and Mingda Qiao. Towards instance optimal bounds for best arm identification. In Conference on Learning Theory, pages 535\u2013592, 2017.\n\n[16] Wei Cao, Jian Li, Yufei Tao, and Zhize Li. On top-k selection in multi-armed bandits and hidden bipartite graphs. In Advances in Neural Information Processing Systems, pages 1036\u20131044, 2015.\n\n[17] Kevin Jamieson, Matthew Malloy, Robert Nowak, and S\u00e9bastien Bubeck. lil\u2019ucb: An optimal exploration algorithm for multi-armed bandits. In Conference on Learning Theory, pages 423\u2013439, 2014.\n\n[18] Shie Mannor and John N Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research, 5(Jun):623\u2013648, 2004.\n\n[19] Pooria Joulani, Andras Gyorgy, and Csaba Szepesv\u00e1ri. Online learning under delayed feedback. In International Conference on Machine Learning, pages 1453\u20131461, 2013.\n\n[20] Thomas Desautels, Andreas Krause, and Joel W Burdick. Parallelizing exploration-exploitation tradeoffs in Gaussian process bandit optimization. The Journal of Machine Learning Research, 15(1):3873\u20133923, 2014.\n\n[21] Yifan Wu, Andras Gyorgy, and Csaba Szepesvari. On identifying good options under combinatorially structured feedback in finite noisy environments. In International Conference on Machine Learning, pages 1283\u20131291, 2015.\n\n[22] Kwang-Sung Jun, Kevin Jamieson, Robert Nowak, and Xiaojin Zhu. Top arm identification in multi-armed bandits with batch arm pulls. In Artificial Intelligence and Statistics, pages 139\u2013148, 2016.\n\n[23] Wassily Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13\u201330, 1963.\n\n[24] Lijie Chen and Jian Li. On the optimal sample complexity for best arm identification. 
CoRR, abs/1511.03774, 2015.\n", "award": [], "sourceid": 3579, "authors": [{"given_name": "Tianyuan", "family_name": "Jin", "institution": "University of Science and Technology of China"}, {"given_name": "Jieming", "family_name": "SHI", "institution": "NATIONAL UNIVERSITY OF SINGAPORE"}, {"given_name": "Xiaokui", "family_name": "Xiao", "institution": "National University of Singapore"}, {"given_name": "Enhong", "family_name": "Chen", "institution": "University of Science and Technology of China"}]}