{"title": "Learning Optimal Reserve Price against Non-myopic Bidders", "book": "Advances in Neural Information Processing Systems", "page_first": 2038, "page_last": 2048, "abstract": "We consider the problem of learning optimal reserve price in repeated auctions against non-myopic bidders, who may bid strategically in order to gain in future rounds even if the single-round auctions are truthful. Previous algorithms, e.g., empirical pricing, do not provide non-trivial regret rounds in this setting in general. We introduce algorithms that obtain small regret against non-myopic bidders either when the market is large, i.e., no bidder appears in a constant fraction of the rounds, or when the bidders are impatient, i.e., they discount future utility by some factor mildly bounded away from one. Our approach carefully controls what information is revealed to each bidder, and builds on techniques from differentially private online learning as well as the recent line of works on jointly differentially private algorithms.", "full_text": "Learning Optimal Reserve Price against Non-myopic\n\nBidders\n\nZhiyi Huang\u2217\n\nJinyan Liu\n\nXiangning Wang\n\nDepartment of Computer Science\n\nThe University of Hong Kong\n\n{zhiyi, jyliu, xnwang}@cs.hku.hk\n\nAbstract\n\nWe consider the problem of learning optimal reserve price in repeated auctions\nagainst non-myopic bidders, who may bid strategically in order to gain in future\nrounds even if the single-round auctions are truthful. Previous algorithms, e.g.,\nempirical pricing, do not provide non-trivial regret rounds in this setting in general.\nWe introduce algorithms that obtain a small regret against non-myopic bidders\neither when the market is large, i.e., no single bidder appears in more than a\nsmall constant fraction of the rounds, or when the bidders are impatient, i.e.,\nthey discount future utility by some factor mildly bounded away from one. 
Our approach carefully controls what information is revealed to each bidder, and builds on techniques from differentially private online learning as well as the recent line of work on jointly differentially private algorithms.

1 Introduction

The problem of designing revenue-optimal auctions based on data has drawn much attention in the algorithmic game theory community lately. Various models have been studied, notably the sample complexity model [10, 13, 23, 32, 31, 36, 12, 18, 8] and the online learning model [7]. However, these existing works all implicitly assume that bidders are myopic, in the sense that they will faithfully report their valuations as long as the mechanism used in each round is truthful, without considering how their bids may affect 1) the choices of mechanisms and 2) the behaviors of other bidders in future rounds in which they may also participate. What happens in the presence of non-myopic bidders?

Example. Suppose a seller has a fresh copy of the good for sale every day, where its value for any bidder is bounded between 0 and 1. The seller sets a price at the beginning of each day. Then, a bidder (say, randomly chosen from a large yet finite pool of potential bidders) arrives and submits a bid: If the bid is higher than the price of the day, he gets the item and pays the price. Further, suppose the seller adopts the solution proposed by the sample complexity literature and decides to set the price at 0.5 on day 1, and in each of the following days to use the empirical price, i.e., the best fixed price w.r.t. the bids in previous days.

If bidders are myopic, bidding truthfully is a dominant strategy since the mechanism on each day is effectively posting a take-it-or-leave-it price. As a result, the seller will be able to converge to the optimal fixed price w.r.t. the pool of potential bidders.

If bidders are non-myopic, however, their strategies are more intriguing.
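The dynamics of this example can be illustrated with a short simulation. Everything below is a hypothetical sketch: the value pool, the seed, and the strategies are illustrative, not part of the paper's model. The first run shows myopic truthful bidders letting empirical pricing converge to the optimal fixed price; the second shows a simple non-myopic strategy (never bidding above the posted price) freezing the seller at the initial price.

```python
import random

def empirical_price(past_bids, grid=101):
    """Best fixed price w.r.t. past bids: argmax over p of p * #{bids >= p}."""
    prices = [i / (grid - 1) for i in range(grid)]
    return max(prices, key=lambda p: p * sum(b >= p for b in past_bids))

random.seed(0)
pool = [0.3, 0.6, 0.9]   # hypothetical finite pool of bidder values

# Myopic, truthful bidders: the empirical price converges to the optimal fixed price.
bids = [random.choice(pool) for _ in range(2000)]
print(empirical_price(bids))   # -> 0.6, the optimal fixed price for this pool

# A non-myopic bidder with value 0.9 who never bids above the posted price:
bids = []
for day in range(100):
    price = 0.5 if day == 0 else empirical_price(bids)
    bids.append(price if 0.9 >= price else 0.0)
print(empirical_price(bids))   # -> 0.5: the seller never learns the value 0.9
```

In the second run the seller's revenue is stuck at 0.5 per day rather than the 0.9 a truthful report would support, which is the revenue loss the example warns about.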
A bidder may underbid whenever the empirical price of the day (which is deterministic) is higher than his value, inducing the same outcome of not winning in the current round while leading to lower future prices compared with bidding truthfully. By the same reasoning, even if the bidder wants to win the current copy of the good, he will not bid truthfully; instead, he will submit a bid that equals the current price. Due to the strategic plays of non-myopic bidders, the seller may in fact converge to a price close to 0. The take-away message of this example is that directly applying existing algorithms to scenarios where bidders are non-myopic could be a disaster in terms of revenue. Hence, we ask:

Are there learning algorithms that work well even in the presence of non-myopic agents?

In particular, can we extend the online learning model to allow non-myopic bidders, and design algorithms that provably have small regret against the best fixed auction?

*Supported in part by an RGC grant HKU17257516E.

32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada.

1.1 Our Results and Techniques

Our main contribution is a positive answer to the above question, subject to one of the following two assumptions: Either the bidders are impatient in the sense that they discount utility in future rounds by some discount factor (mildly) bounded away from 1 (impatient bidders), or any bidder comes in only a small fraction of the rounds (large market). The assumptions are relatively natural, and are further necessary: If the same bidder appears every day without discounting future utility, no algorithm can guarantee non-trivial regret (e.g., Theorem 3 of Amin et al. [3]).

Single-bidder. Let us first consider the case in the example, where only a single bidder comes on each day. We show the following:

Informal Theorem 1.
For any α ∈ (0, 1), our online pricing algorithm has regret at most αT against non-myopic bidders when T ≥ Õ(α^(-4)), and either the impatient bidders or the large market assumption holds.²

This is equivalent to a sub-linear Õ(T^(3/4)) regret bound. Here, we omit in the big-O notation a term that depends on either the discount factor or the maximum number of rounds in which a bidder can appear. (It holds when the number of rounds that a bidder can participate in is at most α^4·T, or when the discount factor is at most 1 − 1/(α^4·T). See Theorem 3.1 for the formal statement.) In addition, we omit log factors in the Õ notation.

A typical mechanism design approach would seek to design online learning mechanisms such that truthful bidding forms an equilibrium (or, even better, is a dominant strategy) even if bidders are non-myopic, and to get small regret given truthful bids. However, designing truthful mechanisms that learn in repeated auctions seems beyond the scope of existing techniques.

We take an alternative path by relaxing the incentive property: We aim to ensure that bidders would only submit "reasonable" bids within a small neighborhood of their true values. Note that the notion of regret is robust to small deviations of bids. Applying an online learning algorithm on "reasonable" bids (instead of truthful bids) results in only a small increase in the regret.

To explain how to achieve the relaxed incentive property, first consider why a bidder would lie. Since the single-round auctions are truthful, a bidder can never gain in the current round by lying. Lying is preferable only if the future gain outweighs the current loss (if any). The seller's algorithm in the example, i.e., using empirical prices, suffers on both ends. On one hand, lying has no cost when the bid and the true value are both greater (or both smaller) than the price.
On the other hand, the current bid has huge influence on future prices; in particular, the first bid dictates the price in the second round.

We will design online auctions such that (1) deviating too far from the true value in the current round is always costly, and (2) the influence of the current bid on future prices/utilities is bounded. Achieving the first property turns out to be easy. Note that lying has a cost whenever the price falls between the bid and the true value. On each day, with some small probability our mechanisms pick the price randomly to ensure that the price has a decent probability to fall between the value and any "unreasonable" bid that deviates a lot.

The second property may seem trivial through the following incorrect argument: Online learning algorithms, e.g., multiplicative weight, are intrinsically insensitive to the bid on any single day and, thus, satisfy the second property. The argument incorrectly assumes subsequent bids will remain the same regardless of the current bid, omitting that they are controlled by strategic bidders and, thus, are affected by the current bid through its influence on subsequent prices. We use an implementation of the follow-the-perturbed-leader algorithm based on the tree-aggregation technique [2] from differential privacy. Note that due to the above reasoning, differential privacy does not imply the second property. Nevertheless, we show that the algorithm in fact satisfies a slightly stronger guarantee than differential privacy, which is sufficient for our proof.

²Here, T ≥ Õ(α^(-4)) means that for any α, there exists some function h(α) = Õ(α^(-4)) such that the theorem holds when T ≥ h(α).

Our techniques further allow us to get essentially the same bound even in the bandit setting, i.e., the seller observes whether the bidder buys the copy but not his bid. This is shown in the full version.

Multi-bidder. Our approach can be extended to obtain positive results with n > 1 bidders and m > 1 copies of the good per round, with some extra ingredients that we shall explain shortly.

Informal Theorem 2. For any α ∈ (0, 1), our algorithm runs an approximate version of Vickrey with an anonymous reserve price on each day with regret at most αmT against the best fixed reserve price if (1) T ≥ Õ(n/(m·α^4.5)), (2) m ≥ Õ(√n/α^3), and (3) either the impatient bidders or the large market assumption holds.

Simply running a follow-the-perturbed-leader algorithm with tree-aggregation as in the single-bidder case does not work in the multi-bidder setting because a bidder's current bid can now affect other bidders' subsequent bids through the allocations and payments in the current round. We need m > 1 in the multi-bidder setting due to the use of joint differential privacy to control the influence among bidders. It is known that this is necessary for jointly differentially private algorithms to get non-trivial approximation (e.g., [22]).

To make our approach work in the multi-bidder setting, we need two more ingredients. First, we need to refine our model to control what information is revealed to the bidders in the single-round auctions. At the end of each round, each bidder can observe his own outcome, i.e., whether he wins a copy of the good and his payment. At any round, a bidder's (randomized) bid can depend on his own bids and outcomes in previous rounds, but not those of the other bidders.
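For reference, the benchmark auction in Informal Theorem 2, an exact Vickrey auction with m identical copies and an anonymous reserve, follows a standard rule: the top-m bidders who clear the price win, where the price is the larger of the reserve and the (m+1)-th highest bid. A minimal sketch with an illustrative bid profile (this is the textbook exact auction, not the paper's jointly private approximate variant):

```python
def vickrey_with_reserve(bids, m, r):
    """m identical copies, unit demand: top-m bidders clearing the price win;
    the anonymous price is max(reserve, (m+1)-th highest bid)."""
    order = sorted(range(len(bids)), key=lambda i: -bids[i])
    threshold = bids[order[m]] if len(bids) > m else 0.0
    price = max(r, threshold)
    winners = [i for i in order[:m] if bids[i] >= price]
    return winners, price

winners, price = vickrey_with_reserve([0.9, 0.4, 0.8, 0.2], m=2, r=0.5)
# winners == [0, 2]; each pays max(0.5, 0.4) == 0.5, so the revenue is 1.0
```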
The information structure plays a crucial role in the argument about bidders' incentives.

Then, it boils down to bounding the influence of a bidder's bid on the other bidders' outcomes in the same round. This is exactly the main feature of joint differential privacy. After choosing the reserve price in each round, we run an approximate Vickrey with reserve as follows. First, run the jointly private algorithm of [20] to get a set of roughly m candidate bidders and an approximate Vickrey price. Then, for each candidate bidder, offer a take-it-or-leave-it price that equals the maximum of the chosen reserve price and the approximate Vickrey price. The joint privacy of the single-round auctions together with the previous argument on the learning process bounds how much a bidder's current bid can affect his future utility. Finally, the approximation guarantees of the jointly private algorithm ensure that the revenue loss is bounded compared with running Vickrey with the same reserve.

1.2 Related Work

There is a vast literature on revenue-optimal auction design. We discuss only the most related single-parameter setting. Myerson [33] showed that optimal auctions are (ironed) virtual surplus maximizers. If the bidders' value distributions are i.i.d. and regular, the optimal auction is a Vickrey auction with a reserve price that equals the monopoly price of the distribution. Further, even if the distributions are not i.i.d., a Vickrey auction with a suitable reserve still gets a constant approximation [19].

Cole and Roughgarden [10] studied the sample complexity of optimal auctions, and showed upper and lower bounds polynomial in (the inverse of) the error term α and the number of bidders n. Bubeck et al. [7] revisited the problem in an online-learning model and introduced algorithms that simultaneously achieve near optimal regret against arbitrary bidder values, improving previous results of Blum et al.
[6], Blum and Hartline [5], and Kleinberg and Leighton [27], and near optimal sample complexity if values are drawn from an underlying distribution. These works implicitly assume myopic bidders, so either previous bids are truthful when previous auctions are truthful, or an approximation of the prior distribution can be estimated from the bids in non-truthful previous auctions [34]. This paper takes a more proactive approach of investigating how to design the learning process together with the auction to extract meaningful information even if bidders are non-myopic.

Our results build on two lines of work in differential privacy, namely, differentially private online learning algorithms, and jointly differentially private algorithms. Agarwal and Singh [2] introduced an (ε, δ)-differentially private algorithm with regret Õ(√T + √K/ε) for the full information setting (a.k.a. the expert problem), and regret Õ(√(TK)/ε) for the bandit setting. Here, T is the number of rounds and K is the number of experts/arms. Independently, the same result for the bandit setting was obtained in [37]. Joint differential privacy [26] is a relaxation of differential privacy [15, 14]; it can be applied to many combinatorial problems for which no differentially private algorithm gets non-trivial approximation. Hsu et al. [20] introduced the billboard lemma as the cornerstone of joint differential privacy.

Previous Models of Repeated Auctions with Non-myopic Bidders. Previous works [11, 24] studied the equilibria of repeated sales when the seller cannot commit to a pricing strategy and, thus, must play according to a perfect Bayesian equilibrium. In contrast, we adopt the standard assumption that the seller, as a mechanism designer, can commit to a strategy upfront.
Further, their models assume that the same bidder comes every day with value drawn from a prior distribution, while our model assumes no prior, allows different bidders on different days, and allows the same bidder to have distinct values on different days. Amin et al. [4, 3] considered a stochastic version with the same bidder coming every day, and proposed algorithms with sub-linear regret; there are better bounds in the special case where the bidder has the same value on different days [30]. We stress that our model is more general as it assumes no prior and allows different bidders on different days, which brings a lot of new challenges. Mirrokni et al. [29] also considered repeated auctions, with a different model and a different objective compared with ours. They assume prior distributions while we do not, and they focus on designing incentive compatible mechanisms while we have a fundamentally different philosophy of achieving non-trivial learning in the equilibrium, which, to our knowledge, has not been considered in the dynamic mechanism design literature before.

Previous Applications of Differential Privacy in Mechanism Design. Although differential privacy has been applied to mechanism design before, its role in previous work (e.g., [28, 17, 21, 20, 22]) is fundamentally different from that in ours. First, most previous work achieved approximate incentive compatibility, so that misreporting cannot increase a bidder's utility by more than an ε amount. Further, such mechanisms can be coupled with a strictly truthful mechanism to achieve exact incentive compatibility in some specific problems [35]. In contrast, our work uses techniques from differential privacy (rather than the concept itself) to control the influence of a bid in any single round on future utility. Then, we can bound the deviation of a bidder from his true value in equilibria (instead of the amount of incentive to deviate).
Hence, our approach is conceptually different.

Second, previous work generally used differential privacy to design one-shot mechanisms, while our work considers repeated auctions. Characterizing bidders' behaviors is notoriously hard: a single bidder's deviation in a single round may have the cascading effect of changing the bids of all bidders in subsequent rounds. To this end, we propose to use joint differential privacy as a means to control information dissemination and, consequently, bidders' behaviors in future rounds in repeated auctions. This is a novel application of joint differential privacy to our knowledge.

In concurrent and independent work, Epasto et al. [17] considered incentive-aware learning and used differential privacy to control the amount of a bidder's deviation from his true value using an approach similar to ours. However, their work focused on a one-shot interaction environment while ours considers repeated auctions. The results are therefore incomparable.

2 Preliminary

2.1 Single-bidder Model

Let there be a seller who has a fresh copy of the good for sale every day for a total of T days. Exactly one bidder comes on each day, but the same bidder may show up on multiple days. We assume that a bidder can come on at most τ days for some τ ≤ T. Consider the following interactions between the seller and the bidders. On each day t ∈ [T]:

1. The seller sets a price pt ∈ [0, 1] as a (randomized) function of the previous bids b1, . . . , bt−1.
2. A bidder arrives with value vt ∈ [0, 1] and submits a bid bt ∈ [0, 1] as a (randomized) function of his value vt, and his bids and auction outcomes in the previous rounds in which he participated.
3. The seller observes the bid bt but not the value vt.
4.
The bidder receives the good and pays pt if bt ≥ pt; nothing happens otherwise.

Here, it is crucial to assume that a bidder does not observe the bids and auction outcomes of the rounds in which he does not participate.

Rational Bidders. A bidder's utility in a single round is quasi-linear, namely, vt − pt if he gets the good and 0 otherwise. For some discount factor γ ∈ [0, 1], a bidder discounts future utility by at least γ and seeks to maximize the sum of discounted utilities. For example, suppose a bidder comes on days t1, t2, and t3. When the bidder considers his strategy on day t1, he would sum up his utilities from all three days, discounting future utility on day t2 by at least γ, and that on day t3 by at least γ^2. If γ = 0, it becomes the model with myopic bidders. If γ = 1, bidders simply seek to maximize the sum of their utilities. Note that we do not assume that the values of the same bidder must be the same on different days (although they could be).

Seller. As a mechanism designer, the seller can commit to a mechanism, i.e., fixing the (randomized) pricing functions p1, . . . , pT upfront. Hence, we shall interpret the T-round interactions as a game among the bidders, with the seller designing (part of) the rules. The seller aims to maximize revenue, i.e., the sum of the prices paid by the bidders over all T rounds, denoted as ALG.

We adopt the standard regret analysis of online learning and compare ALG with the revenue of the optimal fixed price in hindsight, namely, max_{p∈[0,1]} p · Σ_{t∈[T]} 1[vt ≥ p].³ We denote by OPT({vt}t) the revenue of the best fixed price w.r.t. a given sequence of values {vt}t. The regret of the algorithm is therefore OPT({vt}t) − ALG. We will further split the regret into two parts as follows in our analysis:

OPT({vt}t) − ALG = [OPT({vt}t) − OPT({bt}t)] + [OPT({bt}t) − ALG],

where the first term is the game-theoretic regret and the second term is the learning regret.

Assumptions. We consider instances that satisfy one of the following two assumptions. The hardness result by Amin et al. [3] implies that no non-trivial regret is possible if neither holds.

• Large market: No bidder participates in a significant portion of the rounds, i.e., τ = o(T).
• Impatient bidders: γ is (mildly) bounded away from 1, i.e., 1/(1−γ) = o(T).

2.2 Multi-bidder Model

The model extends straightforwardly to the multi-bidder setting. We sketch the model below and highlight a few key assumptions. The seller has m fresh copies of the good for sale every day for a total of T days. n buyers come on each day, and a bidder can show up on at most τ ≤ T days. The large-market and impatient-bidders assumptions also apply. Again, a bidder cannot observe the auction outcomes of the rounds in which he does not participate. Further, a bidder cannot observe the auction outcomes of the other bidders, i.e., who gets a copy of the good and how much they pay, even if he participates in that round. Both assumptions on the information structure are crucial for our incentive argument.

Bidders are rational and seek to maximize the sum of their (discounted) utilities. The seller aims to maximize the total revenue of all T rounds, denoted as ALG. The benchmark, however, is not the revenue of the best fixed arbitrary auction. Instead, we will compare with the revenue of the best fixed auction within a certain family.
In this paper, we consider the family of Vickrey auctions with an anonymous reserve price, and denote the corresponding benchmark by OPT({vt}t).

In online learning, the algorithm usually has the same strategy space as the offline benchmark. Our model, however, allows the seller to use auctions outside the family of benchmark auctions. We stress that our algorithm uses this flexibility only to implement approximate versions of Vickrey auctions with reserves. Hence, the benchmark is still meaningful.

One can ask the same learning question about other families of auctions, e.g., learning the best anonymous Myerson-type auctions as in Roughgarden and Schrijvers [36] or the best Myerson-type auctions as in Devanur et al. [12]. Extending the techniques in this paper to handle these more complicated auction formats against non-myopic bidders is another interesting future direction.

2.3 Differential Privacy Preliminaries

Our techniques rely on the notion of differential privacy by Dwork et al. [15, 14], and its relaxation called joint differential privacy by Kearns et al. [26]. We include the formal definitions as follows.

³Note that a bidder's best strategy against a fixed price is truthful bidding.

Definition 2.1 (Differential Privacy [15, 14]). An algorithm A : C^n → R is (ε, δ)-differentially private if for all S ⊆ R and for all neighboring datasets D, D′ ∈ C^n that differ in one entry, we have Pr[A(D) ∈ S] ≤ e^ε · Pr[A(D′) ∈ S] + δ.

Definition 2.2 (Joint Differential Privacy [26]). An algorithm A : C^n → R^n is (ε, δ)-jointly differentially private if for any i, any D, D′ ∈ C^n that differ only in the i-th entry, and any S ⊆ R^(n−1), we have Pr[A(D)−i ∈ S] ≤ e^ε · Pr[A(D′)−i ∈ S] + δ.

3 Single-bidder Case: An Overview

Following the treatment of Bubeck et al.
[7], we restrict our attention to prices that are multiples of α and treat each such price as an expert in an online learning problem. Let K = 1/α + 1 denote the number of discretized prices. Consider an expert problem with K experts, with the i-th expert corresponding to price (i − 1)α. We will assume without loss of generality that bids fall into the discretized price set. The bid on day t, bt, induces a gain vector gt such that the gain of the i-th expert is (i − 1)α if bt ≥ (i − 1)α and 0 otherwise. That is, gt = (0, α, 2α, . . . , ⌊bt/α⌋α, 0, . . . , 0). Further denote the cumulative gain vector up to time t as Gt = Σ_{j∈[t]} gj.

Theorem 3.1. For any α ∈ (0, 1), there is an online algorithm with regret O(αT) when T ≥ Õ(τ·α^(-4)) under the large market assumption, or T ≥ Õ(α^(-4)/(1−γ)) under the impatient bidders assumption.

Here, we first present a simplified algorithm that gets the regret bound under the stronger assumptions of T ≥ Õ(τ·α^(-4.5)) and T ≥ Õ(α^(-4.5)/(1−γ)) for intuition. We will then prove the better bounds with a more complex algorithm (based on similar key ideas) in the full version.

3.1 (Simplified) Algorithm

Tree-aggregation. The simplified algorithm is a privacy-preserving version of the follow-the-perturbed-leader algorithm based on the tree-aggregation technique [16, 9]. Since our analysis needs to make use of the structure of the algorithm, it is worthwhile to devote a few paragraphs to formally defining the tree-aggregation subroutine.

Suppose we have T elements (the experts' gains) and need to calculate the cumulative sum of the elements from 1 to t for any t ∈ [T] in a differentially private manner.
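The elements to be aggregated are the per-day gain vectors defined above. A minimal sketch (the discretization α = 0.25 and the bid are illustrative):

```python
def gain_vector(bid, alpha):
    """g_t = (0, alpha, 2*alpha, ..., floor(b_t/alpha)*alpha, 0, ..., 0):
    expert i, who posts price (i - 1)*alpha, gains that price iff the bid clears it."""
    K = int(round(1 / alpha)) + 1
    return [(i - 1) * alpha if bid >= (i - 1) * alpha else 0.0
            for i in range(1, K + 1)]

g = gain_vector(0.5, 0.25)   # experts at prices 0, 0.25, 0.5, 0.75, 1.0
# -> [0.0, 0.25, 0.5, 0.0, 0.0]; the cumulative gains G_t just sum these per day
```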
The naïve approach simply calculates the cumulative sums and adds, say, Gaussian noise to each of them. Since an element may appear in all T cumulative sums, the noise scale is Õ(√T/ε) by a standard argument. Instead, the tree-aggregation technique calculates T partial sums such that (1) each element appears in at most log T partial sums, and (2) each cumulative sum is the sum of at most log T partial sums. This technique significantly reduces the noise scale to Õ(1/ε).

Next, we explain how to design the partial sums. Consider any t ∈ [T] with binary representation (t_{log T} . . . t_1 t_0)_2, i.e., t = Σ_{j=0}^{log T} t_j · 2^j. Let j_t be the lowest non-zero bit. The t-th partial sum is over

Λ_t = {t − 2^{j_t} + 1, t − 2^{j_t} + 2, . . . , t − 1, t}.

To compute the sum of the first t elements, it suffices to sum up the partial sums indexed by the set obtained by removing the non-zero bits of t one by one from lowest to highest:

Γ_t = {t′ ≠ 0 : t′ = t − Σ_{j=0}^{h−1} t_j · 2^j, h = 0, 1, . . . , log T}.

Then, we have [t] = ∪_{j∈Γ_t} Λ_j. For example, suppose t = 14 = 1 · 2^1 + 1 · 2^2 + 1 · 2^3. Then, we have Λ_t = {13, 14} and Γ_t = {14, 12, 8}. The tree-aggregation subroutine is given in Algorithm 1.

The usual description of tree-aggregation (e.g., [2]) computes the partial sum A_t in one shot at step t, while ours treats the A_t's as internal states that are maintained throughout the algorithm. Both descriptions result in the same algorithm, but ours is more convenient for our proof.

Lemma 3.2 (Jain et al. [25]).
The final values of the internal states At's are (ε, δ = ε/T)-differentially private with σ = (8√K·log T/ε) · √(ln(log T/δ)).

Algorithm 1 Tree-aggregation
1: input: dimension K, gain vector gt ∈ [0, 1]^K of each round t, noise scale σ
2: internal states: noisy partial sums At for t ∈ [T]
3: initialize: At = μt for all t ∈ [T], with the μtj's i.i.d. from the normal distribution N(0, σ^2).
4: for t = 1, 2, . . . , T do
5:   Receive gt as input.
6:   Let Aj = Aj + gt for all j s.t. t ∈ Λj.
7:   Output G̃t = Σ_{j∈Γt} Aj + νt, with the νtj's i.i.d. from N(0, (log T + 1 − |Γt|)·σ^2).
8: end for

We need a slightly stronger version that is a simple corollary.

Lemma 3.3. Fix any t0 ∈ [T]; the values of the internal states At's at the end of round t0 are (ε, δ = ε/T)-differentially private w.r.t. the bids on or before day t0, with σ = (8√K·log T/ε) · √(ln(log T/δ)).

Proof. The values of the At's at the end of round t0 are effectively the final values if all subsequent gain vectors are zero. Hence, the lemma follows as a corollary of Lemma 3.2.

Online Pricing Algorithm. Our algorithm (Algorithm 2) is a variant of the privacy-preserving online learning algorithm of [2]. It uses tree-aggregation as a subroutine for maintaining a noisy version of the cumulative gains of each price. On each day t, with some small probability it picks the price randomly; otherwise, it picks the price with the largest noisy cumulative gain in previous days, i.e., the largest entry of G̃t−1.
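The index sets Λt, Γt and Algorithm 1 can be sketched as follows, for scalar gains and with illustrative helper names. The step-7 top-up noise is included so every output carries total noise variance (log T + 1)·σ²; this is an illustrative reimplementation, not the authors' code.

```python
import math
import random

def lam(t):
    """Λ_t = {t - 2^{j_t} + 1, ..., t}, where j_t is t's lowest non-zero bit."""
    low = t & -t                       # 2^{j_t}
    return set(range(t - low + 1, t + 1))

def gamma(t):
    """Γ_t: drop the non-zero bits of t one by one, from lowest to highest."""
    out, cur = [], t
    while cur != 0:
        out.append(cur)
        cur -= cur & -cur
    return out

# Sanity checks matching the text: t = 14 gives Λ_t = {13, 14}, Γ_t = {14, 12, 8},
# and [t] is covered by the union of Λ_j over j in Γ_t.
assert lam(14) == {13, 14} and gamma(14) == [14, 12, 8]
assert set().union(*(lam(j) for j in gamma(14))) == set(range(1, 15))

def tree_aggregate(gains, sigma):
    """Algorithm 1 for scalar gains: A_j accumulates the partial sum over Λ_j
    plus N(0, sigma^2) noise; each output sums the A_j for j in Γ_t, topped up
    so that its total noise variance is (log T + 1) * sigma^2."""
    T = len(gains)
    log_T = max(1, T - 1).bit_length()
    A = [random.gauss(0.0, sigma) for _ in range(T + 1)]   # A[1..T] used
    out = []
    for t in range(1, T + 1):
        for j in range(t, T + 1):
            if t in lam(j):            # add g_t to every partial sum covering t
                A[j] += gains[t - 1]
        extra = math.sqrt(max(0, log_T + 1 - len(gamma(t)))) * sigma
        out.append(sum(A[j] for j in gamma(t)) + random.gauss(0.0, extra))
    return out

# With sigma = 0 the outputs are the exact prefix sums.
assert tree_aggregate([1.0] * 8, 0.0) == [float(i) for i in range(1, 9)]
```

Each gain touches at most log T of the A_j's and each output reads at most log T of them, which is exactly the property the noise-scale argument above relies on.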
In addition, due to step 7 in Algorithm 1, we obtain that G̃tj − Gtj follows N(0, (log T + 1)·σ^2) for any t ∈ [T] and any j ∈ [K].

Algorithm 2 Online Pricing (Single-bidder Case)
1: parameters: regret parameter α, K = 1/α + 1, privacy parameters ε and δ = ε/T.
2: initialize tree-aggregation (Alg. 1) with σ = (8√K·log T/ε) · √(ln(log T/δ)).
3: for t = 1, . . . , T do
4:   With probability α, pick j ∈ [K] uniformly at random.
5:   Otherwise, pick j that maximizes G̃(t−1)j.
6:   Set price (j − 1)α.
7:   Observe bid bt and, thus, the gain vector gt; update tree-aggregation.
8: end for

3.2 Bounding Learning Regret

The following lemma follows from Theorem 8 of [1].

Lemma 3.4. Consider running Algorithm 2 without step 4. Then, the learning regret w.r.t. the best fixed discretized price is at most O(√(log K)·(σ√(log T) + √(T log T))).

Corollary 3.5. The learning regret of Algorithm 2 is O(√(log K)·(σ√(log T) + √(T log T)) + αT).

Proof. Running Algorithm 2 with step 4 increases the regret by at most αT. Further note that the regret w.r.t. the best fixed discretized price differs from the actual regret by at most αT.

3.3 Bounding Game-theoretic Regret: Stability of Future Utility

We prove the bound by showing that no bidder deviates far, i.e., |bt − vt| ≤ 2α, as stated in Lemma 3.7 (proof in the full version). This is because the cost of lying in the current round is bounded from below due to step 4, while the extra utility in the future is bounded from above by Lemma 3.6.

Lemma 3.6 (Stability of Future Utility).
For any bidder and any day t on which he comes, the bidder's equilibrium utilities in subsequent rounds in the subgames induced by different bids on day t differ by at most an e^ε multiplicative factor plus a δT additive factor.

Readers may think of perfect Bayesian equilibrium as a concrete solution concept with which to understand the lemma. However, note that the problem is defined without Bayesian priors, and the lemma holds for a much more general, yet non-standard, solution concept. Our proof shows that no matter what a bidder's belief is about the other bidders' values and strategies, assuming she always plays best-response under her belief in subsequent rounds, the future utility differs by at most an e^ε multiplicative factor.

This is the main argument of our approach. First consider a seemingly intuitive yet incorrect proof. Since tree-aggregation is differentially private, the online pricing algorithm is also private, treating the bid on each day as an entry of the dataset. The lemma holds because changing the bid on a day leads to a neighboring dataset and, thus, the probability of any subset of future outcomes does not change much. This is incorrect because subsequent bids are controlled by strategic bidders. Changing the bid on a day does not result in a neighboring dataset in general.⁴

Proof. We shall abuse notation and refer to the bidder that comes on day t as bidder t. Fix any strategy of bidder t for the subsequent days (after round t). That is, fix the (randomized) bidding function on any subsequent day t′ as a function only of his bids and auction outcomes between day t (exclusive) and day t′ (exclusive). Let us consider the resulting utilities for bidder t in the subgames induced by two distinct bids on day t. Note that the other bidders' subsequent strategies will be the same in the two subgames since they cannot observe what happens on day t.
We shall interpret the execution of the online pricing algorithm after round t, i.e., the algorithm together with the bidders' strategies in subsequent rounds, as a post-processing of the internal states of the tree-aggregation algorithm at the end of day t; it is therefore (ε, δ)-differentially private due to Lemma 3.3. Hence, the utilities of any fixed subsequent strategy of bidder t in the two subgames differ by at most an e^ε multiplicative factor plus a δT additive factor. The lemma then follows by the equilibrium condition that bidder t employs the best subsequent strategy in any subgame. In fact, if the bidder on day t were allowed to use two arbitrary strategies in the subsequent subgames depending on his bid on day t, it would be impossible to bound the change in his utility. However, we assume that the bidder is rational, so his strategy in subsequent rounds satisfies the equilibrium conditions. Then, it suffices to show that for any fixed subsequent strategy, the future utility does not change much, because the equilibrium utility is simply a maximum over all possible subsequent strategies.

Lemma 3.7. For all t, we have |bt − vt| ≤ 2α, and hence the game-theoretic regret is bounded by 2αT, for

• α = (4τε)^{1/3} under the assumption of large market; or
• α = (4ε/(1 − γ))^{1/3} under the assumption of impatient bidders.

3.4 Bounding Total Regret

We prove the regret bound under the large market assumption; the case of impatient bidders is almost identical. Putting together Corollary 3.5 and Lemma 3.7, the regret of Algorithm 2 is at most O(√(log K)·(σ√(log T) + (T/σ)·√(log T)) + αT) for α = (4τε)^{1/3}. This means that we shall set ε = Θ(α³/τ) and, thus, σ = Θ̃(√K/ε) = Θ̃(τα^{−3.5}).
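As a quick numeric sanity check of this parameter cascade (the concrete values of τ and α below are illustrative assumptions, not values fixed by the analysis, and polylogarithmic factors hidden in the Θ̃-notation are ignored):

```python
# Illustrative check (assumed values): with eps = alpha^3 / tau and
# K = 1/alpha + 1, the noise scale sqrt(K)/eps is Theta(tau * alpha^(-3.5)).
tau, alpha = 10.0, 0.01
K = 1.0 / alpha + 1.0          # number of discretized prices
eps = alpha ** 3 / tau         # privacy parameter, eps = Theta(alpha^3 / tau)
sigma = K ** 0.5 / eps         # noise scale, up to polylog factors
ratio = sigma / (tau * alpha ** -3.5)
print(ratio)                   # equals sqrt(1 + alpha), tending to 1 as alpha -> 0
```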
So the second term in the above regret bound is negligible compared to αT, and the regret bound becomes Õ(τα^{−3.5}) + O(αT) ≤ O(αT) if T ≥ τα^{−4.5}.

4 Multi-bidder Case: An Overview

We take the same online learning formulation as in the single-bidder case, treating each discretized price that is a multiple of α between 0 and 1 as an expert. Expert j's gain on any day is the revenue of the Vickrey auction with reserve price (j − 1)α w.r.t. the bids on that day. We sketch the main ideas below, and present the proof in the full version.

Theorem 4.1. For any α ∈ (0, 1), our algorithm runs an approximate version of Vickrey with an anonymous reserve price on each day with regret ≤ αmT against the best fixed reserve price if:

1. T ≥ Õ(τn/(mα^{4.5})) and m ≥ Õ(√n/α³), given large market; or
2. T ≥ Õ(τn/((1 − γ)mα^{4.5})) and m ≥ Õ(√n/(√(1 − γ)·α³)), given impatient bidders.

4 Although other bidders do not see what happens on day t, and thus employ the same strategies in subsequent days, the actual bids are also affected by the seller's subsequent prices, which are affected by the bid on day t.

Algorithm. With some small probability we randomly pick a subset of bidders and offer each of them a copy of the good with a random price to ensure lying is costly in the current round.
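In code, one day of this multi-bidder algorithm might look as follows. This is only a sketch under assumed interfaces: `noisy_gains` and `pmatch` are hypothetical stand-ins for the tree-aggregation estimates and for PMatch [20], respectively, not the paper's actual implementation.

```python
import random

def one_day(alpha, K, m, n, noisy_gains, pmatch):
    """Sketch of one day's pricing decision in the multi-bidder case.

    noisy_gains: list of K tree-aggregation estimates, one per discretized price
                 (hypothetical interface).
    pmatch: stand-in for PMatch [20]; returns (S, j2), a set of at most m - E
            bidders and an index j2 such that j2 * alpha approximates the
            Vickrey price (hypothetical interface).
    """
    if random.random() < alpha:
        # Exploration: a random subset of m bidders gets a uniformly random
        # price, which makes lying costly in the current round.
        S = random.sample(range(n), m)
        j = random.randrange(1, K + 1)
    else:
        # Follow-the-perturbed-leader choice from the noisy cumulative gains ...
        j1 = max(range(1, K + 1), key=lambda i: noisy_gains[i - 1])
        # ... combined with the jointly private approximate allocation.
        S, j2 = pmatch()
        j = max(j1 - 1, j2)
    return S, (j - 1) * alpha   # each i in S is offered the price (j - 1) * alpha
```

With exploration suppressed, the offered price is (max{j1 − 1, j2} − 1)·α, mirroring the combination of the perturbed leader and the approximate Vickrey price described above.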
We pick the reserve price on each day using follow-the-perturbed-leader implemented with tree-aggregation. Simply running Vickrey with the chosen reserve price does not guarantee stability of future utility, however, because a bidder's current bid can now affect other bidders' subsequent bids through the allocations and payments in the current round. Instead, we use a private allocation algorithm of [20] to get a set S and a price p that approximate the set of top-m bidders and the Vickrey price.

Algorithm 3 Online Pricing (Multi-bidder Case)
1: input: regret parameter α, K = 1/α + 1, privacy parameter ε, δ = ε/T, E = Õ(1/(α²ε)).
2: initialize tree-aggregation with noise scale σ = (8/ε)·√(K log T · ln(log T/δ)).
3: for t = 1, . . . , T do
4:    With probability α, pick a subset S ⊆ [n] of size m and j ∈ [K] uniformly at random.
5:    Otherwise:
6:       Pick j1 that maximizes G̃(t−1)j from tree-aggregation.
7:       Run PMatch(α, ρ = α, ε) [20] to get a set S of ≤ m − E bidders and a price p = j2·α.
8:       Let j = max{j1 − 1, j2}.
9:    Offer a copy of the good to each i ∈ S at price (j − 1)α.
10:   Observe bid vector bt; update tree-aggregation with the normalized gain vector (1/m)·gt.
11: end for

Stability of Future Utility. It is similar to the single-bidder case. Consider the bidder on some fixed day t and the subgames induced by two distinct bids on day t. Fix any subsequent strategy of bidder t. The subsequent execution of the algorithm can be viewed as a post-processing of the internal states of tree-aggregation together with the other bidders' memberships w.r.t. S and the price p on day t, which are differentially private due to Lemma 3.3 and the privacy property of PMatch in [20].

Regret.
The main extra ingredient beyond the single-bidder case is showing that the revenue of the approximate implementation of Vickrey with a reserve does not deviate much from that of the exact implementation.

Lemma 4.2 (Hsu et al. [20]). The set of bidders S and the price p satisfy:

1. m − 2E ≤ |S| ≤ m − E;
2. all bidders in S have values at least p − α;
3. at most E bidders outside S have values at least p.

Lemma 4.3. For any j∗ ∈ [K], the revenue of running Vickrey with reserve p∗ = (j∗ − 1)α is no more than that of running steps 7–9 in Algorithm 3 with j1 = j∗ plus O(E + αm).

Proof. Suppose p′ = j′α is the (m + 1)-th highest bid. The winners in Vickrey pay max{p∗, p′}. With j1 = j∗, (j1 − 2)α = p∗ − α. Claims 1 and 3 of Lemma 4.2 imply p ≥ p′ and, thus, (j2 − 1)α ≥ p′ − α. Hence, the price offered in step 9 is at least max{p∗, p′} − α. It remains to show that the number of sales by steps 7–9 is less than that of Vickrey by at most O(E). If (j1 − 2)α is offered in step 9, then the number of sales by the algorithm is at least that of Vickrey with reserve p∗ minus E, due to claim 3 of Lemma 4.2. If (j2 − 1)α = p − α is offered in step 9, then the number of sales is at least m − 2E due to claims 1 and 2 of Lemma 4.2; hence, it is less than the number of sales of Vickrey by at most 2E. In both cases, the lemma follows.

References
[1] Jacob Abernethy, Chansoo Lee, Abhinav Sinha, and Ambuj Tewari. Online linear optimization via smoothing. In Conference on Learning Theory, pages 807–823, 2014.
[2] Naman Agarwal and Karan Singh. The price of differential privacy for online learning. In International Conference on Machine Learning, pages 32–40, 2017.
[3] Kareem Amin, Afshin Rostamizadeh, and Umar Syed.
Learning prices for repeated auctions with strategic buyers. In Advances in Neural Information Processing Systems, pages 1169–1177, 2013.
[4] Kareem Amin, Afshin Rostamizadeh, and Umar Syed. Repeated contextual auctions with strategic buyers. In Advances in Neural Information Processing Systems, pages 622–630, 2014.
[5] Avrim Blum and Jason D Hartline. Near-optimal online auctions. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1156–1163. Society for Industrial and Applied Mathematics, 2005.
[6] Avrim Blum, Vijay Kumar, Atri Rudra, and Felix Wu. Online learning in online auctions. Theoretical Computer Science, 324(2-3):137–146, 2004.
[7] Sebastien Bubeck, Nikhil R Devanur, Zhiyi Huang, and Rad Niazadeh. Online auctions and multi-scale online learning. In Proceedings of the 2017 ACM Conference on Economics and Computation, pages 497–514. ACM, 2017.
[8] Yang Cai and Constantinos Daskalakis. Learning multi-item auctions with (or without) samples. In 58th Annual IEEE Symposium on Foundations of Computer Science, 2017.
[9] T-H Hubert Chan, Elaine Shi, and Dawn Song. Private and continual release of statistics. ACM Transactions on Information and System Security (TISSEC), 14(3):26, 2011.
[10] Richard Cole and Tim Roughgarden. The sample complexity of revenue maximization. In Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing, pages 243–252. ACM, 2014.
[11] Nikhil R Devanur, Yuval Peres, and Balasubramanian Sivan. Perfect Bayesian equilibria in repeated sales. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 983–1002. SIAM, 2015.
[12] Nikhil R Devanur, Zhiyi Huang, and Christos-Alexandros Psomas. The sample complexity of auctions with side information. In Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, pages 426–439.
ACM, 2016.
[13] Shaddin Dughmi, Li Han, and Noam Nisan. Sampling and representation complexity of revenue maximization. In International Conference on Web and Internet Economics, pages 277–291. Springer, 2014.
[14] Cynthia Dwork. Differential privacy. In International Colloquium on Automata, Languages, and Programming, pages 1–12. Springer, 2006.
[15] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pages 265–284. Springer, 2006.
[16] Cynthia Dwork, Moni Naor, Toniann Pitassi, and Guy N Rothblum. Differential privacy under continual observation. In Proceedings of the Forty-Second ACM Symposium on Theory of Computing, pages 715–724. ACM, 2010.
[17] Alessandro Epasto, Mohammad Mahdian, Vahab Mirrokni, and Song Zuo. Incentive-aware learning for large markets. In Proceedings of the Web Conference, pages 1369–1378, 2018.
[18] Yannai A Gonczarowski and Noam Nisan. Efficient empirical revenue maximization in single-parameter auction environments. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 856–868. ACM, 2017.
[19] Jason D Hartline and Tim Roughgarden. Simple versus optimal mechanisms. In Proceedings of the 10th ACM Conference on Electronic Commerce, pages 225–234. ACM, 2009.
[20] Justin Hsu, Zhiyi Huang, Aaron Roth, Tim Roughgarden, and Zhiwei Steven Wu. Private matchings and allocations. SIAM Journal on Computing, 45(6):1953–1984, 2016.
[21] Justin Hsu, Zhiyi Huang, Aaron Roth, and Zhiwei Steven Wu. Jointly private convex programming. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, pages 580–599. Society for Industrial and Applied Mathematics, 2016.
[22] Zhiyi Huang and Xue Zhu. Better jointly private packing algorithms via dual multiplicative weight update.
In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 2018.
[23] Zhiyi Huang, Yishay Mansour, and Tim Roughgarden. Making the most of your samples. In Proceedings of the Sixteenth ACM Conference on Economics and Computation, pages 45–60. ACM, 2015.
[24] Nicole Immorlica, Brendan Lucier, Emmanouil Pountourakis, and Samuel Taggart. Repeated sales with multiple strategic buyers. In Proceedings of the 2017 ACM Conference on Economics and Computation, pages 167–168. ACM, 2017.
[25] Prateek Jain, Pravesh Kothari, and Abhradeep Thakurta. Differentially private online learning. In Conference on Learning Theory, pages 24.1–24.34, 2012.
[26] Michael Kearns, Mallesh Pai, Aaron Roth, and Jonathan Ullman. Mechanism design in large games: Incentives and privacy. In Proceedings of the 5th Conference on Innovations in Theoretical Computer Science, pages 403–410. ACM, 2014.
[27] Robert Kleinberg and Tom Leighton. The value of knowing a demand curve: Bounds on regret for online posted-price auctions. In Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, pages 594–605. IEEE, 2003.
[28] Frank McSherry and Kunal Talwar. Mechanism design via differential privacy. In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science, pages 94–103. IEEE, 2007.
[29] Vahab Mirrokni, Renato Paes Leme, Pingzhong Tang, and Song Zuo. Non-clairvoyant dynamic mechanism design. In Proceedings of the 2018 ACM Conference on Economics and Computation, pages 169–169. ACM, 2018.
[30] Mehryar Mohri and Andres Munoz. Optimal regret minimization in posted-price auctions with strategic buyers. In Advances in Neural Information Processing Systems, pages 1871–1879, 2014.
[31] Jamie Morgenstern and Tim Roughgarden.
Learning simple auctions. In Conference on Learning Theory, pages 1298–1318, 2016.
[32] Jamie H Morgenstern and Tim Roughgarden. On the pseudo-dimension of nearly optimal auctions. In Advances in Neural Information Processing Systems, pages 136–144, 2015.
[33] Roger B Myerson. Optimal auction design. Mathematics of Operations Research, 6(1):58–73, 1981.
[34] Denis Nekipelov, Vasilis Syrgkanis, and Eva Tardos. Econometrics for learning agents. In Proceedings of the Sixteenth ACM Conference on Economics and Computation, pages 1–18. ACM, 2015.
[35] Kobbi Nissim, Rann Smorodinsky, and Moshe Tennenholtz. Approximately optimal mechanism design via differential privacy. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pages 203–213. ACM, 2012.
[36] Tim Roughgarden and Okke Schrijvers. Ironing in the dark. In Proceedings of the 2016 ACM Conference on Economics and Computation, pages 1–18. ACM, 2016.
[37] Aristide C. Y. Tossou and Christos Dimitrakakis. Achieving privacy in the adversarial multi-armed bandit. In AAAI, pages 2653–2659, 2017.