{"title": "Bounded-Loss Private Prediction Markets", "book": "Advances in Neural Information Processing Systems", "page_first": 10435, "page_last": 10444, "abstract": "Prior work has investigated variations of prediction markets that preserve participants' (differential) privacy, which formed the basis of useful mechanisms for purchasing data for machine learning objectives.\n Such markets required potentially unlimited financial subsidy, however, making them impractical.\n In this work, we design an adaptively-growing prediction market with a bounded financial subsidy, while achieving privacy, incentives to produce accurate predictions, and precision in the sense that market prices are\n not heavily impacted by the added privacy-preserving noise.\n We briefly discuss how our mechanism can extend to the data-purchasing setting, and its relationship to traditional learning algorithms.", "full_text": "Bounded-Loss Private Prediction Markets\n\nRafael Frongillo\nColorado Boulder\nraf@colorado.edu\n\nBo Waggoner\n\nMicrosoft Research\nbwag@colorado.edu\n\nAbstract\n\nPrior work has investigated variations of prediction markets that preserve par-\nticipants\u2019 (differential) privacy, which formed the basis of useful mechanisms\nfor purchasing data for machine learning objectives. Such markets required po-\ntentially unlimited \ufb01nancial subsidy, however, making them impractical. In this\nwork, we design an adaptively-growing prediction market with a bounded \ufb01nancial\nsubsidy, while achieving privacy, incentives to produce accurate predictions, and\nprecision in the sense that market prices are not heavily impacted by the added\nprivacy-preserving noise. 
We brie\ufb02y discuss how our mechanism can extend to the\ndata-purchasing setting, and its relationship to traditional learning algorithms.\n\n1\n\nIntroduction\n\nIn a prediction market, a platform maintains a prediction (usually a probability distribution or\nan expectation) of a future random variable such as an election outcome. Participants\u2019 trades of\n\ufb01nancial securities tied to this event are translated into updates to the prediction. Prediction markets,\ndesigned to aggregate information from participants, have gained a substantial following in the\nmachine learning literature. One reason is the overlap in goals (predicting future outcomes) as well\nas techniques (convex analysis, Bregman divergences), even at a deep level: the form of market\nupdates in standard automated market makers have been shown to mimic standard online learning or\noptimization algorithms in many settings [2, 3, 8, 9]. Beyond this research-level bridge, recent papers\nhave suggested prediction market mechanisms as a way of crowdsourcing data or algorithms for\nmachine learning, usually by providing incentives for participants to repeatedly update a centralized\nhypothesis or prediction [4, 12].\nOne recently-proposed mechanism to purchase data or hypotheses from participants is that of\nWaggoner, et al. [12], in which participants submit updates to a centralized market maker, either by\ndirectly altering the hypothesis, or in the form of submitted data; both are interpreted as buying or\nselling shares in a market, paying off according to a set of holdout data that is revealed after the close\nof the market. The authors then show how to preserve differential privacy for participants, meaning\nthat the content of any individual update is protected, as well as natural accuracy and incentive\nguarantees.\nOne important drawback of Waggoner, et al. 
[12], however, is the lack of a bounded worst-case loss\nguarantee: as the number of participants grows, the possible \ufb01nancial liability of the mechanism\ngrows without bound. In fact, their mechanism cannot achieve a bounded worst-case loss without\ngiving up privacy guarantees. Subsequently, Cummings, et al. [7] show that all differentially-private\nprediction markets of the form proposed in [12] must suffer from unbounded \ufb01nancial loss in the\nworst case. Intuitively, one could interpret this negative result as saying that the randomness of the\nmechanism, which must be introduced to preserve privacy, also creates arbitrage opportunities for\nparticipants: by betting against the noise, they collectively expect to make an unbounded pro\ufb01t from\nthe market maker. Nevertheless, Cummings, et al. leave open the possibility that mechanisms outside\nthe mold of Waggoner, et al. could achieve both privacy and a bounded worst-case loss.\n\n32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montr\u00e9al, Canada.\n\n\fIn this paper, we give such a mechanism: the \ufb01rst private prediction market framework with a bounded\nworst-case loss. When applied to the crowdsourcing problems stated above, this now allows the\nmechanism designer to maintain a \ufb01xed budget. Our construction and proof proceeds in two steps.\nWe \ufb01rst show that by adding a small transaction fee to the mechanism of [12], one can eliminate\n\ufb01nancial loss due to arbitrage while maintaining the other desirable properties of the market. The\nkey idea is that a carefully-chosen transaction fee can make each trader subsidize (in expectation)\nany arbitrage that may result from the noise preserving her privacy. 
Unless prices already match her\nbeliefs quite closely, however, she still expects to make a pro\ufb01t by paying the fee and participating.\nWe view this as a positive result both conceptually\u2014it shows that arbitrage opportunities are not an\ninsurmountable obstacle to private markets\u2014and technically\u2014the designer budget grows very slowly,\nonly O((log T )2), with the number of participants T .\nNonetheless, this \ufb01rst mechanism is still not completely satisfactory, as the budget is superconstant\nin T , and T must be known in advance. This dif\ufb01culty arises not from arbitrage, but (apparently) a\ndeeper constraint imposed by privacy that forces the market to be large relative to the participants.\nOur second and main result overcomes this \ufb01nal hurdle. We construct a sequence of adaptively-\ngrowing markets that are syntactically similar to the \u201cdoubling trick\u201d in online learning. The key\nidea is that, in the market from our \ufb01rst result, only about (log T )2 of the T participants can be\ninformational traders; after this point, additional participants do not cost the designer any more\nbudget, yet their transaction fees can raise signi\ufb01cant funds. So if the end of a stage is reached, the\nmarket activity has actually generated a surplus which subsidizes the initial portion of the next stage\nof the market.\n\n2 Setting\n\nIn a cost-function based prediction market, there is an observable future outcome Z taking values\nin a set Z. The goal is to predict the expectation of a random variable \u03c6 : Z \u2192 Rd. We assume \u03c6\nis a bounded random variable, as otherwise prediction markets with bounded \ufb01nancial loss are not\npossible. Participants will buy from the market contracts, each parameterized by a vector r \u2208 Rd. The\ncontract represents a promise for the market to pay the owner r \u00b7 \u03c6(Z) when Z is observed. 
Adopting\nstandard financial terminology, in our model there are d securities j = 1, . . . , d, and the owner of a\nshare in security j will receive a payoff of \u03c6(Z)j, that is, the jth component of the random variable.\nThus a contract r \u2208 Rd contains rj shares of security j and pays off \u2211_{j=1}^{d} rj \u03c6(Z)j = r \u00b7 \u03c6(Z). Note\nthat rj < 0, or \u201cshort selling\u201d security j, is allowed.\nThe market maintains a market state qt \u2208 Rd at time t = 0, . . . , T , with q0 = 0. Each trader\nt = 1, . . . , T arrives sequentially and purchases a contract dqt \u2208 Rd, and the market state is updated\nto qt = qt\u22121 + dqt. In other words, qt = \u2211_{s=1}^{t} dqs, the sum of all contracts purchased up to time t.\nThe price paid by each participant is determined by a convex cost function C : Rd \u2192 R. Intuitively,\nC maps qt to the total price paid by all agents so far, C(qt). Thus, participant t making trade dqt\nwhen the current state is qt\u22121 pays C(qt\u22121 + dqt) \u2212 C(qt\u22121). Notice that the instantaneous prices\npt = \u2207C(qt) represent the current price per unit of infinitesimal purchases, with the jth coordinate\nrepresenting the current price per share of the jth security.\nThe prices \u2207C(q) are interpreted as predictions of E \u03c6(Z), as an agent who believes the jth coordinate\nis too low will purchase shares in it, raising its price, and so on. This can be formalized through a\nlearning lens: It is known [2] that agents in such a market maximize expected profit by minimizing\nan expected Bregman divergence between \u03c6(Z) and \u2207C(q); of course, it is known that \u2207C(q) =\nE \u03c6(Z) minimizes risk for any divergence-based loss [1, 6, 10]. (The Bregman divergence is that\ncorresponding to C\u2217, the convex conjugate of C.)\nPrice Sensitivity. 
The price sensitivity of a cost function C is a measure of how quickly prices\nrespond to trades, similar to \u201cliquidity\u201d discussed in Abernethy et al. [2, 5] and earlier works.\nFormally, the price sensitivity \u03bb of C is the supremum of the operator norm of the Hessian of C, with\nrespect to the \u21131 norm.1 In other words, if c = \u2016q \u2212 q\u2032\u20161 shares are purchased, then the change in\nprices \u2016\u2207C(q) \u2212 \u2207C(q\u2032)\u20161 is at most \u03bbc.\n\n1For convenience we will assume C is twice differentiable, though this is not necessary.\n\nPrice sensitivity is directly related to the worst-case loss guarantee of the market, as follows. Those\nfamiliar with market scoring rules may recall that with scoring rule S, the loss can be bounded by\n(a constant times) the largest possible score. Hence, scaling S by a factor 1/\u03bb immediately scales the\nloss bound by 1/\u03bb as well. Recall that S is defined by a convex function G, the convex conjugate\nof C. Scaling S by 1/\u03bb is equivalent to scaling G by 1/\u03bb. By standard results in convex analysis,\nthis is equivalent to transforming C into C\u03bb(q) = (1/\u03bb) C(\u03bbq), an operation known as the perspective\ntransform. This in turn scales the price sensitivity by \u03bb by the properties of the Hessian.\nPrice sensitivity is also related to the total number of trades required to change the prices in a market.\nIf we assume each trade consists of at most one share in each security, then 1/\u03bb trades are necessary to\nshift the predictions to an arbitrary point from an arbitrary point.\nConvention: normalized, scaled C. In the remainder of the paper, we will suppose that we start\nwith some convex cost function C1 whose price sensitivity equals 1 and worst-case loss bounded by\nsome constant B1. Then, to obtain price sensitivity \u03bb, we use the cost function C(\u00b7) = (1/\u03bb) C1(\u03bb\u00b7). 
As\ndiscussed above, C has price sensitivity at most \u03bb and a worst-case loss bound of B = B1/\u03bb. (This\nassumption is without loss of generality, as any cost function that guarantees a bounded worst-case\nloss can be scaled such that its price sensitivity is 1.)\n\n2.1 Prior work\n\nTo achieve differential privacy for trades of a bounded size (which will be assumed), the general\napproach is to add random noise to the \u201ctrue\u201d market state q and publish this noisy state \u02c6q. The privacy\nlevel thus determines how close \u02c6q is to q. The distance from \u2207C(\u02c6q) to \u2207C(q) is then controlled\nby the price sensitivity \u03bb. For a fixed noise and privacy level, a smaller \u03bb leads to a smaller impact of\nnoise on prices, meaning very good accuracy. However, decreasing \u03bb does not come for free: the\nworst-case financial loss to the market designer scales as 1/\u03bb.\nThe market of [12] adds controlled and correlated noise over time, in a manner similar to the\n\u201ccontinual observation\u201d technique of differential privacy. This reduces the influence of noise on\naccuracy to polylogarithmic in T , the number of participants. Their main result for the prediction\nmarket setting studied here is as follows.\nTheorem 1 ([12]). Assuming that all trades satisfy \u2016dqt\u20161 \u2264 1, the private mechanism is \u03b5-differentially\nprivate in the trades dq1, . . . , dqT with respect to the output \u02c6q1, . . . , \u02c6qT . Further,\nto satisfy \u2016pt \u2212 \u02c6pt\u20161 \u2264 \u03b1 for all t, except with probability \u03b3, it suffices for the price sensitivity to be\n\n\u03bb\u2217 = \u03b1 \u03b5 / (4 \u221a2 d \u2308log T\u2309 ln(2Td/\u03b3)) .    (1)\n\n2.2 Our setting and desiderata\n\nThis paper builds on the work of Waggoner et al. [12] to overcome the negative results of Cummings\net al. [7]. 
Here, we formalize our setting and four desirable properties we hope to achieve.\nWrite a prediction market mechanism as a function M taking inputs dq\u20d7 = dq1, . . . , dqT and\noutputting a sequence of market states \u02c6q1, . . . , \u02c6qT . Here \u02c6qt is thought of as a noisy version of\nqt = \u2211_{s\u2264t} dqs. Each of these states is associated with a prediction \u02c6pt in the set of possible prices\n(expectations of \u03c6), while the state qt is associated with the \u201ctrue\u201d underlying prediction pt.\nDefinition 1 (Privacy). M satisfies (\u03b5, \u03b4)-differential privacy if for all pairs of inputs dq\u20d7, dq\u20d7\u2032 differing\nby only a single participant\u2019s entry, and for all sets S of possible outputs, Pr[M(dq\u20d7) \u2208 S] \u2264\ne^\u03b5 Pr[M(dq\u20d7\u2032) \u2208 S] + \u03b4. If furthermore \u03b4 = 0, we say M is \u03b5-differentially private.\nDefinition 2 (Precision). M has (\u03b1, \u03b3)-precision if for all dq\u20d7, with probability 1 \u2212 \u03b3, we have\n\u2016\u02c6pt \u2212 pt\u20161 \u2264 \u03b1 for all t.\nDefinition 3 (Incentives). M has \u03b2-incentive to participate if, for all beliefs p = E \u03c6(Z), if at any\npoint \u2016\u02c6pt \u2212 p\u2016\u221e > \u03b2, then there exists a participation opportunity that makes a strictly positive profit\nin expectation with respect to p.\n\nFor the budget guarantee, we must formalize the notion that participants may respond to\nthe noise introduced by the mechanism. Following Cummings et al. [7], let a trader\nstrategy be s\u20d7 = (s1, . . . , sT ), where each st is a possibly-randomized function of the form\nst(dq1, . . . , dqt\u22121; \u02c6q1, . . . , \u02c6qt\u22121) = dqt, i.e. a strategy taking the entire history prior to t and\noutputting a trade dqt. 
Let L(M, s\u20d7, z) be a random variable denoting the financial loss of the market\nM against trader strategy s\u20d7 when Z = z, which for the mechanism described above is simply\n\nL(M, s\u20d7, z) = \u2211_{t=1}^{T} [C(\u02c6qt\u22121) \u2212 C(\u02c6qt\u22121 + dqt) + dqt \u00b7 \u03c6(z)] .\n\nDefinition 4. M guarantees designer budget B if, for any trader strategy s\u20d7 and all z, E L(M, s\u20d7, z) \u2264\nB, where the expectation is over the randomness in M and each st.\n\n3 Slowly-Growing Budget\n\nThe private market of Waggoner et al. [12] causes unbounded loss for the market maker in two ways.\nThe first is from traders betting against the random noise introduced to protect privacy. This is a key\nidea leveraged by Cummings et al. [7] to show negative results for private markets. In this section,\nwe show that a transaction fee can be chosen to exactly balance the expected profit from this type of\narbitrage.2 We will show that this fee is still small enough to allow for very accurate prices.3 This\ntransaction fee restores the worst-case loss guarantee to the inverse of the price sensitivity, just as in a\nnon-private market. The second way the market causes unbounded loss is to require price sensitivity\nto shrink as a function of T ; this is addressed in the next section.\nWe show that with this carefully-chosen fee, the market still achieves precision, incentive, and privacy\nguarantees, but now with a worst-case market maker loss of O((log T )^2), much improved over the\nna\u00efve O(T ) bound. This is viewed as a positive result because the worst-case loss is growing quite\nslowly in the total number of participants, and moreover matches the fundamental \u201cinformational\u201d\nworst-case loss one expects with price sensitivity \u03bb\u2217.\n\n3.1 Mechanism and result\n\nHere we recall the private market mechanism of [12], adapted to the prediction market setting\nfollowing [7]. 
We will express the randomness of the mechanism in terms of a \u201cnoise trader\u201d for both\nintuition and technical convenience. The market is defined by a cost function C with price sensitivity\n\u03bb, and parameters c (transaction fee), \u03b5 (privacy), \u03b1, \u03b3 (precision), and T (maximum number of\nparticipants). There is a special trader we call the noise trader who is controlled by the designer.\nAll actions of the noise trader are hidden and known only by the designer. The designer publishes\nan initial market state q0 = \u02c6q0 = 0. Let T\u2032 denote the actual number of arrivals, with T\u2032 \u2264 T by\nassumption. Then, for t = 1, . . . , T\u2032:\n\n1. Participant t arrives, pays a fee of c, and purchases bundle dqt with \u2016dqt\u20161 \u2264 1. The\npayment is C(\u02c6qt\u22121 + dqt) \u2212 C(\u02c6qt\u22121).\n2. The noise trader purchases a randomly-chosen bundle zt, called a noise trade, after selling\noff some subset {zt1, . . . , ztk} of previously purchased noise trades for ti < t, according\nto a predetermined schedule described below. Letting wt = zt \u2212 \u2211_{i=1}^{k} zti denote this net\nnoise bundle, the noise trader is thus charged C(\u02c6qt\u22121 + dqt + wt) \u2212 C(\u02c6qt\u22121 + dqt).\n3. The \u201ctrue\u201d market state is updated to qt = qt\u22121 + dqt, but is not revealed.\n4. The noisy market state is updated to \u02c6qt = \u02c6qt\u22121 + dqt + wt and is published.\n\nFinally, z \u2208 Z is observed and each participant t receives a payment dqt \u00b7 \u03c6(z). For the sake of\nbudget analysis, we suppose that at the close of the market, the noise trader sells back all of her\nremaining bundles; letting wT\u2032 be the sum of these bundles, she is charged C(\u02c6qT\u2032 \u2212 wT\u2032) \u2212 C(\u02c6qT\u2032).\nNoise trades. Each zt is a d-dimensional vector with each coordinate drawn from an independent\nLaplace distribution with parameter b = 2\u2308log T\u2309/\u03b5. To determine which bundles zs are sold at\ntime t, write t = 2^j m where m is odd, and sell all bundles zs purchased during the previous\n2^(j\u22121) time steps which are not yet sold. Thus, the noise trader will sell bundles purchased at times\ns = t \u2212 1, t \u2212 2, t \u2212 4, t \u2212 8, . . . , t \u2212 2^(j\u22121); in particular, when t is odd we have j = 0, so no previous\nbundles will be sold.\n\n2Intuitively, it is enough for the fee to cover arbitrage amounts in expectation, because a trader must pay the\nfee to trade before the random noise is drawn and any arbitrage opportunity is revealed.\n\n3For instance, if the current price of a security is 0.49 and a trader believes the true price should be 0.50, she\nwill purchase a share if the fee is c < 0.01. (For privacy, we limit each trade to a fixed size, say, one share.)\n\nBudget. The total loss of the market designer can now be written as the sum of three terms: the loss\nof the market maker, the loss of the noise trader, and the gain from transaction fees. By convention,\nthe noise trader eventually sells back all bundles it purchases and is left with no shares remaining.\n\nL(M, s\u20d7, z) = \u2211_{t=1}^{T\u2032} [C(\u02c6qt\u22121) \u2212 C(\u02c6qt\u22121 + dqt) + dqt \u00b7 \u03c6(z)]    (net loss of market maker)\n    + \u2211_{t=1}^{T\u2032} [C(\u02c6qt\u22121 + dqt) \u2212 C(\u02c6qt)]    (net loss of noise trader)\n    \u2212 cT\u2032 .    (fees)    (2)\n\nThe main result of this section is as follows.\nTheorem 2. When each arriving participant pays a transaction fee c = \u03b1, the private market with\nany \u03bb \u2264 \u03bb\u2217 from eq. 
(1) satisfies \u03b5-differential privacy, (\u03b1, \u03b3)-precision, 2\u03b1-incentive to trade, and\nbudget bound B1/\u03bb, where B1 is the budget bound of the underlying cost function C1.\n\n3.2 Proof ideas: privacy, precision, incentives\n\nThe differential privacy and precision claims follow directly from the prior results, as nothing has\nchanged to impact them. The incentive claim is not technically involved, but perhaps subtle: the\ntransaction fee should be high enough to eliminate expected profit from arbitrage, yet low enough to\nallow for profit from information. The key point is that the transaction fee is a constant, but the farther\nthe prices are from the trader\u2019s belief, the more money she expects to make from a constant-sized\ntrade. The transaction fee creates a ball of size 2\u03b1 around the current prices where, if one\u2019s belief lies\nin that ball, then participation is not profitable.\nWe give most of the proof of the designer budget bound, with some claims deferred to the full version.\nLemma 1 (Budget bound). The transaction-fee private market with any price sensitivity \u03bb \u2264 \u03bb\u2217\nguarantees a designer budget bound of B1/\u03bb.\n\nProof. Let c be the transaction fee; we will later take c = \u03b1. Then the worst-case loss from eq. (2) is\n\nWC(\u03bb, T\u2032) := WC0(\u03bb, T\u2032) + NTL(\u03bb, T\u2032) \u2212 T\u2032c ,\n\nwhere WC0(\u03bb, T\u2032) is the worst-case loss of a standard prediction market maker with parameter \u03bb\nand T\u2032 participants, NTL(\u03bb, T\u2032) is the worst-case noise trader loss, and T\u2032c is the revenue from T\u2032\ntransaction fees of size c each.\nThe worst-case loss of a standard prediction market maker is well-known; see e.g. [2]. 
By our\nnormalization and definition of price sensitivity, we thus have WC0(\u03bb, T\u2032) \u2264 B1/\u03bb.\nTo bound the noise trader loss NTL(\u03bb, T\u2032), we will consider each bundle zt purchased by the noise\ntrader. The idea is to bound the difference in price between the purchase and sale of zt. For analysis,\nwe suppose that at each t, the noise trader first sells any previous bundles (e.g. at t = 4, first selling\nz3 and then selling z2), and then purchases zt.\nNow let b(t) be the largest power of 2 that divides t. Let q^t_buy and q^t_sell be the market state just before\nthe noise trader purchases zt and just after she sells zt, respectively.\n\nClaim 1. For each t, exactly b(t) traders arrive between the purchase and the sale of bundle zt;\nfurthermore, q^t_sell \u2212 q^t_buy is exactly equal to the sum of these participants\u2019 trades.\n\nFor example, suppose t is odd. Then only one participant arrives between the purchase and sale of zt.\nFurthermore, zt is the last bundle purchased by the noise trader at time t and is the first sold at time\nt + 1, so the difference in market state is exactly zt plus that participant\u2019s trade.\n\nClaim 2. If the noise trader purchases and later sells zt, then her net loss in expectation over zt (but\nfor any trader behavior in response to zt), is at most \u03bb b(t) K where K = E\u2016zt\u20162.\n\nWe now sum over all bundles zt purchased by the noise trader, i.e. at time steps 1, . . . , T\u2032. Recall\nthat the noise trader sells back every bundle zt she purchases. Thus, her total payoff is the sum\nover t of the difference in price at which she buys zt and price at which she sells it. For each\nj = 0, . . . , log T\u2032 \u2212 1, there are 2^j different steps t with b(t) = T\u2032/2^(j+1). 
The total loss is thus,\n\nNTL(\u03bb, T\u2032) \u2264 \u2211_{j=0}^{log T\u2032 \u2212 1} 2^j (T\u2032/2^(j+1)) \u03bbK = (T\u2032 log T\u2032 / 2) \u03bbK .    (3)\n\nNote that if the noise trader has some noise bundles left over after the final participant, we suppose\nshe immediately sells all remaining bundles back to the market maker in reverse order of purchase.\nPutting eq. (3) together with the above bound on WC0 gives\n\nWC(\u03bb, T\u2032) \u2264 WC0(\u03bb, T\u2032) + T\u2032 log T\u2032 \u03bbK \u2212 T\u2032c \u2264 B1/\u03bb + T\u2032 (K log T\u2032 \u03bb \u2212 c) ,    (4)\n\nwhich is in turn at most B1/\u03bb if we choose \u03bb and the transaction fee c such that c \u2265 K log T \u03bb. In\nother words, we take \u03bb \u2264 c/(K log T ).\nFinally, we can bound K = E\u2016zt\u20162 from Claim 2 as follows: for each t, the components of the\nd-dimensional vector zt are each independent Lap(b) variables with b = 2\u2308log T\u2309/\u03b5. By concavity\nof \u221a\u00b7, we have\n\nK = E \u221a(\u2211_{i=1}^{d} zt(i)^2) \u2264 \u221a(\u2211_{i} E zt(i)^2) = \u221a(d Var(Lap(b))) = \u221a(2d b^2) = 2 \u221a(2d) \u2308log T\u2309 / \u03b5 .\n\nTherefore, it suffices to pick\n\n\u03bb \u2264 c \u03b5 / (2 \u221a(2d) \u2308log T\u2309 log T) .\n\nFor c = \u03b1, this is in fact accomplished by the private, accurate market choosing \u03bb \u2264 \u03bb\u2217 (Equation\n1).\n\nLimitations of this result. Unfortunately, Theorem 2 does not completely solve our problem: the\nother way that privacy impacts the market\u2019s loss is by lowering the necessary price sensitivity to\n\u03bb\u2217 \u2248 1/(log T )^2 as mentioned above, leading to a worst-case loss growing with T . 
It does not seem\npossible to address this via a larger transaction fee without giving up incentive to participate: traders\nparticipate as long as their expected profit exceeds the fee, and collectively \u2126(1/\u03bb) of them can arrive\nmaking consistent trades all moving the prices in the same (correct) direction, so the total payout will\nstill be \u2126(1/\u03bb).\n\n4 Constant Budget via Adaptive Market Size\n\nIn this section, we achieve our original goal by constructing an adaptively-growing prediction market\nin which each stage, if completed, subsidizes the initial portion of the next.\nThe market design is the following, with each T^(k) to be chosen later. We run the transaction-fee private\nmarket above with T = T^(1), transaction fee \u03b1, and price sensitivity \u03bb^(1) = \u03bb\u2217(T^(1), \u03b1/2, \u03b3/2)\nfrom eq. (1). When (and if) T^(1) participants have arrived, we create a new market whose initial\nstate is such that its prices match the final (noisy) prices of the previous one. We set T^(2) and price\nsensitivity \u03bb^(2) = \u03bb\u2217(T^(2), \u03b1/4, \u03b3/4) for the new market. We repeat, halving \u03b1 and \u03b3 at each stage\nand increasing T in a manner to be specified shortly, until no more participants arrive.\nTheorem 3. For any \u03b1, \u03b3, \u03b5, the adaptive market satisfies \u03b5-differential privacy, 2\u03b1-incentive to trade,\n(\u03b1, \u03b3)-precision, and a designer budget bound of\n\nB \u2264 B1 \u00b7 (72 \u221a2 d / (\u03b1 \u03b5)) \u00b7 ln(4608 B1 \u221a2 d^2 / (\u03b3 \u03b1^2 \u03b5))^2 ,\n\nwhere B1 is the budget bound of the underlying unscaled cost function C1.\n\nProof idea. We set T^(1) = \u0398(B1 d ln(B1 d/(\u03b3\u03b1\u03b5))^2 / (\u03b1^2 \u03b5)), and T^(k) = 4T^(k\u22121) thereafter. The key will be\nthe following observation. The total \u201cinformational\u201d profit available to the traders (by correcting\nthe initial market prices) is bounded by O(1/\u03bb), so if each trader expects to profit more than the\ntransaction fee c, then only O(1/(\u03bbc)) traders can all arrive and simultaneously profit.\nIndeed, if all\nT participants arrive, then the total profit from transaction fees is \u0398(T ) while the worst-case loss\nfrom the market is O((log T )^2).\n\nWe can leverage this observation to achieve a bounded worst-case loss with an \u201cadaptive-liquidity\u201d\napproach, similar in spirit to Abernethy et al. [5] but more technically similar to the doubling trick in\nonline learning. Begin by setting \u03bb^(1) on the order of 1/(log T^(1))^2 = \u0398(1), and run a private market\nfor T^(1) participants. If fewer than T^(1) participants show up, the worst-case loss is order 1/\u03bb^(1), a\nconstant. If all T^(1) participants arrive, then (for the right choice of constants) the market has actually\nturned a profit \u2126(T^(1)) from the transaction fees. Now set up a private market for T^(2) = 4T^(1)\ntraders with \u03bb^(2) on the order of 1/(log T^(2))^2. If fewer than T^(2) participants arrive, the worst-case\nloss is order 1/\u03bb^(2). However, we will have chosen T^(2) such that this loss is smaller than the \u2126(T^(1))\nprofit from the previous market. Hence, the total worst-case loss remains bounded by a constant.\nIf all T^(2) participants arrive, then again this market has turned a profit, which can be used to\ncompletely offset the worst-case loss of the next market, and so on. Some complications arise, as to\nachieve (\u03b1, \u03b3)-precision, we must set \u03b1^(1), \u03b3^(1), \u03b1^(2), \u03b3^(2), . . . as a convergent series summing to \u03b1\nand \u03b3; and we must show that all of these scalings are possible in such a way that the transaction fees\ncover the cost of the next iteration. 
(An interesting direction for future work would be to replace the\niterative approach here with the continuous liquidity adaptation of [5].)\nMore speci\ufb01cally, we prove that the loss in any round k that is not completed (not all participants\narrive) is at most \u03b1\n2 T (k).\nOf course, only one round is not completed: the \ufb01nal round k. If k = 1, then the \ufb01nancial loss is\nbounded by 1\n\u03bb(1) , a constant depending only on \u03b1, \u03b3, \u0001. Otherwise, the total loss is the sum of the\nlosses across rounds, but the mechanism makes a pro\ufb01t in every round but k. Moreover, the loss in\n8 T (k\u22121), which is at most half of the pro\ufb01t in round k \u2212 1. So if k \u2265 2,\nround k is at most \u03b1\nthe mechanism actually turns a net pro\ufb01t.\nWhile this result may seem paradoxical, note that the basic phenomenon appears in a classical\n(non-private) prediction market with a transaction fee, although to our knowledge this observation\nhas not yet appeared in the literature. Speci\ufb01cally, a classical prediction market with budget bound\nB1, trades of size 1, and a small transaction fee \u03b1, will still have an \u03b1-incentive to participate, and the\nworst case loss will still be \u0398(B1); this loss, however, can be extracted by as few as \u0398(1) participants.\nAny additional participants must be in a sense disagreeing about the correct prices; their transaction\nfees go toward market maker pro\ufb01t, but they do not contribute further to worst-case loss.\n\n16 T (k); moreover, the pro\ufb01t in any round k that is completed is at least \u03b1\n\n2 T (k) = \u03b1\n\n5 Kernels, Buying Data, Online Learning\n\nWhile preserving privacy in prediction markets is well-motivated in the classical prediction market\nsetting, it is arguably even more important in a setting where machine-learning hypotheses are learned\nfrom private personal data. Waggoner et al. 
[12] develop mechanisms for such a setting based on\nprediction markets, and further show how to preserve differential privacy of the participants. Yet their\nmechanisms are not practical in the sense that the financial loss of the mechanism could grow without\nbound. In this section, we sketch how our bounded-financial-loss market can also be extended to this\nsetting. This yields a mechanism for purchasing data for machine learning that satisfies \u03b5-differential\nprivacy, \u03b1-precision and incentive to participate, and bounded designer budget.\nTo develop a mechanism which could be said to \u201cpurchase data\u201d from participants, Waggoner et\nal. [12] extend the classical setting in two ways. The first is to make the market conditional, where\nwe let Z = X \u00d7 Y, and have independent markets Cx : Rd \u2192 R for each x. Trades in each market\ntake the form qx \u2208 Rd, which pay out qx \u00b7 \u03c6(y) upon outcome (x\u2032, y) if x = x\u2032, and zero if x \u2260 x\u2032.\nImportantly, upon outcome (x, y), only the costs associated to trades in the Cx market are tallied.\nThe second is to change the bidding language using a kernel, a positive semidefinite function\nk : Z \u00d7 Z \u2192 R. Here we think of contracts as functions f : Z \u2192 R in the reproducing kernel\nHilbert space (RKHS) F given by k, with basis {fz(\u00b7) = k(z,\u00b7) : z \u2208 Z}. For example, we recover\nthe conditional market setting with independent markets with the kernel k((x, y), (x\u2032, y\u2032)) = 1{x =\nx\u2032}\u03c6(y) \u00b7 \u03c6(y\u2032). The RKHS structure is natural here because a basis contract fz pays off at each z\u2032\naccording to the \u201ccovariance\u201d structure of the kernel, i.e. the payoff of contract fz when z\u2032 occurs\nequals fz(z\u2032) = k(z, z\u2032). 
For example, when $Y = \\{-1, 1\\}$ one recovers radial basis classification using $k((x, y), (x', y')) = y y' e^{-(x - x')^2}$.\nThese two modifications to classical prediction markets, given as Mechanism 2 in [12], have clear advantages as a mechanism to \u201cbuy data\u201d. One may imagine that each agent, arriving at time $t \\in \\{1, \\ldots, T\\}$, holds a data point $(x_t, y_t) \\in Z = X \\times Y$. A natural purchase for this agent would be a basis contract $f_{(x_t, y_t)}$, as this corresponds to a payoff that is highest when the test data point actually equals $(x_t, y_t)$ and decreases with distance as measured by the kernel structure.\nThe importance of privacy now becomes even more apparent, as the data point $(x_t, y_t)$ could be sensitive information for trader t. Fortunately, we can extend our main results to this setting. To demonstrate the idea, we give a sketch of the result and proof below.\nTheorem 4 (Informal). Let $Z = X \\times Y$ where X is a compact subset of a finite-dimensional real vector space and Y is finite, and let a positive semidefinite kernel $k : Z \\times Z \\to \\mathbb{R}$ be given. For any choices of accuracy parameters \u03b1, \u03b3, privacy parameters \u03b5, \u03b4, trade size \u2206, and query limit Q, the kernel adaptive market satisfies (\u03b5, \u03b4)-differential privacy, (\u03b1, \u03b3)-precision, 2\u03b1-incentive to participate, and a bounded designer budget.\n\nProof Sketch. The precision property, i.e. that prices are approximately accurate despite privacy-preserving noise, follows from [12, Theorem 2], and the technique in Theorem 3 to combine the accuracy and privacy of multiple epochs. The incentive to trade property is essentially unchanged, as a participant\u2019s profit is still the improvement in expected Bregman divergence, which exceeds the transaction fee unless prices are already accurate. 
It thus remains only to show a bounded designer budget, which is slightly more involved. Briefly, Claim 1 goes through unchanged, and Claim 2 holds as written where now C becomes $C_x$ and $z^t$ becomes $z^t(x) = f^t(x, \\cdot)$, i.e., the trade at time t restricted to the $C_x$ market alone.\nThe remainder of Lemma 1 now proceeds with one modification regarding the constant K. In eq. (3), the expression for the noise trader loss becomes $\\mathrm{NTL}(\\lambda, T') = \\mathbb{E}\\left[ \\sup_{x \\in X} \\sum_{t=1}^{T'} \\lambda \\alpha_t \\|z^t(x)\\|_2 \\right]$, where the $\\alpha_t$ are simply coefficients to keep track of how many trades occurred between the buy and sell of noise trade t. We can proceed as follows:\n\n$$\\mathrm{NTL}(\\lambda, T') \\leq \\mathbb{E}\\left[ \\sup_{x_1, \\ldots, x_{T'} \\in X} \\sum_{t=1}^{T'} \\lambda \\alpha_t \\|z^t(x_t)\\|_2 \\right] = \\lambda \\sum_{t=1}^{T'} \\alpha_t \\, \\mathbb{E}\\left[ \\sup_{x \\in X} \\|z^t(x)\\|_2 \\right] = \\lambda \\sum_{t=1}^{T'} \\alpha_t K ,$$\n\nwhere K is simply the constant $\\mathbb{E}\\left[ \\sup_{x \\in X} \\|z^t(x)\\|_2 \\right]$, with the expectation taken over the Gaussian process generating the noise. It is well-known that the expected maximum of a Gaussian process is bounded [11], and thus boundedness of K follows from the fact that Y is finite. Thus, continuing from eq. (3) we obtain $\\mathrm{NTL}(\\lambda, T') \\leq \\frac{T' \\log T'}{2} \\lambda K$ as before, with this new K. Finally, the proof of Theorem 3 now goes through, as it only treats the mechanism from Theorem 2 as a black box.\n\nWe close by noting the similarity between the kernel adaptive market mechanism and traditional learning algorithms, as alluded to in the introduction. As observed by Abernethy, et al. 
[2], the market price update rule for classical prediction markets resembles Follow-the-Regularized-Leader (FTRL); specifically, the price update at time t is given by $p^t = \\nabla C(q^t) = \\operatorname{argmax}_{w \\in \\Delta(Y)} \\langle w, \\sum_{s \\leq t} dq^s \\rangle - R(w)$, where $dq^s$ is the trade at time s, and $R = C^*$ is the convex conjugate of C.\nIn our RKHS setting, we can see the same relationship. For concreteness, let $C_x(q) = \\frac{1}{\\lambda} C(\\lambda q)$ for all $x \\in X$, and let $R : \\Delta(Y) \\to \\mathbb{R}$ be the conjugate of C. Suppose further that each agent t purchases a basis contract $df^t = f_{x_t, y_t}$, where we take a classification kernel $k'((x, y), (x', y')) = k(x, x') \\mathbf{1}\\{y = y'\\}$. Letting $dq^t(x) = df^t(x, \\cdot) \\in \\mathbb{R}^Y$, the market price at time t is given by\n\n$$\\begin{aligned} p^t_x &= \\operatorname{argmax}_{w \\in \\Delta(Y)} \\left\\langle w, \\sum_{s \\leq t} dq^s(x) \\right\\rangle - \\frac{1}{\\lambda} R(w) \\\\ &= \\operatorname{argmax}_{w \\in \\Delta(Y)} \\left\\langle w, \\sum_{s \\leq t} k((x_s, y_s), (x, \\cdot)) \\right\\rangle - \\frac{1}{\\lambda} R(w) \\\\ &= \\operatorname{argmax}_{w \\in \\Delta(Y)} \\left\\langle w, \\sum_{s \\leq t} k(x_s, x) \\mathbf{1}_{y_s} \\right\\rangle - \\frac{1}{\\lambda} R(w) , \\end{aligned}$$\n\nwhere $\\mathbf{1}_y$ is an indicator vector. Thus, the market price update follows a natural kernel-weighted FTRL algorithm, where the learning rate \u03bb is the price sensitivity of the market.\n\n6 Summary and Future Directions\n\nMotivated by the problem of purchasing data, we gave the first bounded-budget prediction market mechanism that achieves privacy, incentive alignment, and precision (low impact of privacy-preserving noise on the predictions). To achieve bounded budget, we first introduced and analyzed a transaction fee, achieving a slowly-growing $O((\\log T)^2)$ budget bound, thus eliminating the arbitrage opportunities underlying previous impossibility results. 
Then, observing that this budget still grows in the number of participants T, we further extended these ideas to design an adaptively-growing market, which does achieve bounded budget along with privacy, incentive, and precision guarantees.\nWe see several exciting directions for future work. An extension of Theorem 4 where Y need not be finite should be possible via a suitable generalization of Claim 2. Another important direction is to establish privacy for parameterized settings as introduced by Waggoner, et al. [12], where instead of kernels, market participants update the (finite-dimensional) parameters directly as in linear regression. Finally, we would like a deeper understanding of the learning\u2013market connection in nonparametric kernel settings, which could lead to practical improvements for design and deployment.\n\nReferences\n[1] J. Abernethy and R. Frongillo. A characterization of scoring rules for linear properties. In Proceedings of the 25th Conference on Learning Theory, pages 1\u201327, 2012. URL http://jmlr.csail.mit.edu/proceedings/papers/v23/abernethy12/abernethy12.pdf.\n\n[2] Jacob Abernethy, Yiling Chen, and Jennifer Wortman Vaughan. Efficient market making via convex optimization, and a connection to online learning. ACM Transactions on Economics and Computation, 1(2):12, 2013. URL http://dl.acm.org/citation.cfm?id=2465777.\n\n[3] Jacob Abernethy, Sindhu Kutty, S\u00e9bastien Lahaie, and Rahul Sami. Information aggregation in exponential family markets. In Proceedings of the fifteenth ACM conference on Economics and computation, pages 395\u2013412. ACM, 2014. URL http://dl.acm.org/citation.cfm?id=2602896.\n\n[4] Jacob D. Abernethy and Rafael M. Frongillo. A collaborative mechanism for crowdsourcing prediction problems. In Advances in Neural Information Processing Systems 24, pages 2600\u20132608, 2011.\n\n[5] Jacob D. Abernethy, Rafael M. Frongillo, Xiaolong Li, and Jennifer Wortman Vaughan. 
A General Volume-parameterized Market Making Framework. In Proceedings of the Fifteenth ACM Conference on Economics and Computation, EC \u201914, pages 413\u2013430, New York, NY, USA, 2014. ACM. ISBN 978-1-4503-2565-3. doi: 10.1145/2600057.2602900. URL http://doi.acm.org/10.1145/2600057.2602900.\n\n[6] A. Banerjee, X. Guo, and H. Wang. On the optimality of conditional expectation as a Bregman predictor. IEEE Transactions on Information Theory, 51(7):2664\u20132669, July 2005. ISSN 0018-9448. doi: 10.1109/TIT.2005.850145.\n\n[7] Rachel Cummings, David M Pennock, and Jennifer Wortman Vaughan. The possibilities and limitations of private prediction markets. In Proceedings of the 17th ACM Conference on Economics and Computation, EC \u201916, pages 143\u2013160. ACM, 2016.\n\n[8] R. Frongillo, N. Della Penna, and M. Reid. Interpreting prediction markets: a stochastic approach. In Advances in Neural Information Processing Systems 25, pages 3275\u20133283, 2012. URL http://books.nips.cc/papers/files/nips25/NIPS2012_1510.pdf.\n\n[9] Rafael Frongillo and Mark D. Reid. Convergence Analysis of Prediction Markets via Randomized Subspace Descent. In Advances in Neural Information Processing Systems, pages 3016\u20133024, 2015. URL http://papers.nips.cc/paper/5727-convergence-analysis-of-prediction-markets-via-randomized-subspace-descent.\n\n[10] L.J. Savage. Elicitation of personal probabilities and expectations. Journal of the American Statistical Association, pages 783\u2013801, 1971.\n\n[11] Michel Talagrand. Upper and lower bounds for stochastic processes: modern methods and classical problems, volume 60. Springer Science & Business Media, 2014.\n\n[12] Bo Waggoner, Rafael Frongillo, and Jacob D Abernethy. A Market Framework for Eliciting Private Data. In Advances in Neural Information Processing Systems 28, pages 3492\u20133500, 2015. 
URL http://papers.nips.cc/paper/5995-a-market-framework-for-eliciting-private-data.pdf.", "award": [], "sourceid": 6693, "authors": [{"given_name": "Rafael", "family_name": "Frongillo", "institution": "CU Boulder"}, {"given_name": "Bo", "family_name": "Waggoner", "institution": "Microsoft"}]}