{"title": "Locally Private Gaussian Estimation", "book": "Advances in Neural Information Processing Systems", "page_first": 2984, "page_last": 2993, "abstract": "We study a basic private estimation problem: each of n users draws a single i.i.d. sample from an unknown Gaussian distribution N(\\mu,\\sigma^2), and the goal is to estimate \\mu while guaranteeing local differential privacy for each user. As minimizing the number of rounds of interaction is important in the local setting, we provide adaptive two-round solutions and nonadaptive one-round solutions to this problem. We match these upper bounds with an information-theoretic lower bound showing that our accuracy guarantees are tight up to logarithmic factors for all sequentially interactive locally private protocols.", "full_text": "Locally Private Gaussian Estimation\n\nMatthew Joseph\u2217\n\nUniversity of Pennsylvania\nmajos@cis.upenn.edu\n\nJieming Mao \u2020\n\nGoogle Research New York\n\nmaojm@google.com\n\nJanardhan Kulkarni\n\nMicrosoft Research Redmond\n\njakul@microsoft.com\n\nZhiwei Steven Wu \u2021\nUniversity of Minnesota\n\nzsw@umn.edu\n\nAbstract\n\ni.i.d. sample from an unknown Gaussian distribution N(\u00b5, \u03c32), and the goal\n\nWe study a basic private estimation problem: each of n users draws a single\n\nis to estimate \u00b5 while guaranteeing local differential privacy for each user. As\nminimizing the number of rounds of interaction is important in the local setting,\nwe provide adaptive two-round solutions and nonadaptive one-round solutions to\nthis problem. We match these upper bounds with an information-theoretic lower\nbound showing that our accuracy guarantees are tight up to logarithmic factors for\nall sequentially interactive locally private protocols.\n\n1\n\nIntroduction\n\nDifferential privacy is a formal algorithmic guarantee that no single input has a large effect on\nthe output of a computation. 
Since its introduction [11], a rich line of work has made differential\nprivacy a compelling privacy guarantee (see Dwork et al. [12] and Vadhan [24] for surveys), and\ndeployments of differential privacy now exist at many organizations, including Apple [2], Google [5,\n13], Microsoft [8], Mozilla [3], and the US Census Bureau [1, 20].\nMuch recent attention, including almost all industrial deployments, has focused on a variant called\nlocal differential privacy [4, 11, 19]. In the local model private data is distributed across many\nusers, and each user privatizes their data before the data is collected by an analyst. Thus, as any\nlocally differentially private computation runs on already-privatized data, data contributors need not\nworry about compromised data analysts or insecure communication channels. In contrast, (global)\ndifferential privacy assumes that the data analyst has secure, trusted access to the unprivatized data.\nHowever, the stronger privacy guarantees of the local model come at a price. For many problems,\na locally private solution requires far more samples than a globally private solution [7, 10, 19, 23].\nHere, we study the basic problem of locally private Gaussian estimation: given n users each holding\n\nan i.i.d. draw from an unknown Gaussian distribution N(\u00b5, \u03c32), can an analyst accurately estimate\n\nthe mean \u00b5 while guaranteeing local differential privacy for each user?\nOn the technical front, locally private Gaussian estimation captures two general challenges in locally\nprivate learning. First, since data is drawn from a Gaussian, there is no a priori (worst-case) bound\non the scale of the observations. Naive applications of standard privatization methods like Laplace\nand Gaussian mechanisms must add noise proportional to the worst-case scale of the data and are\nthus infeasible. 
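To make the first challenge concrete, the following sketch (ours, not from the paper; all names hypothetical) shows the naive approach the text rules out: a local Laplace release whose noise is scaled to an assumed worst-case bound on the data, so the noise grows with that bound rather than with the data's true scale σ.

```python
import math
import random

def local_laplace_release(x, bound, eps, rng):
    """Naive local randomizer: clamp x to [-bound, bound], then add Laplace
    noise with scale 2*bound/eps (the sensitivity of the clamped value)."""
    clamped = max(-bound, min(bound, x))
    u = rng.random() - 0.5  # inverse-CDF sample of Laplace(0, 2*bound/eps)
    noise = -(2.0 * bound / eps) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return clamped + noise

rng = random.Random(0)
mu, sigma, eps, n = 5.0, 1.0, 1.0, 50_000
errs = {}
for bound in (10.0, 1e4):  # worst-case scale the analyst assumes for |x_i|
    reports = [local_laplace_release(rng.gauss(mu, sigma), bound, eps, rng)
               for _ in range(n)]
    errs[bound] = abs(sum(reports) / n - mu)
# The averaged estimate typically degrades as the assumed bound grows: the
# per-user noise scale is 2*bound/eps no matter how concentrated the data is.
```

Since Gaussian data has no a priori bound at all, no finite choice of `bound` works without either bias (from clamping) or noise proportional to an arbitrarily large range; this is the infeasibility the paper's protocols are designed to avoid.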
Second, protocols requiring many rounds of user-analyst interaction are difficult to implement in real-world systems and may incur much longer running times. Network latency as well as server and user liveness constraints compound this difficulty [22]. It is therefore desirable to limit the number of rounds of interaction between users and the data analyst. Finally, besides being a fundamental learning problem, Gaussian estimation has several real-world applications (e.g. telemetry data analysis [8]) where one may assume that users' behavior follows a Gaussian distribution.

∗A portion of this work was done while at Microsoft Research Redmond.
†This work was done while at the Warren Center, University of Pennsylvania.
‡A portion of this work was done while at Microsoft Research New York.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

1.1 Our Contributions

We divide our solution to locally private Gaussian estimation into two cases. In the first case, σ is known to the analyst, and in the second case σ is unknown but bounded in a known interval [σmin, σmax]. For each case, we provide an (ε, 0)-locally private adaptive two-round protocol and a nonadaptive one-round protocol.⁴ Our privacy guarantees are worst-case; however, when x1, . . . , xn ∼ N(μ, σ²) we also get the following accuracy guarantees.

Theorem 1.1 (Informal). When σ is known and n is sufficiently large, there exists a two-round protocol outputting μ̂ such that |μ̂ − μ| = O((σ/ε)·√(log(1/β)/n)) with probability 1 − β, and there exists a one-round protocol outputting μ̂ such that |μ̂ − μ| = O((σ/ε)·√(log(1/β)/n)·log^{1/4}(n)) with probability 1 − β.

Theorem 1.2 (Informal). When σ is unknown but bounded in a known interval [σmin, σmax], and n is sufficiently large, there exists a two-round protocol outputting μ̂ such that |μ̂ − μ| = O((σ/ε)·√(log(1/β) log(n)/n)) with probability 1 − β, and there exists a one-round protocol outputting μ̂ such that |μ̂ − μ| = O((σ/ε)·√(log(σmax/σmin + 1) log(1/β) log^{3/2}(n)/n)) with probability 1 − β.

All of our protocols are sequentially interactive [10]: each user interacts with the protocol at most once. We match these upper bounds with a lower bound showing that our results are tight for all sequentially interactive locally private protocols up to logarithmic factors. We obtain this result by introducing tools from the strong data processing inequality literature [6, 21]. Using subsequent work by Joseph et al. [16], we can also extend this lower bound to fully interactive protocols.

Theorem 1.3 (Informal). For a given σ, there does not exist an (ε, δ)-locally private protocol A such that for any μ = O((σ/ε)·√(1/n)), given x1, . . . , xn ∼ N(μ, σ²), A outputs an estimate μ̂ satisfying |μ̂ − μ| = o((σ/ε)·√(1/n)) with probability ≥ 15/16.

1.2 Related Work

Several works have already studied differentially private versions of various statistical tasks, especially in the global setting. Karwa and Vadhan [18] and Kamath et al. [17] consider similar versions of Gaussian estimation under global differential privacy, respectively in the one-dimensional and high-dimensional cases. 
For both the known and unknown variance cases, Karwa and Vadhan [18] offer an O(σ(√(log(1/β)/n) + polylog(1/β)/(εn))) accuracy upper bound for estimating μ. Since an Ω(σ√(log(1/β)/n)) accuracy lower bound holds even without privacy, our upper and lower bounds show that local privacy adds a roughly √n accuracy cost over global privacy.

In concurrent independent work, Gaboardi et al. [14] also study locally private Gaussian estimation. We match or better their accuracy results with much lower round complexity. They provide adaptive protocols for the known- and unknown-σ settings, with the latter protocol having round complexity T as large as Ω(n), linear in the number of users. In contrast, we provide both adaptive and nonadaptive solutions, and our protocols all have round complexity T ≤ 2. A full comparison appears in Figure 1.

⁴As "adaptive" and "nonadaptive" are implicit in "two-round" and "one-round", we often omit these terms.

Known σ, adaptive:
  Gaboardi et al. [14]: α = O((σ/ε)√(log(1/β) log(n/β) log(1/δ)/n)), T = 2
  This work:            α = O((σ/ε)√(log(1/β)/n)), T = 2
Known σ, nonadaptive:
  Gaboardi et al. [14]: —
  This work:            α = O((σ/ε)√(log(1/β)/n)·log^{1/4}(n)), T = 1
Unknown σ, adaptive:
  Gaboardi et al. [14]: α = O((σ/ε)√(log(1/β) log(n/β) log(1/δ)/n)), T = Ω(log(R/σmin))
  This work:            α = O((σ/ε)√(log(1/β) log(n)/n)), T = 2
Unknown σ, nonadaptive:
  Gaboardi et al. [14]: —
  This work:            α = O((σ/ε)√(log(σmax/σmin + 1) log(1/β) log^{3/2}(n)/n)), T = 1

Figure 1: A comparison of upper bounds (accuracy α, round complexity T) in Gaboardi et al. [14] and here. In all cases, Gaboardi et al. [14] use (ε, δ)-locally private algorithms and we use (ε, 0). Here, R denotes an upper bound on both μ and σ. In our setting, the upper bound on μ is O(2^{nε²/log(n/β)}), leading the unknown variance protocol of Gaboardi et al. [14] to round complexity potentially as large as Ω̃(nε²/log(1/β)).

Gaboardi et al. [14] also prove a tight lower bound for nonadaptive protocols that can be extended to sequentially interactive protocols. We provide a lower bound that is tight for sequentially interactive protocols up to logarithmic factors, and we depart from previous local privacy lower bounds by introducing tools from the strong data processing inequality (SDPI) literature [6, 21]. This approach uses an SDPI to control how much information a sample gives about its generating distribution, then uses existing local privacy results to bound the mutual information between a sample and the privatized output from that sample. 
Subsequent work by Duchi and Rogers [9] generalizes the SDPI\nframework to prove lower bounds for a broader class of problems in local privacy. They also extend\nthe SDPI framework to prove lower bounds for fully interactive algorithms.\n\n2 Preliminaries\n\nWe consider a setting where, for each i\u2208[n], user i\u2019s datum is a single draw from an unknown\nGaussian distribution, xi\u223c N(\u00b5, \u03c32), and these draws are i.i.d. In our communication protocol, users\n\nmay exchange messages over public channels with a single (possibly untrusted) central analyst.5 The\nanalyst\u2019s task is to accurately estimate \u00b5 while guaranteeing local differential privacy for each user.\nTo minimize interaction with any single user, we restrict our attention to sequentially interactive\nprotocols. In these protocols, every user sends at most a single message in the entire protocol. We\nalso study the round complexity of these interactive protocols. Formally, one round of interaction in a\n\nprotocol consists of the following two steps: 1) the analyst selects a subset of users S\u2286[n], along\nwith a set of randomizers{Qi i\u2208 S}, and 2) each user i in S publishes a message yi= Qi(xi).\n\nA randomized algorithm is differentially private if arbitrarily changing a single input does not\nchange the output distribution \u201ctoo much\u201d. This preserves privacy because the output distribution is\ninsensitive to any change of a single user\u2019s data. We study a stronger privacy guarantee called local\ndifferential privacy. In the local model, each user i computes their message using a local randomizer.\nA local randomizer is a differentially private algorithm taking a single-element database as input.\n\nDe\ufb01nition 2.1 (Local Randomizer). 
A randomized function Qi : X → Y is an (ε, δ)-local randomizer if, for every pair of observations xi, x′i ∈ X and any S ⊆ Y, Pr[Qi(xi) ∈ S] ≤ e^ε Pr[Qi(x′i) ∈ S] + δ.

⁵The notion of a central coordinating analyst is only a useful simplification. As the analyst has no special powers or privileges, any user, or the protocol itself, can be viewed as playing the same role.

A protocol is locally private if every user computes their message using a local randomizer. In a sequentially interactive protocol, the local randomizer for user i may be chosen adaptively based on previous messages z1, . . . , zi−1. However, the choice of randomizer cannot be based on user i's data.

Definition 2.2. A sequentially interactive protocol A is (ε, δ)-locally private for private user data {x1, . . . , xn} if, for every user i ∈ [n], the message Yi is computed using an (ε, δ)-local randomizer Qi. When δ > 0, we say A is approximately locally private. If δ = 0, A is purely locally private.

3 Estimating μ with Known σ

We begin with the case where σ² is known (shorthanded "KV"). In Section 3.1, we provide a protocol KVGAUSSTIMATE that requires two rounds of analyst-user interaction. In Section 3.2, we provide a protocol 1ROUNDKVGAUSSTIMATE achieving a weaker accuracy guarantee in a single round. All omitted pseudocode and proofs appear in the full version of this paper [15].

3.1 Two-round Protocol KVGAUSSTIMATE

In KVGAUSSTIMATE the users are split into halves U1 and U2. In round one, the analyst queries users in U1 to obtain an O(σ)-accurate estimate μ̂1 of μ. In round two, the analyst passes μ̂1 to users in U2, who respond based on μ̂1 and their own data. The analyst then aggregates this second set of responses into a better final estimate of μ.

Theorem 3.1. Two-round protocol KVGAUSSTIMATE satisfies (ε, 0)-local differential privacy for x1, . . . , xn and, if x1, . . . , xn ∼iid N(μ, σ²) where σ is known and n/log(n) = Ω(log(μ) log(1/β)/ε²), with probability 1 − β outputs μ̂ such that |μ̂ − μ| = O((σ/ε)√(log(1/β)/n)).

Algorithm 1 KVGAUSSTIMATE
Input: ε, k, L, n, σ, U1, U2
1: for j ∈ L do
2:   for user i ∈ U1^j do
3:     User i outputs ỹi ← RR1(ε, i, j)
4:   end for
5: end for                                        ▷ End of round 1
6: Analyst computes Ĥ1 ← KVAGG1(ε, k, L, U1)
7: Analyst computes μ̂1 ← ESTMEAN(β, ε, Ĥ1, k, L)
8: for user i ∈ U2 do
9:   User i outputs ỹi ← KVRR2(ε, i, μ̂1, σ)
10: end for                                       ▷ End of round 2
11: Analyst computes Ĥ2 ← KVAGG2(ε, n/2, U2)
12: Analyst computes T̂ ← √2 · erf⁻¹(2(Ĥ2(1) − Ĥ2(−1))/n)
13: Analyst outputs μ̂2 ← σT̂ + μ̂1
Output: Analyst estimate μ̂2 of μ

3.1.1 First round of KVGAUSSTIMATE

For neatness, let L = ⌊n/(2k)⌋, Lmin = ⌊log(σ)⌋, Lmax = Lmin − 1 + L, and L = {Lmin, Lmin + 1, . . . , Lmax}. U1 is then split into L subgroups indexed by L, and each subgroup has size k = Ω(log(n/β)/ε²). KVGAUSSTIMATE begins by iterating through each subgroup j ∈ L. 
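For concreteness, the following sketch (our illustration, not the authors' code; names hypothetical) implements the round-one randomizer RR1 used by KVGAUSSTIMATE together with KVAGG1-style debiasing, and checks that the debiased histogram is an unbiased estimate of the true bucket counts.

```python
import math
import random

def rr1(v, eps, rng):
    """Randomized response over {0,1,2,3}: report the true bucket v with
    probability e^eps/(e^eps + 3), else one of the other three uniformly."""
    if rng.random() < math.exp(eps) / (math.exp(eps) + 3):
        return v
    return rng.choice([a for a in range(4) if a != v])

def debias(counts, k, eps):
    """KVAGG1-style debiasing: in expectation this recovers the true
    histogram from the randomized-response counts of k users."""
    e = math.exp(eps)
    return {a: (e + 3) / (e - 1) * (c - k / (e + 3)) for a, c in counts.items()}

rng = random.Random(0)
eps, k, true_bucket = 1.0, 20_000, 2  # here every user holds the same bucket
reports = [rr1(true_bucket, eps, rng) for _ in range(k)]
counts = {a: reports.count(a) for a in range(4)}
est = debias(counts, k, eps)
# est should concentrate near {0: 0, 1: 0, 2: k, 3: 0}, up to sampling noise.
```

When all users are far from a multiple of 2^j they share one bucket value, and the debiased histogram concentrates in a single bin exactly as the analysis above requires.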
Each user i ∈ U1^j releases a privatized version of ⌊xi/2^j⌋ mod 4 via randomized response (RR1): with probability e^ε/(e^ε + 3), user i outputs ⌊xi/2^j⌋ mod 4, and otherwise outputs one of the remaining elements of {0, 1, 2, 3} uniformly at random. Responses from group U1^j will be used to estimate the jth least significant bit of μ (rounded to an integer). The analyst then uses KVAGG1 ("Known Variance Aggregation") to aggregate and debias responses to account for this randomness.

Algorithm 2 KVAGG1
Input: ε, k, L, U
1: for j ∈ L do
2:   for a ∈ {0, 1, 2, 3} do
3:     Cj(a) ← |{ỹi | i ∈ U^j, ỹi = a}|
4:     Ĥ1^j(a) ← (e^ε + 3)/(e^ε − 1) · (Cj(a) − k/(e^ε + 3))
5:   end for
6: end for
7: Output Ĥ1
Output: Aggregated histogram Ĥ1 of private user responses

The result is a collection of histograms Ĥ1. The analyst uses Ĥ1 in ESTMEAN to binary search for μ. Intuitively, for each subgroup U1^j, if all multiples of 2^j are far from μ then Gaussian concentration implies that almost all users i ∈ U1^j compute the same value of ⌊xi/2^j⌋ mod 4. This produces a histogram Ĥ1^j where most elements concentrate in a single bin. The analyst in turn narrows their search range for μ. For example, if Ĥ1^{Lmax} concentrates in 0, then the range narrows to μ ∈ [0, 2^{Lmax}); if Ĥ1^{Lmax−1} concentrates in 1, then the range narrows to μ ∈ [2^{Lmax−1}, 2^{Lmax}), and so on.

If instead some multiple of 2^j is near μ, the elements of Ĥ1^j will spread over multiple (adjacent) bins. This is also useful: a point from the "middle" of this block of bins is O(σ)-close to μ. The analyst thus takes such a point as μ̂1 and ends their search. Our analysis will also rely on having a noticeably low-count bin that is non-adjacent to the bin containing μ. This motivates using 4 as a modulus.

In this way, the analyst examines Ĥ1^{Lmax}, Ĥ1^{Lmax−1}, . . . in sequence, estimating μ from most to least significant bit. Crucially, the modulus structure of user responses enables the analyst to carry out this binary search with one round of interaction. Thus at the end of the first round the analyst obtains an O(σ)-accurate estimate μ̂1 of μ.

Algorithm 3 ESTMEAN
Input: β, ε, Ĥ1, k, L
1: ψ ← ((ε + 4)/2) · √(k ln(8L/β))
2: j ← Lmax
3: Ij ← [0, 2^{Lmax}]
4: while j ≥ Lmin and max_{a∈{0,1,2,3}} Ĥ1^j(a) ≥ 0.52k + ψ do
5:   Analyst computes M1(j) ← argmax_{a∈{0,1,2,3}} Ĥ1^j(a)
6:   Analyst computes integer c such that c·2^j ∈ Ij and c ≡ M1(j) mod 4
7:   Analyst computes Ij−1 ← [c·2^j, (c + 1)·2^j]
8:   j ← j − 1
9: end while
10: j ← max(j, Lmin)
11: Analyst computes M1(j) ← argmax_{a∈{0,1,2,3}} Ĥ1^j(a)
12: Analyst computes M2(j) ← argmax_{a∈{0,1,2,3}∖{M1(j)}} Ĥ1^j(a)
13: Analyst computes c∗ ← maximum integer such that c∗·2^j ∈ Ij and c∗ ≡ M1(j) or M2(j) mod 4
14: Analyst outputs μ̂1 ← c∗·2^j
Output: Initial estimate μ̂1 of μ

3.1.2 Second round of KVGAUSSTIMATE

In the second round, the analyst passes μ̂1 to users in U2. Users respond through KVRR2 ("Known Variance Randomized Response"), a privatized version of an algorithm from the distributed statistical estimation literature [6]. In KVRR2, each user centers their point with μ̂1, standardizes it using σ, and randomized responds on sgn((xi − μ̂1)/σ). This crucially relies on the first estimate μ̂1, as properly centering requires an initial O(σ)-accurate estimate of μ. The analyst can then aggregate these responses by a debiasing process KVAGG2 akin to KVAGG1.

Algorithm 4 KVRR2
Input: ε, i, μ̂1, σ
1: User i computes x′i ← (xi − μ̂1)/σ
2: User i computes yi ← sgn(x′i)
3: User i computes c ∼ U[0, 1]
4: if c ≤ e^ε/(e^ε + 1) then
5:   User i publishes ỹi ← yi
6: else
7:   User i publishes ỹi ← −yi
8: end if
Output: Private centered user estimate ỹi

Algorithm 5 KVAGG2
Input: ε, k, U
1: for a ∈ {−1, 1} do
2:   C(a) ← |{ỹi | i ∈ U, ỹi = a}|
3:   Ĥ(a) ← (e^ε + 1)/(e^ε − 1) · (C(a) − k/(e^ε + 1))
4: end for
5: Analyst outputs Ĥ
Output: Aggregated histogram Ĥ of private user responses

From this aggregation Ĥ2, the analyst obtains a good estimate of the bias of the initial estimate μ̂1. If μ̂1 < μ, responses will skew toward 1, and if μ̂1 > μ responses will skew toward −1. By comparing this skew to the true standard CDF using the error function erf, the analyst recovers a better final estimate μ̂2 of μ (Lines 12-13 of KVGAUSSTIMATE). Privacy of KVGAUSSTIMATE follows from the privacy of the randomized response mechanisms RR1 and KVRR2.

3.2 One-round Protocol 1ROUNDKVGAUSSTIMATE

Recall that in KVGAUSSTIMATE the analyst 1) employs user pool U1 to compute rough estimate μ̂1 and 2) adaptively refines this estimate using responses from the second user pool U2. 1ROUNDKVGAUSSTIMATE executes these two rounds of KVGAUSSTIMATE simultaneously by parallelization. 
Intuitively, it suffices that at least one subgroup centers using a μ̂1 near μ: the analyst can then use the data from that subgroup and discard the rest.

Theorem 3.2. One-round protocol 1ROUNDKVGAUSSTIMATE satisfies (ε, 0)-local differential privacy for x1, . . . , xn and, if x1, . . . , xn ∼iid N(μ, σ²) where σ is known and n/log(n) = Ω(log(μ) log(1/β)/ε²), with probability 1 − β outputs μ̂ such that |μ̂ − μ| = O((σ/ε)√(log(1/β)/n)·log^{1/4}(n)).

1ROUNDKVGAUSSTIMATE splits U2 into Θ(√log(n)) subgroups that run the second-round protocol from KVGAUSSTIMATE with different values of μ̂1. By Gaussian concentration, most user samples cluster within O(σ√log(n)) of μ, so each subgroup U2^j receives a set of points S(j) interspersed Θ(σ√log(n)) apart on the real line, and each user i ∈ U2^j centers using the point in S(j) closest to xi. This leads us to use Θ(√log(n)) groups with each point in S(j + 1) shifted Θ(σ) from the corresponding point in S(j). By doing so, we ensure that some subgroup has most of its users center using a point within O(σ) of μ.

In summary, 1ROUNDKVGAUSSTIMATE works as follows: after collecting the single round of responses from U1 and U2, the analyst computes μ̂1 using responses from U1. By comparing μ̂1 and S(j) for each j, the analyst then selects the subgroup U2^{j∗} where most users centered using a value in S(j∗) closest to μ̂1. This mimics the effect of adaptively passing μ̂1 to the users in U2^{j∗}, so the analyst simply processes the responses from U2^{j∗} as it processed responses from U2 in KVGAUSSTIMATE. Because U2^{j∗} contains Θ(n/√log(n)) users, the cost is a log^{1/4}(n) factor in accuracy.

4 Unknown Variance

In this section, we consider the more general problem with unknown variance σ² (shorthanded "UV") that lies in a known interval [σmin, σmax]. We again provide a two-round protocol UVGAUSSTIMATE and a slightly less accurate one-round protocol 1ROUNDUVGAUSSTIMATE.

4.1 Two-round Protocol

UVGAUSSTIMATE is structurally similar to KVGAUSSTIMATE. In round one, the analyst uses the responses of half of the users to roughly estimate μ, and in round two the analyst passes this estimate to the second half of users for improvement. However, two key differences now arise. First, since σ is unknown, the analyst must now also estimate σ in round one. Second, since the analyst does not have a very accurate estimate of σ, the refinement process of the second round employs Laplace noise rather than the CDF comparison used in KVGAUSSTIMATE.

Theorem 4.1. Two-round protocol UVGAUSSTIMATE satisfies (ε, 0)-local differential privacy for x1, . . . , xn and, if x1, . . .
, xn\u223ciid N(\u00b5, \u03c32) where \u03c3 is unknown but bounded in known[\u03c3min, \u03c3max]\nlog(n)= \u2126\u0004\u0002log\u0002 \u03c3max\n\u0004, with probability at least 1\u2212 \u03b2 outputs \u02c6\u00b5 such that\n\u0002\n\n\u03b2\u0002\n+1\u0002+log(\u00b5)\u0002 log\u0002 1\n\u02c6\u00b5\u2212 \u00b5= O\u0003 \u03c3\n\nlog(1~\u03b2) log(n)\n\n\u0003 .\n\nand\n\n\u03c3min\n\n\u03b52\n\nn\n\n\u03b5\n\nn\n\n1 do\n\nend for\n\nAlgorithm 6 UVGAUSSTIMATE\n\n2:\n3:\n4:\n5: end for\n\nInput: \u03b5, k1,L1, n, \u03c3, U1, U2\n1: for j\u2208L1 do\nfor user i\u2208 U j\nUser i outputs \u02dcyi\u2190 RR1(\u03b5, i, j)\n6: Analyst computes \u02c6H1\u2190 AGG1(\u03b5,L1, U1)\n7: Analyst computes \u02c6\u03c3\u2190 ESTVAR(\u03b2, \u03b5, \u02c6H1, k1,L1)\n8: Analyst computes \u02c6H2\u2190 KVAGG1(\u03b5, k1,L1, U1)\n9: Analyst computes \u02c6\u00b51\u2190 ESTMEAN(\u03b2, \u03b5, \u02c6H2, k1,L1)\n10: Analyst computes I\u2190[\u02c6\u00b51\u00b1 \u02c6\u03c3(2+\u0001\nln(4n))]\n11: for user i\u2208 U2 do\nUser i outputs \u02dcyi\u2190 UVRR2(\u03b5, i, I)\n14: Analyst outputs \u02c6\u00b52\u2190 2\nn\u2211i\u2208U2 \u02dcyi\n\n12:\n13: end for\n\nOutput: Analyst estimate \u02c6\u00b52 of \u00b5\n\n\u0016 End of round 1\n\n\u0016 End of round 2\n\n4.1.1 First round of UVGAUSSTIMATE\n\nAlso as in KVGAUSSTIMATE, each user i in each subgroup U j\n\nSimilarly to KVGAUSSTIMATE, we split U1 into L1=\u00e6n~(2k1)\u00e6 subgroups of size k1= \u2126\u0002 log(n~\u03b2)\n\u0002\nand de\ufb01ne Lmin =\u00e6log(\u03c3min)\u00e6, Lmax = L1+ Lmin\u2212 1\u2265\u0904log(\u03c3max)\u0905, andL1 ={Lmin, Lmin+\n1, . . . , Lmax}, indexing U1 byL1.\nof \u00e6xi~2j\u00e6 mod 4. The analyst aggregates them (KVAGG1) into \u02c6H2 and roughly estimates \u00b5\ntion (AGG1) into \u02c6H1 for estimating \u03c3 (ESTVAR). 
At a high level, because samples from N(\u00b5, \u03c32)\nprobably fall within 3\u03c3 of \u00b5, when 2j\u00e2 \u03c3 there exist a, a+ 1 mod 4\u2208{0, 1, 2, 3} such that almost\nall users i have\u00e6xi~2j\u00e6 mod 4\u2208{a, a+ 1}. The analyst\u2019s debiased aggregated histogram \u02c6H j\nconcentrates in at most two adjacent bins when 2j\u00e2 \u03c3 and spreads over more bins when 2j\u00e2 \u03c3.\n, . . . yields a rough estimate of when 2j\u00e2 \u03c3 versus when 2j\u00e2 \u03c3. As a result, at\nthe end of round one the analyst obtains O(\u03c3)-accurate estimates \u02c6\u03c3 of \u03c3 and \u02c6\u00b51 of \u00b5.\n\nBy a process like ESTMEAN, examining this transition from concentrated to unconcentrated in\n\u02c6H Lmax\n\n(ESTMEAN) as in KVGAUSSTIMATE. However, the analyst also employs a (similar) aggrega-\n\n1 publishes a privatized version\n\n, \u02c6H Lmax\u22121\n\n1 thus\n\n\u03b52\n\n1\n\n1\n\n7\n\n\f4.1.2 Second round of UVGAUSSTIMATE\n\nThe analyst now re\ufb01nes their initial estimate of \u00b5. First, the analyst constructs an interval I of size\n\n\u0001\nO(\u02c6\u03c3\nlog(n)) around \u02c6\u00b51. Users in U2 then truncate their values to I, add Laplace noise scaled to\nI (the sensitivity of releasing a truncated point), and send the result to the analyst using UVRR2.\n\nenough to cover most users (who would otherwise truncate too much and skew the responses) and\n\nThe analyst then simply takes the mean of these responses as the \ufb01nal estimate of \u00b5. Its accuracy\nguarantee follows from concentration of user samples around \u00b5 and Laplace noise around 0. Privacy\nfollows from our use of randomized response and Laplace noise.\nWe brie\ufb02y explain our use of Laplace noise rather than CDF comparison. Roughly, when using an\nestimate \u02c6\u03c3 in the centering process, error in \u02c6\u03c3 propagates to error in the \ufb01nal estimate \u02c6\u00b52. 
This leads\nus to Laplace noise, which better handles the error in \u02c6\u03c3 that estimation of \u03c3 introduces. The cost is the\n\n\u0001\nlog(n) factor that arises from adding Laplace noise scaled toI. Our choice ofI \u2014 constructed\nto contain not only \u00b5 but the points of \u2126(n) users \u2014 thus strikes a deliberate balance. I is both large\nsmall enough to not introduce much noise from privacy (as noise is scaled to Lap(I~\u03b5)).\nTheorem 4.2. One-round protocol 1ROUNDUVGAUSSTIMATE satis\ufb01es(\u03b5, 0)-local differential\nprivacy for x1, . . . , xn and, if x1, . . . , xn\u223ciid N(\u00b5, \u03c32) where \u03c3 is unknown but bounded in known\n[\u03c3min, \u03c3max] and\n\u0004, with probability at least 1\u2212 \u03b2 outputs \u02c6\u00b5\n\nWe now provide a one-round version of UVGAUSSTIMATE, 1ROUNDUVGAUSSTIMATE.\n\n4.2 One-round Protocol\n\nn\n\nwith\n\n\u03c3min\n\nlog(n)= \u2126\u0004\u0002log\u0002 \u03c3max\n\u00ef\u00ef\u00ef \u03c3\n\u02c6\u00b5\u2212 \u00b5= O\n\n\u03b5\n\n+1\u0002+log(\u00b5)\u0002 log\u0002 1\n\u03b2\u0002\n\u0004\n+1\u0003 log(1~\u03b2) log3~2(n)\n\nlog\u0003 \u03c3max\n\n\u03b52\n\n\u03c3min\n\n\u00ef\u00ef\u0017 .\n\nn\n\nLike 1ROUNDKVGAUSSTIMATE, 1ROUNDUVGAUSSTIMATE simulates the second round of\nUVGAUSSTIMATE simultaneously with its \ufb01rst round. 1ROUNDUVGAUSSTIMATE splits U2 into\nsubgroups, where each subgroup responds using a different interval Ij. At the end of the single\nround the analyst obtains estimates \u02c6\u00b51 and \u02c6\u03c3 from users in U1, constructs an interval I from these\nestimates, and \ufb01nds a subgroup of U2 where most users employed a similar interval Ij. This similarity\nguarantees that the subgroup\u2019s responses yield the same accuracy as the two-round case up to an\n\nthe modulus trick to minimize the number of subgroups. However, this time we parallelize not only\nover possible values of \u02c6\u00b51 but possible values of \u02c6\u03c3 as well. 
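Returning to the mechanism both unknown-variance protocols share, the truncate-then-Laplace release performed by UVRR2 can be sketched as follows. This is our illustration with an assumed fixed interval; in the protocol the interval comes from the round-one estimates μ̂1 and σ̂.

```python
import math
import random

def uvrr2_style_release(x, lo, hi, eps, rng):
    """Clamp x to [lo, hi], then add Laplace noise with scale (hi - lo)/eps,
    since hi - lo is the sensitivity of the clamped value."""
    clamped = max(lo, min(hi, x))
    u = rng.random() - 0.5  # inverse-CDF sample of Laplace(0, (hi-lo)/eps)
    noise = -((hi - lo) / eps) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return clamped + noise

rng = random.Random(0)
mu, sigma, eps, m = 10.0, 1.0, 1.0, 100_000
lo, hi = mu - 5.0, mu + 5.0  # stand-in for I = [mu1 ± sigma_hat*(2 + sqrt(ln 4n))]
reports = [uvrr2_style_release(rng.gauss(mu, sigma), lo, hi, eps, rng)
           for _ in range(m)]
mu2 = sum(reports) / m  # the analyst's final estimate is the plain mean
```

Because the interval covers almost all samples, truncation introduces negligible bias, and the zero-mean Laplace noise averages out, so `mu2` lands near the true mean at the √log(n)-inflated rate discussed above.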
As this parallelization is somewhat involved, we defer its presentation to the full version of this paper [15].

In summary, at the end of the round the analyst computes μ̂1 and σ̂, computes the resulting interval I∗, and identifies a subgroup of U2 that responded using an interval Ij similar to I∗. This mimics the effect of passing an interval of size O(σ√log(n)) around μ̂1 to this subgroup and using the truncate-then-Laplace noise method of UVGAUSSTIMATE. The cost, due to the g = O((log(σmax/σmin) + 1)√log(n)) subgroups required, is the 1/√g reduction in accuracy shown in Theorem 4.2.

5 Lower Bound

We now show that all of our upper bounds are tight up to logarithmic factors. Our argument has three steps: we first reduce our estimation problem to a testing problem, then reduce this testing problem to a purely locally private testing problem, and finally prove a lower bound for this purely locally private testing problem. Taken together, these results show that estimation is hard for sequentially interactive (ε, δ)-locally private protocols. An extension to fully interactive protocols using recent subsequent work by Joseph et al. [16] appears in the full version of this paper [15].

Theorem 5.1. Let δ < min(εβ/(60n ln(5n/2β)), β/(16n ln(n/β)e^{7ε})) and ε > 0. There exists an absolute constant c such that if A is an (ε, δ)-locally private (α, β)-estimator for Estimate(n, M, σ) where M = σ/(4(e^ε − 1)√(2nc)) and β < 1/16, then α ≥ M/2 = Ω((σ/ε)√(1/n)).

References

[1] John M. Abowd. 
The challenge of scientific reproducibility and privacy protection for statistical agencies. Technical report, Census Scientific Advisory Committee, 2016.

[2] Apple Differential Privacy Team. Learning with privacy at scale. Technical report, Apple, 2017.

[3] Brendan Avent, Aleksandra Korolova, David Zeber, Torgeir Hovden, and Benjamin Livshits. BLENDER: Enabling local search with a hybrid differential privacy model. In USENIX Security Symposium, 2017.

[4] Amos Beimel, Kobbi Nissim, and Eran Omri. Distributed private data analysis: Simultaneously solving how and what. In International Cryptology Conference (CRYPTO), 2008.

[5] Andrea Bittau, Úlfar Erlingsson, Petros Maniatis, Ilya Mironov, Ananth Raghunathan, David Lie, Mitch Rudominer, Ushasree Kode, Julien Tinnes, and Bernhard Seefeld. Prochlo: Strong privacy for analytics in the crowd. In Symposium on Operating Systems Principles (SOSP), 2017.

[6] Mark Braverman, Ankit Garg, Tengyu Ma, Huy L. Nguyen, and David P. Woodruff. Communication lower bounds for statistical estimation problems via a distributed data processing inequality. In Symposium on the Theory of Computing (STOC), 2016.

[7] Amit Daniely and Vitaly Feldman. Learning without interaction requires separation. In Neural Information Processing Systems (NeurIPS), 2019.

[8] Bolin Ding, Janardhan Kulkarni, and Sergey Yekhanin. Collecting telemetry data privately. In Neural Information Processing Systems (NIPS), 2017.

[9] John Duchi and Ryan Rogers. Lower bounds for locally private estimation via communication complexity. In Conference on Learning Theory (COLT), 2019.

[10] John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. Local privacy and statistical minimax rates. In Foundations of Computer Science (FOCS), 2013.

[11] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis.
In Theory of Cryptography Conference (TCC), 2006.

[12] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 2014.

[13] Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Conference on Computer and Communications Security (CCS), 2014.

[14] Marco Gaboardi, Ryan Rogers, and Or Sheffet. Locally private mean estimation: Z-test and tight confidence intervals. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2019.

[15] Matthew Joseph, Janardhan Kulkarni, Jieming Mao, and Zhiwei Steven Wu. Locally private Gaussian estimation. arXiv preprint arXiv:1811.08382, 2019.

[16] Matthew Joseph, Jieming Mao, Seth Neel, and Aaron Roth. The role of interactivity in local differential privacy. In Foundations of Computer Science (FOCS), 2019.

[17] Gautam Kamath, Jerry Li, Vikrant Singhal, and Jonathan Ullman. Privately learning high-dimensional distributions. In Conference on Learning Theory (COLT), 2019.

[18] Vishesh Karwa and Salil Vadhan. Finite sample differentially private confidence intervals. In Innovations in Theoretical Computer Science Conference (ITCS), 2018.

[19] Shiva Prasad Kasiviswanathan, Homin K. Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. What can we learn privately? SIAM Journal on Computing, 2011.

[20] Yu-Hsuan Kuo, Cho-Chun Chiu, Daniel Kifer, Michael Hay, and Ashwin Machanavajjhala. Differentially private hierarchical count-of-counts histograms. In International Conference on Very Large Databases (VLDB), 2018.

[21] Maxim Raginsky. Strong data processing inequalities and φ-Sobolev inequalities for discrete channels. IEEE Transactions on Information Theory, 62(6):3355–3389, 2016.

[22] Adam Smith, Abhradeep Thakurta, and Jalaj Upadhyay.
Is interaction necessary for distributed private learning? In Symposium on Security and Privacy (SP), 2017.

[23] Jonathan Ullman. Tight lower bounds for locally differentially private selection. arXiv preprint arXiv:1802.02638, 2018.

[24] Salil Vadhan. The complexity of differential privacy. In Tutorials on the Foundations of Cryptography, pages 347–450. Springer, 2017.