{"title": "Minimax Optimal Estimation of Approximate Differential Privacy on Neighboring Databases", "book": "Advances in Neural Information Processing Systems", "page_first": 2417, "page_last": 2428, "abstract": "Differential privacy has become a widely accepted notion of privacy, leading to the introduction and deployment of numerous privatization mechanisms. However, ensuring the privacy guarantee is an error-prone process, both in designing mechanisms and in implementing those mechanisms. Both types of errors would be greatly reduced if we had a data-driven approach to verify privacy guarantees from black-box access to a mechanism. We pose this as a property estimation problem, and study the fundamental trade-off between the accuracy of the estimated privacy guarantee and the number of samples required. We introduce a novel estimator that uses a polynomial approximation of a carefully chosen degree to optimally trade off bias and variance. With n samples, we show that this estimator achieves the performance of a straightforward plug-in estimator with n log(n) samples, a phenomenon referred to as effective sample size amplification. The minimax optimality of the proposed estimator is proved by comparing it to a matching fundamental lower bound.", "full_text": "Minimax Optimal Estimation of Approximate
Differential Privacy on Neighboring Databases

Xiyang Liu    Sewoong Oh

Allen School of Computer Science and Engineering,
University of Washington
{xiyangl, sewoong}@cs.washington.edu

Abstract

Differential privacy has become a widely accepted notion of privacy, leading to the introduction and deployment of numerous privatization mechanisms. However, ensuring the privacy guarantee is an error-prone process, both in designing mechanisms and in implementing those mechanisms. Both types of errors would be greatly reduced if we had a data-driven approach to verify privacy guarantees from black-box access to a mechanism. 
We pose this as a property estimation problem, and study the fundamental trade-off between the accuracy of the estimated privacy guarantee and the number of samples required. We introduce a novel estimator that uses a polynomial approximation of a carefully chosen degree to optimally trade off bias and variance. With n samples, we show that this estimator achieves the performance of a straightforward plug-in estimator with n ln n samples, a phenomenon known as sample size amplification. The minimax optimality of the estimator is proved by comparing it to a matching fundamental lower bound.

1 Introduction

Differential privacy is gaining popularity as an agreed-upon measure of privacy leakage, widely used by the government to publish Census statistics [1], by Google to aggregate users' choices in web-browser features [2, 3], by Apple to aggregate mobile user data [4], and by smart meters in telemetry [5]. As an increasing number of privatization mechanisms are introduced and deployed in the wild, it is critical to have countermeasures to check the fidelity of those mechanisms. Such techniques will allow us to hold the deployment of privatization mechanisms accountable if the claimed privacy guarantees are not met, and help us find and fix bugs in implementations of those mechanisms.
A user-friendly tool for checking privacy guarantees is necessary for several reasons. Writing a program for a privatization mechanism is error-prone, as it involves complex probabilistic computations. Even with customized languages for differential privacy, checking the end-to-end privacy guarantee of an implementation remains challenging [6, 7]. Furthermore, even when the implementation is error-free, there have been several cases where the mechanism designers made errors in calculating the privacy guarantees, and falsely reported a higher level of privacy [8, 9]. 
This is evidence of an alarming issue: analytically checking the proof of a privacy guarantee is a challenging process even for an expert. An automated and data-driven algorithm for checking privacy guarantees would significantly reduce such errors in both the implementation and the design. In other cases, we are given very limited information about how the mechanism works, as with Apple's white paper [4]. The users are left to trust the claimed privacy guarantees.
To address these issues, we propose a data-driven approach to estimate how much privacy is guaranteed, from black-box access to a purportedly private mechanism. Our approach is based on an optimal polynomial approximation that gracefully trades off bias and variance. We study the fundamental limit of how many samples are necessary to achieve a desired level of accuracy in the estimation, and show that the proposed approach achieves this fundamental bound.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

Problem formulation. Differential privacy (DP), introduced in [10], is a formal mathematical notion of privacy that is widespread, due to several key advantages. It gives one of the strongest guarantees, allows for precise mathematical analyses, and is intuitive to explain even to non-technical end-users. When accessing a database through a query, we say the query output is private if the output does not reveal whether a particular person's entry is in the database or not. Formally, we say two databases are neighboring if they differ in only one entry (one row in a table, for example). Let P_{Q,D} denote the distribution of the randomized output to a query Q on a database D. We consider discrete-valued mechanisms taking one of S values, i.e. the response to a query is in [S] = {1, . . . , S} for some integer S. 
We say a mechanism guarantees (ε, δ)-DP [10] if

P_{Q,D}(E) ≤ e^ε P_{Q,D′}(E) + δ ,   (1)

for some ε ≥ 0, δ ∈ [0, 1], for all subsets E ⊆ [S], and for all neighboring databases D and D′. When δ = 0, (ε, 0)-DP is referred to as (pure) differential privacy, and the general case of δ ≥ 0 is referred to as approximate differential privacy. For pure DP, the above condition can be relaxed as

P_{Q,D}(x) ≤ e^ε P_{Q,D′}(x) ,   (2)

for all output symbols x ∈ [S] and for all neighboring databases D and D′. This condition can now be checked one symbol x at a time from [S], without having to enumerate all subsets E ⊆ [S]. This naturally leads to the following algorithm.
For a query Q and two neighboring databases D and D′ of interest, we need to verify the condition in Eq. (2). As we only have black-box access to the mechanism, we collect n responses from the mechanism on the two databases. We check the condition on the empirical distribution of those collected samples, for each x ∈ [S]. If it is violated for any x, we assert the mechanism to be not (ε, 0)-DP and present x as evidence. Focusing only on pure DP, [11] proposed an approach similar to this, where they also give guidelines for choosing the databases D and D′ to test. However, their approach is only evaluated empirically, no statistical analysis is provided, and the more general case of approximate DP is left as an open question, as the condition in Eq. (1) cannot be decoupled like Eq. (2) when δ > 0.
We propose an alternative approach from first principles to check general approximate DP guarantees, and prove its minimax optimality. Given two probability measures P = [p_1, . . . , p_S] and Q = [q_1, . . . , q_S] over [S] = {1, . . . , S}, we define the following approximate DP divergence with respect to ε:

d_ε(P‖Q) ≜ Σ_{i=1}^S [p_i − e^ε q_i]_+ = E_{x∼P}[ [1 − e^ε q_x / p_x]_+ ] ,   (3)

where [x]_+ = max{x, 0}. The last representation indicates that this metric falls under a broader class of metrics known as f-divergences, with the special choice f(x) = [1 − e^ε x]_+. From the definition of DP, it follows that a mechanism is (ε, δ)-DP if and only if d_ε(P_{Q,D}‖P_{Q,D′}) ≤ δ for all neighboring databases D and D′. We propose estimating this divergence d_ε(P_{Q,D}‖P_{Q,D′}) from samples, and comparing it to the target δ. This only requires a number of operations scaling as S ln n, where n is the sample size.
In this paper, we suppose there is a specific query Q of interest, and two neighboring databases D and D′ have already been selected, either by a statistician who has some side information on the structure of the mechanism or by some algorithm, such as those from [11, 12, 13]. Without exploiting the structure (such as symmetry, exchangeability, or invariance to the entries of the database conditioned on the true output of the query), one cannot avoid having to check all possible combinations of neighboring databases. As a remedy, [12] proposes checking randomly selected databases. This in turn ensures a relaxed notion of privacy known as random differential privacy. Similarly, [13] proposed checking the typical databases, assuming we have access to a prior distribution over the databases. Our framework can be seamlessly incorporated with such higher-level routines for selecting databases.
Contributions. We study the problem of estimating the approximate differential privacy guaranteed by a mechanism, from black-box access where we can sample from the mechanism output given a query Q, a database D, and a target (ε, δ). 
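As a concrete reference point, the divergence in Eq. (3) can be evaluated directly on two probability vectors. The following minimal Python sketch (ours for illustration, not the paper's released code) computes the plug-in value:

```python
import math

def d_eps(P, Q, eps):
    """Plug-in evaluation of d_eps(P || Q) = sum_i [p_i - e^eps * q_i]_+ ."""
    return sum(max(p - math.exp(eps) * q, 0.0) for p, q in zip(P, Q))

# A mechanism is (eps, delta)-DP only if d_eps stays at or below delta on
# every pair of neighboring databases; here P and Q stand for the output
# distributions P_{Q,D} and P_{Q,D'} on one such pair.
P = [0.5, 0.3, 0.2]
Q = [0.2, 0.3, 0.5]
delta_hat = d_eps(P, Q, eps=0.4)
```

Comparing `delta_hat` against the target δ implements the check described above, once P and Q are replaced by estimates obtained from samples.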
We first show that a straightforward plug-in estimator of d_ε(P‖Q) achieves a mean squared error scaling as (e^ε S)/n, where S is the size of the alphabet and n is the number of samples used (Section 2.1.1).
In the regime where we fix S and increase the sample size, this achieves the parametric rate of 1/n, and cannot be improved upon. However, in many cases of practical interest where S is comparable to n, we show that this can be improved upon with a more sophisticated estimator. To this end, we introduce a novel estimator of d_ε(P‖Q). The main idea is to identify the regime of non-smoothness in [p_i − e^ε q_i]_+, where the plug-in estimator has a large bias. We replace the function there by the uniformly best polynomial approximation over the non-smooth regime, and estimate that polynomial from samples. By selecting an appropriate degree for the polynomial, we can optimally trade off the bias and variance. We provide an upper bound on the error scaling as (e^ε S)/(n ln n) when S and n are comparable. We prove that this is the best one can hope for, by providing a matching lower bound.
We first show this for the case where we know P and sample from Q in Section 2.1, to explain the main technical insights while keeping the exposition simple. Then, we consider the practical scenario where both P and Q are accessed via samples, and provide a minimax optimal estimator in Section 2.2. This phenomenon is referred to as effective sample size amplification: one can achieve with n samples a desired error rate that would require n ln n samples for a plug-in estimator. We present numerical experiments supporting our theoretical predictions in Section 3.
Related work. A formal investigation into verifying DP guarantees of a given mechanism was addressed in [13]. The DP condition is translated into a certain Lipschitz condition on P_{Q,D} over the databases D, and a Lipschitz tester is proposed to check the conditions. 
However, this approach is not data-driven, as it requires knowledge of the distribution P_{Q,D} and no sampling of the mechanism outputs is involved. [12] analyzes the trade-offs involved in testing DP guarantees. It is shown that one cannot get accurate testing without sacrificing the privacy of the databases used in the testing. Hence, when testing DP guarantees, one should not use databases that contain sensitive data. We compare some of the techniques involved in Section 2.1.1.
Our techniques are inspired by a long line of research in property estimation of a distribution from samples. There have been significant recent advances for high-dimensional estimation problems, starting from entropy estimation in [14, 15, 16]. The general recipe is to identify the regime where the property to be estimated is not smooth, and use functional approximation to estimate a smoothed version of the property. This has been widely successful in support recovery [17], density estimation with ℓ1 loss [18], and estimating Renyi entropy [19]. More recently, this technique has been applied to estimate certain divergences between two unknown distributions, for Kullback-Leibler divergence [20], total variation distance [21], and identity testing [22]. With carefully designed estimators, these approximation-based approaches can achieve an improvement over the typical parametric error rate of 1/n, a phenomenon sometimes referred to as effective sample size amplification.
Notations. We let the alphabet of a discrete distribution be [S] = {1, . . . , S} for some positive integer S denoting the size of the alphabet. We let M_S denote the set of probability distributions over [S]. We use f(n) ≳ g(n) to denote that sup_n f(n)/g(n) ≥ C for some constant C, and f(n) ≲ g(n) is analogously defined. 
f(n) ≍ g(n) denotes that f(n) ≳ g(n) and f(n) ≲ g(n).

2 Estimating differential privacy guarantees from samples

We want to estimate d_ε(P‖Q) from black-box access to the mechanism outputs on two databases, i.e. P = P_{Q,D} and Q = P_{Q,D′}. We first consider a simpler case, where P = [p_1, . . . , p_S] is known and we observe samples from an unknown distribution Q = [q_1, . . . , q_S], in Section 2.1. We cover this simpler case first to demonstrate the main ideas in the algorithm design and analysis techniques while keeping the exposition simple. This paves the way for our main algorithmic and theoretical results in Section 2.2, where we only have access to samples from both P and Q.

2.1 Estimating d_ε(P‖Q) with known P
For a given budget n, representing an upper bound on the expected number of samples we can collect, we propose sampling a random number N of samples from a Poisson distribution with mean n, i.e. N ∼ Poi(n). Then each sample X_j ∈ [S] is drawn from Q for j ∈ {1, . . . , N}, and we let Q_n = [q̂_1, . . . , q̂_S] denote the resulting histogram divided by n, such that q̂_i ≜ |{j ∈ [N] : X_j = i}|/n. Note that Q_n is not the standard empirical distribution, as Σ_i q̂_i ≠ 1 with high probability; the empirical distribution would have been divided by N instead of n. However, in this paper we refer to Q_n as the empirical distribution of the samples. Instead, Q_n is the maximum likelihood estimate of the true distribution Q. This Poisson sampling, together with the MLE construction of Q_n, ensures independence among {q̂_i}_{i=1}^S, making the analysis simpler.

2.1.1 Performance of the plug-in estimator
The following result shows that it is necessary and sufficient to have n ≈ e^ε S samples to achieve an arbitrary desired error rate, if we use this plug-in estimator d_ε(P‖Q_n), under the worst-case P and Q. 
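Before stating the guarantee, note that the Poissonized sampling scheme above is easy to simulate. A hypothetical sketch (ours, not the authors' code) uses the equivalent fact that under Poisson sampling each scaled count n·q̂_i is an independent Poi(n·q_i) variable:

```python
import math
import random

def poisson(lam, rng):
    # Knuth's Poisson sampler; adequate for the moderate rates n * q_i used here
    # (for very large rates a normal approximation would be preferable).
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def poissonized_empirical(Q, n, rng):
    # Draw N ~ Poi(n) samples from Q and divide the histogram by n (not N).
    # Equivalently, each coordinate satisfies n * q_hat_i ~ Poi(n * q_i)
    # independently, which is the independence property the analysis exploits.
    return [poisson(n * q_i, rng) / n for q_i in Q]

rng = random.Random(0)
Qn = poissonized_empirical([0.2, 0.3, 0.5], n=500, rng=rng)
# Qn typically sums to a value near, but not exactly, 1.
```

The coordinate-wise Poisson view is what makes the per-symbol bias and variance calculations below decouple across i.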
Some assumption on (P, Q) is inevitable, as it is trivial to achieve zero error for any sample size in some cases: for example, if P and Q have disjoint supports, both d_ε(P‖Q) and d_ε(P‖Q_n) are 1 with probability one. We provide a proof in Appendix C.2. The bound in Eq. (4) also holds for d_ε(P_n‖Q).
Theorem 1. For any ε ≥ 0 and support size S ∈ Z⁺, if n ≥ e^ε S, then the plug-in estimator satisfies

sup_{P,Q∈M_S} E_Q[ |d_ε(P‖Q_n) − d_ε(P‖Q)|² ] ≍ e^ε S / n .   (4)

A similar analysis was done in [12], which gives an upper bound scaling as e^{2ε}S/n. We tighten the analysis by a factor of e^ε, and provide a matching lower bound.

2.1.2 Achieving optimal sample complexity with a polynomial approximation

We construct a minimax optimal estimator using techniques first introduced in [16, 15] and adopted in several property estimation problems, including [18, 20, 19, 21, 22, 17].

Algorithm 1 Differential Privacy (DP) estimator with known P
Input: target privacy ε ∈ R⁺, query Q, neighboring databases (D, D′), pmf of P_{Q,D}, samples from P_{Q,D′}, degree K ∈ Z⁺, constants c₁, c₂ ∈ R⁺, expected sample size 2n
Output: estimate d̂_{ε,K,c₁,c₂}(P‖Q_n) of d_ε(P_{Q,D}‖P_{Q,D′})
  P ← P_{Q,D}
  Draw two independent sample sizes: N₁ ← Poi(n) and N₂ ← Poi(n)
  Sample from P_{Q,D′}: {X_{i,1}}_{i=1}^{N₁} ∈ [S]^{N₁} and {X_{i,2}}_{i=1}^{N₂} ∈ [S]^{N₂}
  q̂_{i,j} ← |{ℓ ∈ [N_j] : X_{ℓ,j} = i}|/n for all i ∈ [S] and j ∈ {1, 2}
  Q_{n,1} ← [q̂_{1,1}, . . . , q̂_{S,1}] and Q_{n,2} ← [q̂_{1,2}, . . . , q̂_{S,2}]
  for i = 1 to S do
    δ_i ← 0 ,  if q̂_{i,1} > U(p_i; c₁, c₂)
    δ_i ← D̃_K(q̂_{i,2}; p_i) (defined in Appendix A) ,  if q̂_{i,1} ∈ U(p_i; c₁, c₂)
    δ_i ← [p_i − e^ε q̂_{i,2}]_+ ,  if q̂_{i,1} < U(p_i; c₁, c₂)
  end for
  d̂_{ε,K,c₁,c₂}(P‖Q_n) ← 0 ∨ (1 ∧ Σ_{i=1}^S δ_i)

To simplify the analysis, we split the samples randomly into two partitions, each having an independent and identical distribution of Poi(n) samples from the multinomial distribution Q. We let Q_{n,1} = [q̂_{1,1}, . . . , q̂_{S,1}] denote the counts of the first set of N₁ ∼ Poi(n) samples (normalized by n), and Q_{n,2} = [q̂_{1,2}, . . . , q̂_{S,2}] those of the second set of N₂ ∼ Poi(n) samples. See Algorithm 1 for a formal definition. Note that for the analysis we collect 2n samples in total on average. In all the experiments, however, we apply our estimator without partitioning the samples. A major challenge in achieving minimax optimality is handling the non-smoothness of the function f(q̂_i; p_i) ≜ [p_i − e^ε q̂_i]_+ at p_i ≃ e^ε q̂_i. We use one set of samples to identify whether an outcome i ∈ [S] is in the smooth regime (q̂_{i,1} ∉ U(p_i; c₁, c₂)) or not (q̂_{i,1} ∈ U(p_i; c₁, c₂)), with an appropriately defined set function:

U(p; c₁, c₂) ≜ [0, (c₁+c₂) ln n / n] ,  if p ≤ c₁ e^ε ln n / n ,
U(p; c₁, c₂) ≜ [ e^{−ε}p − √(c₂ e^{−ε} p ln n / n) , e^{−ε}p + √(c₂ e^{−ε} p ln n / n) ] ,  otherwise,   (5)

for c₁ ≥ c₂ > 0 and p ∈ [0, 1]. 
The scaling of the interval is chosen carefully so that (a) it is large enough for the probability of mistaking which regime (p_i, q_i) falls into to vanish (Lemma 13); and (b) it is small enough for the variance of the polynomial approximation in the non-smooth regime to match that of the other regimes (Lemma 14). In the smooth regime, we use the plug-in estimator. In the non-smooth regime, we can improve the estimation error by using the best polynomial approximation of f(x; p) = [p − e^ε x]_+, which has a smaller bias:

D_K(x; p) ≜ arg min_{P∈poly_K} max_{x̃∈U(p;c₁,c₁)} | [p − e^ε x̃]_+ − P(x̃) | ,   (6)

where poly_K is the set of polynomial functions of degree at most K, and we approximate f(x; p) on an interval U(p; c₁, c₁) ⊃ U(p; c₁, c₂) for any c₁ > c₂. Having this slack between c₁ and c₂ in the approximation allows us to guarantee the approximation quality even if the actual q is not exactly in the non-smooth regime U(p; c₁, c₂). Once we have the polynomial approximation, we estimate the polynomial function D_K(x; p) from samples, using the uniformly minimum variance unbiased estimator (MVUE).
There are several advantages that make this two-step process attractive. As we use an unbiased estimate of the polynomial, the bias is exactly the polynomial approximation error of D_K(x; p), which scales as (1/K)√((p_i ln n)/n). A larger degree K reduces the approximation error, and a larger n shrinks the interval U(p; c₁, c₁) over which we apply the approximation (Lemma 14). The variance is due to the sample estimation of the polynomial D_K(x; p), which scales as (B^K p_i ln n)/n for some universal constant B (Lemma 14). A larger degree K increases the variance. We prescribe choosing K = c₃ ln n for an appropriate constant c₃ to optimize the bias-variance trade-off in Algorithm 1. 
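To make the approximation step concrete, the sketch below builds a degree-K Chebyshev interpolant of f(x; p) = [p − e^ε x]_+ on an interval around its kink. Chebyshev interpolation is a near-minimax stand-in for the exact best polynomial D_K(x; p) (the paper notes that a Chebyshev expansion achieves the same uniform error rate); the interval endpoints and degree here are illustrative, not the tuned constants:

```python
import math

def cheb_interp(f, deg, a, b):
    """Chebyshev interpolation of f on [a, b]: uniform error within a small
    factor of the best degree-`deg` polynomial approximation."""
    N = deg + 1
    nodes = [math.cos((2 * k + 1) * math.pi / (2 * N)) for k in range(N)]
    vals = [f(0.5 * (b - a) * t + 0.5 * (a + b)) for t in nodes]
    # Chebyshev coefficients via the discrete cosine-type orthogonality formula.
    coef = [2.0 / N * sum(vals[k] * math.cos(j * (2 * k + 1) * math.pi / (2 * N))
                          for k in range(N)) for j in range(N)]
    coef[0] /= 2.0

    def poly(x):
        t = max(-1.0, min(1.0, (2 * x - (a + b)) / (b - a)))
        return sum(c * math.cos(j * math.acos(t)) for j, c in enumerate(coef))

    return poly

# Approximate [p - e^eps * x]_+ near its kink at x = p * e^{-eps}.
eps, p = 0.4, 0.01
f = lambda x: max(p - math.exp(eps) * x, 0.0)
poly = cheb_interp(f, deg=12, a=0.0, b=0.02)
```

Evaluating `poly` at the sampled q̂_i (via an unbiased estimator of each monomial, as in the algorithm) replaces the biased plug-in term inside the non-smooth interval.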
The methods for constructing the polynomial approximation D_K(x; p) and the corresponding unbiased estimator D̃_K(x; p) are described in detail in Appendix A.
Theorem 2. Suppose ln n ≤ C′(ln S − ε) for some constant C′; then there exist constants c₁, c₂ and c₃ that depend only on C′ and ε such that

sup_{P,Q∈M_S} E_Q[ |d̂_{ε,K,c₁,c₂}(P‖Q_n) − d_ε(P‖Q)|² ] ≲ e^ε S / (n ln n) ,   (7)

for K = c₃ ln n, where d̂_{ε,K,c₁,c₂} is defined in Algorithm 1.

We provide a proof in Appendix C.3, and a matching lower bound in Theorem 3. Note that the plug-in estimator in Theorem 1 achieves the parametric rate of 1/n. In the low-dimensional regime, where we fix S and grow n, this cannot be improved upon. To go beyond the parametric rate, we need to consider a high-dimensional regime, where S grows with n. Hence, a condition similar to ln n ≤ C′ ln S is necessary, although it might be possible to relax it further.

2.1.3 Matching minimax lower bound
In the high-dimensional regime, where S grows with n sufficiently fast, we can get a tighter lower bound than Theorem 1, which matches the upper bound in Theorem 2. Again, the supremum over Q is necessary, as there exist (P, Q) for which it is trivial to achieve zero error for any sample size (see Section 2.1.1 for an example). For any given P, we provide a minimax lower bound in the following. A proof is provided in Appendix C.4.
Theorem 3. Suppose S ≥ 2 and there exist constants c, C₁, C₂ > 0 such that C₁ ln S ≤ ln n ≤ C₂ ln S and n ≥ c(e^ε S)/ln S; then

inf_{d̂_ε(P‖Q_n)} sup_{P∈M_S} sup_{Q∈M_S} E_Q[ |d̂_ε(P‖Q_n) − d_ε(P‖Q)|² ] ≳ e^ε S / (n ln n) ,   (8)

where the infimum is taken over all possible estimators.

2.2 Estimating d_ε(P‖Q) from samples
We now consider the general case where P = P_{Q,D} and Q = P_{Q,D′} are both unknown, and we access them through samples. We propose sampling a random number of samples N₁ ∼ Poi(n) and N₂ ∼ Poi(n) from each distribution, respectively. Define the empirical distributions P_n = [p̂_1, . . . , p̂_S] and Q_n = [q̂_1, . . . , q̂_S] as in the previous section. From the proof of Theorem 1, we get the same sample complexity for the plug-in estimator: if n ≥ e^ε S and S ≥ 2, we have

sup_{P,Q∈M_S} E[ |d_ε(P_n‖Q_n) − d_ε(P‖Q)|² ] ≍ e^ε S / n .   (9)

Using the same two-step process, we construct an estimator that improves upon this parametric rate of the plug-in estimator.

2.2.1 Estimator for d_ε(P‖Q)
We present an estimator using techniques similar to those in Algorithm 1, but there are several challenges in moving to a multivariate case. The multivariate function f(x, y) = [x − e^ε y]_+ is non-smooth in the region x = e^ε y. We first define a two-dimensional non-smooth set U(c₁, c₂) ⊂ [0, 1] × [0, e^ε] as

U(c₁, c₂) = { (p, e^ε q) : |p − e^ε q| ≤ √((c₁ + c₂) ln n / n) (√p + √(e^ε q)), p ∈ [0, 1], q ∈ [0, 1] } ,   (10)

where 0 < c₂ < c₁. As before, the plug-in estimator is good enough in the smooth regime, i.e. (p, e^ε q) ∉ U(c₁, c₂).
We construct a polynomial approximation of this function with order K in this non-smooth regime. We will set K = c₃ ln n again to achieve the optimal trade-off. 
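The regime-identification step reduces to a one-line test per symbol against the set in Eq. (10). A hypothetical sketch (ours; the constants c₁, c₂ are placeholders, not the paper's tuned values):

```python
import math

def in_nonsmooth_region(p_hat, q_hat, eps, n, c1=4.0, c2=2.0):
    """True iff (p_hat, e^eps * q_hat) lies in the non-smooth set U(c1, c2)
    of Eq. (10), where the plain plug-in term [p_hat - e^eps * q_hat]_+
    would be badly biased."""
    eq = math.exp(eps) * q_hat
    radius = (math.sqrt((c1 + c2) * math.log(n) / n)
              * (math.sqrt(p_hat) + math.sqrt(eq)))
    return abs(p_hat - eq) <= radius

# Symbols with p_hat close to e^eps * q_hat get the polynomial treatment;
# well-separated symbols are handled by the plug-in term.
```

Note the radius shrinks like √(ln n / n), so for fixed (p, q) off the boundary, the classification is eventually correct with high probability.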
We split the samples randomly into four partitions, each having an independent and identical distribution of Poi(n) samples, two from the multinomial distribution P and the other two from Q. See Algorithm 2 for a formal definition. We use one set of samples to identify the regime, and the other for estimation. We give a full description and justification of the algorithm in the longer version of this paper [23].

Algorithm 2 Differential Privacy (DP) estimator
Input: target privacy ε ∈ R⁺, query Q, neighboring databases (D, D′), samples from P_{Q,D} and P_{Q,D′}, degree K ∈ Z⁺, constants c₁, c₂ ∈ R⁺, expected sample size 2n
Output: estimate d̂_{ε,K,c₁,c₂}(P_n‖Q_n) of d_ε(P_{Q,D}‖P_{Q,D′})
  P ← P_{Q,D}, Q ← P_{Q,D′}
  Draw four independent sample sizes: N_{1,1}, N_{1,2}, N_{2,1}, N_{2,2} ∼ Poi(n)
  Sample from P_{Q,D}: {X_{i,1}}_{i=1}^{N_{1,1}} ∈ [S]^{N_{1,1}} and {X_{i,2}}_{i=1}^{N_{1,2}} ∈ [S]^{N_{1,2}}
  Sample from P_{Q,D′}: {Y_{i,1}}_{i=1}^{N_{2,1}} ∈ [S]^{N_{2,1}} and {Y_{i,2}}_{i=1}^{N_{2,2}} ∈ [S]^{N_{2,2}}
  p̂_{i,j} ← |{ℓ ∈ [N_{1,j}] : X_{ℓ,j} = i}|/n and q̂_{i,j} ← |{ℓ ∈ [N_{2,j}] : Y_{ℓ,j} = i}|/n for all i ∈ [S] and j ∈ {1, 2}
  P_{n,1} ← [p̂_{1,1}, . . . , p̂_{S,1}], P_{n,2} ← [p̂_{1,2}, . . . , p̂_{S,2}], Q_{n,1} ← [q̂_{1,1}, . . . , q̂_{S,1}] and Q_{n,2} ← [q̂_{1,2}, . . . , q̂_{S,2}]
  for i = 1 to S do
    δ_i ← 0 ,  if p̂_{i,1} − e^ε q̂_{i,1} < −√((c₁+c₂) ln n / n)(√p̂_{i,1} + √(e^ε q̂_{i,1}))
    δ_i ← p̂_{i,2} − e^ε q̂_{i,2} ,  if p̂_{i,1} − e^ε q̂_{i,1} > √((c₁+c₂) ln n / n)(√p̂_{i,1} + √(e^ε q̂_{i,1}))
    δ_i ← D̃^{(1)}_K(p̂_{i,2}, q̂_{i,2}) ,  if p̂_{i,1} + e^ε q̂_{i,1} < c₁ ln n / n
    δ_i ← D̃^{(2)}_K(p̂_{i,2}, q̂_{i,2}; p̂_{i,1}, q̂_{i,1}) ,  if (p̂_{i,1}, e^ε q̂_{i,1}) ∈ U(c₁, c₂) and p̂_{i,1} + e^ε q̂_{i,1} ≥ c₁ ln n / n
  end for
  d̂_{ε,K,c₁,c₂}(P_n‖Q_n) ← 0 ∨ (1 ∧ Σ_{i=1}^S δ_i)

case 1: For (x, e^ε y) ∈ [0, (2c₁ ln n)/n]². A straightforward polynomial approximation of [x − e^ε y]_+ on [0, (2c₁ ln n)/n]² cannot achieve approximation error smaller than (1/K)((2c₁ ln n)/n). As K = c₃ ln n, this gives a bias of 1/n for each symbol in [S], resulting in a total bias of S/n. This requires n ≫ S to achieve arbitrarily small error, as opposed to n ≫ S/ln S, which is what we are targeting. This is due to the fact that we require multivariate approximation, and the bias is dominated by the worst-case y for each x. If y were fixed, as in the case of univariate approximation in Lemma 14, the bias would have been (1/K)√((e^ε y · 2c₁ ln n)/n) with y = q_i, where the total bias scales as √(S/(n ln n)) when summed over all symbols i.
Our strategy is to use the decomposition [x − e^ε y]_+ = (√x + √(e^ε y)) [√x − √(e^ε y)]_+. Each factor can be approximated up to a bias of (1/K)√((ln n)/n), and the dominant term in the bias becomes (1/K)√((e^ε q_i ln n)/n). This gives the desired bias. Concretely, we use two bivariate polynomials u_K(x, y) and v_K(x, y) to approximate √x + √y and [√x − √y]_+ on [0, 1]², respectively. Namely,

sup_{(x,y)∈[0,1]²} |u_K(x, y) − (√x + √y)| = inf_{P∈poly²_K} sup_{(x′,y′)∈[0,1]²} |P(x′, y′) − (√x′ + √y′)| , and   (11)
sup_{(x,y)∈[0,1]²} |v_K(x, y) − [√x − √y]_+| = inf_{P∈poly²_K} sup_{(x′,y′)∈[0,1]²} |P(x′, y′) − [√x′ − √y′]_+| .   (12)

Denote h_{2K}(x, y) = u_K(x, y)v_K(x, y) − u_K(0, 0)v_K(0, 0). Define

D^{(1)}_K(x, y) = (2c₁ ln n / n) h_{2K}( xn/(2c₁ ln n), e^ε y n/(2c₁ ln n) ) ,   (13)

for (x, e^ε y) ∈ [0, (2c₁ ln n)/n]². In practice, one can use the best Chebyshev polynomial expansion to achieve the same uniform error rate efficiently [24].
case 2: For (x, e^ε y) ∈ U(c₁, c₁) and x + e^ε y ≥ (c₁ ln n)/2n. We utilize the best polynomial approximation of |t| on [−1, 1] with order K. Denote it as R_K(t) = Σ_{j=0}^K r_j t^j. Define

D^{(2)}_K(x, y; p̂_{i,1}, q̂_{i,1}) = (1/2) Σ_{j=0}^K r_j W^{1−j} (e^ε y − x)^j + (x − e^ε y)/2 ,   (14)

where W = √((8c₁ ln n)/n) · √(p̂_{i,1} + e^ε q̂_{i,1}). Finally, we use the second part of the samples to construct unbiased estimators for D^{(1)}_K(x, y) and D^{(2)}_K(x, y; p̂_{i,1}, q̂_{i,1}) by Lemmas 11 and 12. Namely,

E[ D̃^{(1)}_K(p̂_{i,2}, q̂_{i,2}) ] = D^{(1)}_K(p, q) , and   (15)
E[ D̃^{(2)}_K(p̂_{i,2}, q̂_{i,2}; p̂_{i,1}, q̂_{i,1}) | p̂_{i,1}, q̂_{i,1} ] = D^{(2)}_K(p, q; p̂_{i,1}, q̂_{i,1}) .   (16)

The formulas for the unbiased estimators can be found in the Appendix in Eqs. (180) and (187).

2.2.2 Minimax optimal upper bound
We provide an upper bound on the error achieved by the proposed estimator. 
The analysis uses techniques similar to the proof of Theorem 2. We provide a proof in Appendix C.5.
Theorem 4. Suppose there exists a constant C > 0 such that ln n ≤ C ln S. Then there exist constants c₁, c₂ and c₃ that depend only on C and ε such that

sup_{P,Q∈M_S} E_{P×Q}[ |d̂_{ε,K,c₁,c₂}(P_n‖Q_n) − d_ε(P‖Q)|² ] ≲ e^ε S / (n ln n) ,   (17)

for K = c₃ ln n, where d̂_{ε,K,c₁,c₂} is defined in Algorithm 2.

It follows from the proof of Theorem 3 that

inf_{d̂_ε(P_n‖Q_n)} sup_{P,Q∈M_S} E_{P×Q}[ |d̂_ε(P_n‖Q_n) − d_ε(P‖Q)|² ] ≳ e^ε S / (n ln n) .

Together, the above upper and lower bounds prove that the proposed estimator in Algorithm 2 is minimax optimal and cannot be improved upon in terms of sample complexity. We want to emphasize that we do not require knowledge of the support size S, as opposed to existing methods such as [11], which require collecting enough samples to identify the support. Comparing it to the error rate of the plug-in estimator in Theorem 1, this minimax rate of e^ε S/(n ln n) demonstrates that effective sample size amplification holds: with n samples, a sophisticated estimator can achieve the error rate of a plug-in estimator with n ln n samples.

3 Experiments

We present the experiment details in Appendix B and the code to reproduce our experiments at https://github.com/xiyangl3/adp-estimator. Figure 1 (a) illustrates the Mean Squared Error (MSE) for estimating d_ε(P‖Q) between a uniform distribution P and a Zipf distribution Q, where the support size is fixed to S = 100, Zipf(α) ∝ 1/i^α for i ∈ [S], and α = −0.6. The privacy parameter is fixed to ε = 0.4. 
This suggests that Algorithm 2 consistently improves upon the plug-in estimator, as predicted by Theorem 4.
We demonstrate how Algorithm 2 can detect mechanisms with false claims of DP guarantees on four types of mechanisms: Report Noisy Max [25], Histogram [26], Sparse Vector Technique [8], and Mixture of Truncated Geometric Mechanism. We closely follow the experimental set-up of [11]; the settings and discussions are provided in Appendix B.2.
In [11], the test query and databases defining (Q, D, D′) are chosen by some heuristics. Figure 1 (b) and (c) show (ε, δ) regions for the variations of noisy max mechanisms for privacy budget ε₀ = 0.3. From the figures, one can easily confirm that RNM+Lap and RNM+Exp have δ̂ ≫ 0 at ε = 0.3 (blue lines), confirming that these two mechanisms do not guarantee the claimed (0.3, 0)-DP, as known in the literature [11]. For those faulty mechanisms, Algorithm 2 also provides a certificate in the form of a set T ⊆ [S] such that [P(T) − e^ε Q(T)]_+ − δ > 0. With the privacy budget set to ε₀ = 0.5, Figure 1 (d) shows that the incorrect histogram with incorrect Lap(ε₀) noise is likely to be (1/ε₀, 0)-DP, as known from [11]. Both mechanisms claim (0.5, 0)-DP, but the figure shows that the incorrect mechanism ensures (1/0.5, 0)-DP instead. Figure 1 (e) shows that SVT is likely to be (ε₀, 0)-DP with ε₀ = 0.5. 
However, iSVT1, iSVT2, and iSVT3 do not meet the claimed (0.5, 0)-DP. Figure 1 (f) confirms that MTGM satisfies the claimed (ε_0, δ_0) differential privacy.

[Figure 1: six panels; (a) plots MSE against sample size n, and (b)–(f) plot δ̂ against ε.]

Figure 1: (a) shows that the proposed minimax optimal estimator in Algorithm 2 consistently improves upon the plug-in estimator on synthetic data. Each data point represents 100 random trials, with standard error (SE) bars smaller than the plot marker. (b), (c), (d), (e), (f) show the estimate δ̂ of δ from Algorithm 2, given ε and privacy budget ε_0, for DP mechanisms. Each point shows an average over 10 random trials with standard error. The red lines represent the original correct mechanisms. Algorithm 2 allows us to detect violations of claimed DP guarantees.

4 Conclusion

We investigate the fundamental trade-off between accuracy and sample size in estimating differential privacy guarantees from black-box access to a purportedly private mechanism. Such a data-driven approach to verifying privacy guarantees will allow us to hold accountable mechanisms in the wild that are not faithful to their claimed privacy guarantees, and help find and fix bugs in either the design or the implementation. To this end, we propose a polynomial-approximation-based approach to estimate differential privacy guarantees. We show that in the high-dimensional regime, the proposed estimator achieves a sample size amplification effect: compared to the parametric rate achieved by the plug-in estimator, we achieve a factor of ln n gain in the sample size. A matching lower bound proves the minimax optimality of our approach.
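The bias-variance idea behind polynomial-approximation estimators can be illustrated in miniature: approximating a non-smooth function such as max(x, 0) (the building block of [p − e^ε q]_+) by polynomials of growing degree K reduces the approximation error only polynomially in K, which is why a carefully chosen degree K ≍ ln n trades off bias against variance. The sketch below (helper name `cheb_interp_error` is ours; this is an illustration, not Algorithm 2 itself) measures the uniform error of Chebyshev interpolation of max(x, 0) on [−1, 1].

```python
import numpy as np

def cheb_interp_error(f, K, m=2001):
    """Uniform error (on a fine grid) of the degree-K Chebyshev
    interpolant of f on [-1, 1]."""
    # Chebyshev points of the first kind; K+1 nodes give exact interpolation
    # by a degree-K polynomial in the Chebyshev basis.
    nodes = np.cos((2 * np.arange(K + 1) + 1) * np.pi / (2 * (K + 1)))
    coeffs = np.polynomial.chebyshev.chebfit(nodes, f(nodes), K)
    grid = np.linspace(-1.0, 1.0, m)
    return np.abs(np.polynomial.chebyshev.chebval(grid, coeffs) - f(grid)).max()

relu = lambda x: np.maximum(x, 0.0)
errs = [cheb_interp_error(relu, K) for K in (4, 8, 16, 32)]
# The errors shrink as K grows, but only polynomially in K, because
# max(x, 0) is not smooth at 0 (cf. Bernstein's classical result for |x|).
```

Increasing the degree reduces this bias, but in the estimation problem a higher degree also inflates the variance of the plugged-in moments; balancing the two at K ≍ ln n yields the ln n amplification above.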
Here, we list important remaining challenges that are outside the scope of this paper.

Since the introduction of differential privacy, several innovative notions of privacy have been proposed, such as pufferfish, concentrated DP, zCDP, and Rényi DP [27, 28, 29, 30]. Our estimator builds upon the fact that the differential privacy guarantee is a divergence between two random outputs. This is no longer true for the other notions of privacy, which makes estimating them more challenging.

Characterizing the fundamental tradeoff for continuous mechanisms is an important problem, as several popular mechanisms, such as the Laplacian and Gaussian mechanisms, output continuous random variables. One could use non-parametric estimators such as k-nearest-neighbor methods and kernel methods, popular for estimating information-theoretic quantities and divergences [31, 32, 33, 34]. Further, when the output is a mixture of discrete and continuous variables, recent advances in estimating mutual information for mixed variables provide a guideline for such complex estimation [35].

There is a fundamental connection between differential privacy and ROC curves, as investigated in [30, 36, 37]. Binary hypothesis testing and ROC curves provide an important measure of performance in generative adversarial networks (GANs) [38]. This fundamental connection between differential privacy and GANs was first investigated in [39], where it was used to provide an implicit bias for mitigating mode collapse, a fundamental challenge in training GANs. A DP estimator, like the one we propose, provides a valuable tool for measuring the performance of GANs. The main challenge is that GAN outputs are extremely high-dimensional (popular examples being 1,024 × 1,024 × 3 dimensional images). Non-parametric methods have exponential dependence on the dimension, rendering them useless. Even some recent DP approaches have output dimensions that are equally large [40].
We need a fundamentally different approach to deal with such high-dimensional continuous mechanisms.

We considered a setting where we create synthetic databases D and D′ and test the guarantees of a mechanism of interest. Instead, [12] assumes we do not have such control, and the privacy of the real databases used in the testing also needs to be preserved. It is proven that one cannot test the privacy guarantee of a mechanism without revealing the contents of the test databases. Such fundamental limits suggest that the samples used in estimating DP need to be destroyed after the estimation. However, the estimated d_ε(P_{Q,D} ‖ P_{Q,D′}) still leaks some information about the databases used, although limited. This is related to the challenging task of designing mechanisms with (ε, δ)-DP guarantees when (ε, δ) also depends on the databases. Without answering any queries, just publishing the guarantee of the mechanism on a set of databases reveals something about the databases. Detection and estimation under such complicated constraints is a challenging open question.

Acknowledgement

This work is partially supported by NSF awards CNS-1527754, CNS-1705007, CCF-1927712, RI-1929955 and a generous gift from Google.

References

[1] John Abowd. The U.S. Census Bureau adopts differential privacy. KDD Invited Talk, 2018.

[2] Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pages 1054–1067. ACM, 2014.

[3] Giulia Fanti, Vasyl Pihur, and Úlfar Erlingsson. Building a RAPPOR with the unknown: Privacy-preserving learning of associations and data dictionaries. Proceedings on Privacy Enhancing Technologies, 2016(3):41–61, 2016.
[4] Apple Differential Privacy Team. Learning with privacy at scale. Apple Machine Learning Journal, 2017.

[5] Bolin Ding, Janardhan Kulkarni, and Sergey Yekhanin. Collecting telemetry data privately. In Advances in Neural Information Processing Systems, pages 3571–3580, 2017.

[6] Gilles Barthe, Boris Köpf, Federico Olmedo, and Santiago Zanella Beguelin. Probabilistic relational reasoning for differential privacy. ACM SIGPLAN Notices, 47(1):97–110, 2012.

[7] Yuxin Wang, Zeyu Ding, Guanhong Wang, Daniel Kifer, and Danfeng Zhang. Proving differential privacy with shadow execution. arXiv preprint arXiv:1903.12254, 2019.

[8] Min Lyu, Dong Su, and Ninghui Li. Understanding the sparse vector technique for differential privacy. Proceedings of the VLDB Endowment, 10(6):637–648, 2017.

[9] Yan Chen and Ashwin Machanavajjhala. On the privacy properties of variants on the sparse vector technique. arXiv preprint arXiv:1508.07306, 2015.

[10] Cynthia Dwork. Differential privacy. Encyclopedia of Cryptography and Security, pages 338–340, 2011.

[11] Zeyu Ding, Yuxin Wang, Guanhong Wang, Danfeng Zhang, and Daniel Kifer. Detecting violations of differential privacy. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pages 475–489. ACM, 2018.

[12] Anna C. Gilbert and Audra McMillan. Property testing for differential privacy. In 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 249–258. IEEE, 2018.

[13] Kashyap Dixit, Madhav Jha, Sofya Raskhodnikova, and Abhradeep Thakurta. Testing the Lipschitz property over product distributions with applications to data privacy. In Theory of Cryptography Conference, pages 418–436. Springer, 2013.

[14] Paul Valiant and Gregory Valiant. Estimating the unseen: improved estimators for entropy and other properties.
In Advances in Neural Information Processing Systems, pages 2157–2165, 2013.

[15] Jiantao Jiao, Kartik Venkat, Yanjun Han, and Tsachy Weissman. Minimax estimation of functionals of discrete distributions. IEEE Transactions on Information Theory, 61(5):2835–2885, 2015.

[16] Yihong Wu and Pengkun Yang. Minimax rates of entropy estimation on large alphabets via best polynomial approximation. IEEE Transactions on Information Theory, 62(6):3702–3720, 2016.

[17] Yihong Wu and Pengkun Yang. Chebyshev polynomials, moment matching, and optimal estimation of the unseen. The Annals of Statistics, 47(2):857–883, 2019.

[18] Yanjun Han, Jiantao Jiao, and Tsachy Weissman. Minimax estimation of discrete distributions under ℓ1 loss. IEEE Transactions on Information Theory, 61(11):6343–6354, 2015.

[19] Jayadev Acharya, Alon Orlitsky, Ananda Theertha Suresh, and Himanshu Tyagi. Estimating Rényi entropy of discrete distributions. IEEE Transactions on Information Theory, 63(1):38–56, 2017.

[20] Yanjun Han, Jiantao Jiao, and Tsachy Weissman. Minimax rate-optimal estimation of divergences between discrete distributions. arXiv preprint arXiv:1605.09124, 2016.

[21] Jiantao Jiao, Yanjun Han, and Tsachy Weissman. Minimax estimation of the ℓ1 distance. IEEE Transactions on Information Theory, 2018.

[22] Constantinos Daskalakis, Gautam Kamath, and John Wright. Which distribution distances are sublinearly testable? In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2747–2764. SIAM, 2018.

[23] Xiyang Liu and Sewoong Oh. Minimax rates of estimating approximate differential privacy. arXiv preprint arXiv:1905.10335, 2019.

[24] Hrushikesh N. Mhaskar, Paul Nevai, and Eugene Shvarts. Applications of classical approximation theory to periodic basis function networks and computational harmonic analysis.
Bulletin of Mathematical Sciences, 3(3):485–549, 2013.

[25] Cynthia Dwork, Aaron Roth, et al. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3–4):211–407, 2014.

[26] Cynthia Dwork. Differential privacy. In Proceedings of the 33rd International Conference on Automata, Languages and Programming - Volume Part II, ICALP'06, pages 1–12, Berlin, Heidelberg, 2006. Springer-Verlag.

[27] Daniel Kifer and Ashwin Machanavajjhala. Pufferfish: A framework for mathematical privacy definitions. ACM Transactions on Database Systems (TODS), 39(1):3, 2014.

[28] Cynthia Dwork and Guy N. Rothblum. Concentrated differential privacy. arXiv preprint arXiv:1603.01887, 2016.

[29] Shuang Song, Yizhen Wang, and Kamalika Chaudhuri. Pufferfish privacy mechanisms for correlated data. In Proceedings of the 2017 ACM International Conference on Management of Data, pages 1291–1306. ACM, 2017.

[30] Peter Kairouz, Sewoong Oh, and Pramod Viswanath. The composition theorem for differential privacy. IEEE Transactions on Information Theory, 63(6):4037–4049, 2017.

[31] Thomas B. Berrett, Richard J. Samworth, and Ming Yuan. Efficient multivariate entropy estimation via k-nearest neighbour distances. The Annals of Statistics, 47(1):288–318, 2019.

[32] Weihao Gao, Sewoong Oh, and Pramod Viswanath. Demystifying fixed k-nearest neighbor information estimators. IEEE Transactions on Information Theory, 64(8):5629–5661, 2018.

[33] Weihao Gao, Sewoong Oh, and Pramod Viswanath. Breaking the bandwidth barrier: Geometrical adaptive entropy estimation. In Advances in Neural Information Processing Systems, pages 2460–2468, 2016.

[34] Jiantao Jiao, Weihao Gao, and Yanjun Han. The nearest neighbor information estimator is adaptively near minimax rate-optimal.
In Advances in Neural Information Processing Systems, pages 3156–3167, 2018.

[35] Weihao Gao, Sreeram Kannan, Sewoong Oh, and Pramod Viswanath. Estimating mutual information for discrete-continuous mixtures. In Advances in Neural Information Processing Systems, pages 5986–5997, 2017.

[36] Peter Kairouz, Sewoong Oh, and Pramod Viswanath. Extremal mechanisms for local differential privacy. In Advances in Neural Information Processing Systems, pages 2879–2887, 2014.

[37] Peter Kairouz, Sewoong Oh, and Pramod Viswanath. Secure multi-party differential privacy. In Advances in Neural Information Processing Systems, pages 2008–2016, 2015.

[38] Mehdi S. M. Sajjadi, Olivier Bachem, Mario Lucic, Olivier Bousquet, and Sylvain Gelly. Assessing generative models via precision and recall. In Advances in Neural Information Processing Systems, pages 5228–5237, 2018.

[39] Zinan Lin, Ashish Khetan, Giulia Fanti, and Sewoong Oh. PacGAN: The power of two samples in generative adversarial networks. In Advances in Neural Information Processing Systems, pages 1498–1507, 2018.

[40] Chong Huang, Peter Kairouz, Xiao Chen, Lalitha Sankar, and Ram Rajagopal. Generative adversarial privacy. arXiv preprint arXiv:1807.05306, 2018.

[41] T. A. Driscoll, N. Hale, and L. N. Trefethen. Chebfun Guide. Pafnuty Publications, 2014.

[42] Ben Stoddard, Yan Chen, and Ashwin Machanavajjhala. Differentially private algorithms for empirical machine learning. arXiv preprint arXiv:1411.5428, 2014.

[43] Rui Chen, Qian Xiao, Yu Zhang, and Jianliang Xu. Differentially private high-dimensional data publication via sampling-based inference. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 129–138. ACM, 2015.

[44] Jaewoo Lee and Christopher W. Clifton. Top-k frequent itemsets via differentially private FP-trees.
In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 931–940. ACM, 2014.

[45] Arpita Ghosh, Tim Roughgarden, and Mukund Sundararajan. Universally utility-maximizing privacy mechanisms. SIAM Journal on Computing, 41(6):1673–1693, 2012.

[46] Michael Mitzenmacher and Eli Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. 2005.

[47] Z. Ditzian and V. Totik. Moduli of Smoothness. Springer Series in Computational Mathematics. Springer-Verlag, 1987.

[48] Ronald A. DeVore and George G. Lorentz. Constructive Approximation, volume 303. Springer Science & Business Media, 1993.

[49] T. Tony Cai, Mark G. Low, et al. Testing composite hypotheses, Hermite polynomials and optimal estimation of a nonsmooth functional. The Annals of Statistics, 39(2):1012–1041, 2011.

[50] A. B. Tsybakov. Introduction to Nonparametric Estimation. Springer Series in Statistics. Springer, New York, 2008.

[51] Serge Bernstein. Sur la meilleure approximation de |x| par des polynomes de degrés donnés. Acta Mathematica, 37(1):1–57, 1914.