NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID:1414
Title:Minimax Optimal Estimation of Approximate Differential Privacy on Neighboring Databases

Reviewer 1

I have read the rebuttal. The results are technically interesting, in particular I found the result on sample size amplification from 1/n to 1/(n ln n) pleasantly surprising. However, the paper does not provide an end-to-end solution to the problem of detecting DP violation. --------------- The paper investigates the trade-off between accuracy and sample size in estimating differential privacy guarantees from a black-box access to a purportedly private mechanism. Let A be a mechanism and D and D’ be two neighboring datasets. Let P denote the output distribution A(D) and Q denote the output distribution A(D’). Now checking whether A is (eps,delta)-differentially private can be reduced to a estimating an appropriately defined divergence d_eps(P||Q). Let A be a discrete valued mechanism A whose range is over an alphabet set size S. The first result in this paper is to show that with a simple estimator it is necessary and sufficient for the sample size n to grow as e^eps S to achieve an arbitrary desired error rate (this result marginally improves an earlier upper bound of [12]). The main result of this paper is to show a minimax optimal estimator whose error rate is e^eps S/(n ln n), thus improving the dependence on the sample size from 1/n to 1/(n ln n). The results relies on a line of recent work on property estimation where the general recipe is to identify the regime where the property to be estimated is not smooth, and use functional approximation to estimate a smoothed version of the property. The phenomena of effective sample size amplification is technically interesting. My main concern is the results are narrow in their significance. In a differential privacy verification application the fact that you have to fix the neighboring dataset D’ is very restrictive, and therefore I do not see any practical consequence of this result for differential privacy (as suggested by the title and introduction).

Reviewer 2

Originality: The techniques seem to be build upon polynomial-approximation based techniques provided in [25]. If that is the case, the contribution of that work needs to be better acknowledged. In the current version, it's only introduced among several other papers on entropy estimation. Quality and Clarity: The results are clean and well presented. The reviewer enjoyed reading this work. Significance: The work is very interesting to the reviewer. However, I'm not sure about its significance, since I'm not an expert in this area. ----------------------------------------------------------------------------------------------------- Post Rebuttal: The authors have provided detailed responses and addressed my comments. The reviewer recommends acceptance.

Reviewer 3

The authors study the rates of estimating approximate differential privacy (aDP). They do so by reformulating it as a property estimation problem. I find this reduction fairly novel and ties DP to a large body of work on polynomial estimation. In property estimation, it is known that carefully trading off the bias and variance via polynomial approximation, particularly in regions of low probability, allows for obtaining the optimal min max rates. The authors follow the same recipe and show that the min max error scales as Se^\epsilon / n \log n. This result is cute, and the \log n factor is particularly interesting. This is in contrast to previous work where the only guarantees provided are empirical in nature. The paper is also overall well-written. I liked the fact that authors first prove the result where P is known, and then proceed to estimating both P and Q. I am also satisfied with their simulation and code submission. I recommend that the paper be accepted, I have some minor concerns that I will point out below. First, some typos - 89: ... improved ... 182: The method ... corresponding ... detail in ... 211: ... is non smooth .., In the proof of Theorem 1, it is not clear to me why Poisssonisation is required. Please elaborate on this. While I realize that a naive analysis of the empirical estimate will lead to a factor of e^{2\epsilon}S/n, an elaboration on how Poissonisation allows one to overcome this exponent of 2 might be helpful. On the other hand if it is simply a case of an easier analysis, it might be instructive to mention the same. Some intuition on why the degree of the polynomial K, must scale as \log n might be helpful. This may be presented near Theorem 2 or Theorem 4. Under case 1 and 2 under 2.2.1, it might help to write a line or two why the particular ranges of (x,e^epislon y) are considered. In its present state these seem to appear out of the blue. Post Rebuttal: The authors have addressed my concerns. If the paper is still on the borderline I would definitely recommend an accept.