{"title": "KNG: The K-Norm Gradient Mechanism", "book": "Advances in Neural Information Processing Systems", "page_first": 10208, "page_last": 10219, "abstract": "This paper presents a new mechanism for producing sanitized statistical summaries that achieve {\\it differential privacy}, called the {\\it K-Norm Gradient} Mechanism, or KNG. This new approach maintains the strong flexibility of the exponential mechanism, while achieving the powerful utility performance of objective perturbation. KNG starts with an inherent objective function (often an empirical risk), and promotes summaries that are close to minimizing the objective by weighting according to how far the gradient of the objective function is from zero. Working with the gradient instead of the original objective function allows for additional flexibility as one can penalize using different norms. We show that, unlike the exponential mechanism, the noise added by KNG is asymptotically negligible compared to the statistical error for many problems. In addition to theoretical guarantees on privacy and utility, we confirm the utility of KNG empirically in the settings of linear and quantile regression through simulations.", "full_text": "KNG: The K-Norm Gradient Mechanism\n\nMatthew Reimherr \u2217\nDepartment of Statistics\n\nPennsylvania State University\n\nState College, PA 16802\nmreimherr@psu.edu\n\nDepartment of Statistics\n\nPennsylvania State University\n\nState College, PA 16802\n\nJordan Awan\n\nawan@psu.edu\n\nAbstract\n\nThis paper presents a new mechanism for producing sanitized statistical summaries\nthat achieve differential privacy, called the K-Norm Gradient Mechanism, or KNG.\nThis new approach maintains the strong \ufb02exibility of the exponential mechanism,\nwhile achieving the powerful utility performance of objective perturbation. 
KNG starts with an inherent objective function (often an empirical risk), and promotes summaries that are close to minimizing the objective by weighting according to how far the gradient of the objective function is from zero. Working with the gradient instead of the original objective function allows for additional flexibility as one can penalize using different norms. We show that, unlike the exponential mechanism, the noise added by KNG is asymptotically negligible compared to the statistical error for many problems. In addition to theoretical guarantees on privacy and utility, we confirm the utility of KNG empirically in the settings of linear and quantile regression through simulations.

1 Introduction

The last decade has seen a tremendous increase in research activity related to data privacy [Aggarwal and Philip, 2008, Lane et al., 2014, Machanavajjhala and Kifer, 2015, Dwork et al., 2017]. This drive has been fueled by an increasing societal concern over the large amounts of data being collected by companies, governments, and scientists. These data often contain vast amounts of personal information, for example DNA sequences, images, voice recordings, electronic health records, and internet usage patterns. Such data allows for great scientific progress by researchers and governments, as well as increasingly curated business strategies by companies. However, such data also comes with increased risk for privacy breaches, placing greater pressure on institutions to prevent disclosures. Currently, Differential Privacy (DP) [Dwork et al., 2006] is the leading framework for formally quantifying privacy risk. One of the most popular methods for achieving DP is the Exponential Mechanism, introduced by McSherry and Talwar [2007], and used in [Friedman and Schuster, 2010, Wasserman and Zhou, 2010, Blum et al., 2013, Dwork and Roth, 2014].
A major attribute of the exponential mechanism that contributes to its popularity is its flexibility; it can be readily adapted and incorporated into most statistical analyses. In particular, its structure makes it amenable to a wide array of statistical and machine learning problems that are based on minimizing an objective function, so called "m-estimators" [van der Vaart, 2000, Chapter 5]. Some examples where the exponential mechanism has been used include PCA [Chaudhuri et al., 2013, Awan et al., 2019], hypothesis testing [Canonne et al., 2019], maximum likelihood estimation (related to posterior sampling) [Wang et al., 2015, Minami et al., 2016], and density estimation [Wasserman and Zhou, 2010].

However, examples have arisen [Wang et al., 2015, Awan et al., 2019] where the magnitude of the noise added by the exponential mechanism is substantially higher than that of other mechanisms. Recently, Awan et al. [2019] demonstrated that, in a very broad sense, the exponential mechanism adds noise that is not asymptotically negligible relative to the statistical estimation error, a property that other mechanisms are able to achieve in various problems [e.g. Smith, 2011]. In this paper we provide a new mechanism called the K-Norm Gradient Mechanism, or KNG, that retains the flexibility of the exponential mechanism, but with substantially improved utility guarantees. KNG provides a principled approach to developing efficient mechanisms that also perform well in practice. Indeed, the Laplace, K-norm, and PrivateQuantile mechanisms can all be viewed as instantiations of KNG.

*Research supported in part by NSF DMS 1712826, NSF SES 1853209, and the Simons Institute for the Theory of Computing at UC Berkeley.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.
Here we also use KNG to provide the first mechanism for private quantile regression that we are aware of, which we empirically show is efficient.

At a high level, KNG uses a similar perspective to that of the exponential mechanism. In particular, suppose that ℓ_n(θ; D) is an objective whose minimizer, θ̂ ∈ R^d, is the summary we aim to sanitize. Here D represents the particular database and n the sample size of D. The exponential mechanism aims to release θ̃_E based on the density

f_E(θ) ∝ exp{−c₀ ℓ_n(θ; D)},

where c₀ is a generic constant determined by the sensitivity of ℓ_n and the desired level of privacy. Conceptually, the idea is to promote sanitized estimates whose utility, as measured by ℓ_n, is close to that of θ̂. Unfortunately, Awan et al. [2019] showed that the magnitude of the noise added by the exponential mechanism is often of the same order as the statistical error (as a function of n), resulting in inefficient private estimators. KNG uses a similar perspective, but takes the gradient of ℓ_n and promotes θ that are close to the solution of ∇ℓ_n(θ) = 0. Since we work with the gradient, we also have the flexibility of choosing a desirable norm, which Awan and Slavković [2018] showed can be tailored to the problem at hand to achieve better utility. The resulting mechanism produces a sanitized θ̃ according to the density

f_n(θ) ∝ exp{−c₀ ‖∇ℓ_n(θ; D)‖_K},

where ‖·‖_K is a general norm on R^d that can be chosen to accommodate the context of the problem. Here we see a connection between KNG and the K-norm mechanism, introduced by Hardt and Talwar [2010].
The terminology is based on the idea of considering a set K, the convex hull of the sensitivity polytope [Kattis and Nikolov, 2017], and defining ‖·‖_K to be the norm whose unit ball is K, i.e. {v ∈ R^d : ‖v‖_K ≤ 1} = K. In fact, every norm can be generated in this manner, so there is no loss of generality in using this approach [Awan and Slavković, 2018].

KNG can similarly be viewed as a modification of objective perturbation [Chaudhuri et al., 2011, Kifer et al., 2012]. There, one releases a sanitized estimate, θ̃_O, by minimizing²

θ̃_O = argmin_{θ∈Θ} ( ℓ_n(θ; D) + ω θ^⊤ b ),

where b ∈ R^d is a random vector drawn from the K-norm mechanism density f_b(b) ∝ exp{−‖b‖_K}, and ω ∈ R is a fixed constant based on the sensitivity of ℓ_n and the desired level of privacy³. Equivalently, one has that ∇ℓ_n(θ̃_O; D) + ωb = 0, which implies that θ̃_O = ∇ℓ_n^{−1}(−ωb), assuming ∇ℓ_n is invertible. Using the change of variables formula, this implies that θ̃_O has density

f_O(θ) ∝ exp{−ω^{−1} ‖∇ℓ_n(θ)‖_K} |det(∇²ℓ_n(θ))|.

With KNG, the second derivative term ∇²ℓ_n is not included. Furthermore, there are several technical requirements when working with objective perturbation that KNG sidesteps. In particular, the proof that objective perturbation satisfies DP requires the objective function to be strongly convex and twice differentiable almost everywhere [Chaudhuri et al., 2011, Kifer et al., 2012, Awan and Slavković, 2018]. While we assume strong convexity and a second derivative to prove a utility result in Theorem 3.2, KNG does not require either of these conditions to satisfy DP.
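The K-norm noise vector b above, with density proportional to exp{−‖b‖_K}, can be drawn by a standard radial decomposition: the radius ‖b‖_K follows a Gamma distribution and the direction is uniform on the norm's unit sphere. A minimal sketch for the ℓ∞ norm (the helper name `sample_linf_knorm` and this particular construction are our own illustration, not from the paper):

```python
import numpy as np

def sample_linf_knorm(d, scale, rng):
    """Draw b with density proportional to exp(-||b||_inf / scale) on R^d.

    Radial decomposition: the surface area of the l_inf sphere of radius r
    grows like r^(d-1), so R = ||b||_inf is Gamma(d, scale); the direction
    is uniform on the boundary of the cube [-1, 1]^d, obtained by picking
    one of the 2d faces uniformly and filling the rest with U(-1, 1).
    """
    r = rng.gamma(shape=d, scale=scale)
    u = rng.uniform(-1.0, 1.0, size=d)
    j = rng.integers(d)                 # which coordinate is pinned
    u[j] = rng.choice([-1.0, 1.0])      # pin it to a face of the cube
    return r * u
```

Other norms only change the radius law's scale and the direction step; for the ℓ₂ norm, for instance, the direction would be a normalized Gaussian vector.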
This allows the KNG mechanism to be applied in more general situations (such as median estimation and quantile regression, explored in Section 4), and requires fewer calculations to implement.

The remainder of this paper is organized as follows. In Section 2 we recall the necessary background on differential privacy and the exponential mechanism. In Section 3 we formally define KNG and show that it achieves ε-DP with nearly the same flexibility as the exponential mechanism. We also provide a general utility result showing that the noise introduced by KNG is of order O_p(n^{−1}), which is negligible compared to the statistical estimation error, which is typically O_p(n^{−1/2}). We also show that the noise introduced by KNG is asymptotically from a K-norm mechanism. In Section 4 we provide several examples of KNG applied to statistical problems, including mean estimation, linear regression, median/quantile estimation, and quantile regression. We also illustrate the empirical advantages of KNG in the settings of linear and quantile regression through simulations. We conclude in Section 5 by discussing challenges and potential extensions of KNG.

²In fact, objective perturbation minimizes ℓ_n(θ; D) + c θ^⊤ θ + ω θ^⊤ b, where c is a constant. We ignore this regularization term in this discussion for simplicity of the illustration.

³In Chaudhuri et al. [2011] and Kifer et al. [2012], the ℓ₂ norm is used. Awan and Slavković [2018] extend objective perturbation to allow for arbitrary norms.

2 Differential Privacy Background

Differential privacy (DP), introduced by Dwork et al. [2006], has taken hold as the primary framework for formally quantifying privacy risk.
Several versions of DP have been proposed, such as approximate DP [Dwork and Roth, 2014], concentrated DP [Dwork and Rothblum, 2016, Bun and Steinke, 2016], and local DP [Duchi et al., 2013], all of which fit into the axiomatic treatment of formal privacy given by Kifer and Lin [2012]. In this paper, we work with pure ε-DP, stated in Definition 2.1.

Let D_n denote the collection of all possible databases with n units. The bivariate function δ : D_n × D_n → R given by δ(D, D′) := #{i : D_i ≠ D′_i} is called the Hamming distance on D_n. It is easy to verify that δ is a metric on D_n. If δ(D, D′) = 1 then D and D′ are said to be adjacent. Let f : D_n → Θ represent a summary of D_n, and F a σ-algebra on Θ, such that (Θ, F) is a measurable space. A privacy mechanism is a family of probability measures {µ_D : D ∈ D_n} over Θ.

Definition 2.1 (Differential Privacy: Dwork et al., 2006). A privacy mechanism {µ_D : D ∈ D_n} satisfies ε-Differential Privacy (ε-DP) if for all B ∈ F and adjacent D, D′ ∈ D_n,

µ_D(B) ≤ µ_{D′}(B) exp(ε).

The exponential mechanism, introduced by McSherry and Talwar [2007], is a central tool in the design of DP mechanisms [Dwork and Roth, 2014]. In fact, every mechanism can be viewed as an instance of the exponential mechanism, by setting the objective function as the log-density of the mechanism. In practice, it is most common to set the objective as a natural loss function, such as an empirical risk.

Proposition 2.2 (Exponential Mechanism: McSherry and Talwar, 2007). Let (Θ, F, ν) be a measure space, and let {ℓ_n(θ; D) : Θ → R | D ∈ D_n} be a collection of measurable functions indexed by the database D.
We say that this collection has finite sensitivity ∆ if

|ℓ_n(θ; D) − ℓ_n(θ; D′)| ≤ ∆ < ∞

for all adjacent D, D′ and ν-almost all θ ∈ Θ. If ∫_Θ exp(−ℓ_n(θ; D)) dν(θ) < ∞ for all D ∈ D_n, then the collection of probability measures {µ_D | D ∈ D_n} with densities (with respect to ν)

f_D(θ) ∝ exp{ (−ε / (2∆)) ℓ_n(θ; D) }

satisfies ε-DP.

Intuitively, ℓ_n(θ; D) provides a score quantifying the utility of an output θ for the database D. We use the convention that smaller values of ℓ_n(θ; D) provide more utility. So, the exponential mechanism places more mass near the minimizers of ℓ_n, and less mass the higher the value of ℓ_n(θ; D).

3 The K-Norm Gradient Mechanism

In Section 2 we considered an arbitrary measure space, (Θ, F, ν), when defining DP and the exponential mechanism. However, here we focus on R^d. The KNG mechanism cannot be defined in quite the same generality as the exponential mechanism, since we require enough structure on the parameter space to define a gradient. Most applications focus on Euclidean spaces, so this is not a major practical concern, but there could be implications for more complicated nonlinear, discrete, or infinite dimensional settings.

Theorem 3.1 (K-Norm Gradient Mechanism (KNG)). Let Θ ⊂ R^d be a convex set, ‖·‖_K be a norm on R^d, and ν be a σ-finite measure on Θ. Let {ℓ_n(θ; D) : Θ → R | D ∈ D_n} be a collection of measurable functions, which are differentiable ν-almost everywhere.
We say that this collection has sensitivity ∆ : Θ → R₊ if

‖∇ℓ_n(θ; D) − ∇ℓ_n(θ; D′)‖_K ≤ ∆(θ) < ∞

for all adjacent D, D′ and ν-almost all θ. If ∫_Θ exp(−∆(θ)^{−1} ‖∇ℓ_n(θ; D)‖_K) dν(θ) < ∞ for all D ∈ D_n, then the collection of probability measures {µ_D | D ∈ D_n} with densities (with respect to ν)

f_D(θ) ∝ exp[ (−ε / (2∆(θ))) ‖∇ℓ_n(θ; D)‖_K ]

satisfies ε-DP.

Proof. Set ℓ̃_n(θ; D) = ∆(θ)^{−1} ‖∇ℓ_n(θ; D)‖_K. Then ℓ̃_n has sensitivity 1. By Proposition 2.2, the described mechanism satisfies ε-DP.

One advantage of this approach over the traditional exponential mechanism is that the sensitivity calculation is often simpler (e.g. quantile regression, subsection 4.5). However, it also has the same intuition as the exponential mechanism. In particular, the optimum, θ̂, occurs when ∇ℓ_n(θ̂) = 0; thus we want to promote solutions that make the gradient close to 0, and discourage ones that make the gradient far from 0. These concepts are closely related to m-estimators, z-estimators, and estimating equations [van der Vaart, 2000, Chapter 5].

Since KNG utilizes the gradient, it links in nicely to optimization methods such as gradient descent. However, it could also suffer from some of the same challenges as gradient descent. Namely, if the objective function has multiple local minima, then KNG will promote output near each of these points.
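In practice one can sample from the KNG density with a generic MCMC routine, as is done in the simulations of Section 4 (which use a one-at-a-time MCMC). A simplified sketch with a random-walk Metropolis step and the ℓ₂ norm (the sampler design, step size, and function names here are our own illustrative choices, not the paper's exact procedure):

```python
import numpy as np

def kng_sample(grad_fn, delta, eps, theta0, n_steps=5000, step=0.1, rng=None):
    """Random-walk Metropolis sampler targeting the unnormalized KNG density
        f(theta) ~ exp(-(eps / (2 * delta)) * ||grad_fn(theta)||),
    with ||.|| taken as the l2 norm for simplicity. grad_fn is the gradient
    of the objective, delta its (constant) gradient sensitivity.
    """
    rng = rng or np.random.default_rng()

    def log_f(theta):
        return -(eps / (2.0 * delta)) * np.linalg.norm(grad_fn(theta))

    theta = np.asarray(theta0, dtype=float)
    lp = log_f(theta)
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.shape)
        lp_prop = log_f(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept/reject
            theta, lp = prop, lp_prop
    return theta
```

For example, with the mean-estimation objective of subsection 4.1 one would pass `grad_fn = lambda th: -2 * n * (xbar - th)` and the corresponding sensitivity bound for `delta`.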
For this reason, a great deal of care should be taken with KNG when applying it to non-convex objective functions, such as fitting neural networks [Gori and Tesi, 1992].

3.1 Asymptotic Properties

While flexibility of a mechanism is an important concern, ultimately the utility of the output is of primary importance. Awan et al. [2019] showed that for a large class of objective functions, the exponential mechanism introduces noise of magnitude O_p(n^{−1/2}), where n is the sample size. For many statistical problems the non-private error rate is also O_p(n^{−1/2}) [van der Vaart, 2000, Chapter 5], meaning that the exponential mechanism introduces noise that is not asymptotically negligible. Under similar assumptions, we show in Theorem 3.2 that KNG has asymptotic error O_p(n^{−1}), which is asymptotically negligible compared to the statistical error. In fact, Theorem 3.2 shows that the noise introduced is asymptotically from a K-norm mechanism [Hardt and Talwar, 2010, Awan and Slavković, 2018], which generalizes the Laplace mechanism.

The assumptions in Theorem 3.2 are chosen to capture a large class of common loss functions, including many convex empirical risk functions and log-likelihood functions. Mathematically, the assumption that ℓ_n is twice-differentiable and strongly convex allows us to use a one term Taylor expansion of ∇ℓ_n about θ̂, and guarantees that the integrating constants converge. The proof of Theorem 3.2 is found in the Supplementary Materials.

Theorem 3.2 (Utility of KNG). Let Θ ⊂ R^d be a convex set, ‖·‖_K a norm on R^d, ν a σ-finite measure on Θ, and ℓ_n(θ) := ℓ_n(θ; D) be a sequence of objective functions which satisfy the assumptions of Theorem 3.1, with sensitivity ∆(θ). We further assume that
1. n^{−1} ℓ_n(θ) are twice differentiable (almost everywhere) convex functions, and there exists a finite α > 0 such that n^{−1} H_n(θ), where H_n(θ) := ∇²ℓ_n(θ) denotes the Hessian, has eigenvalues greater than α for all n and θ ∈ Θ;

2. the minimizers satisfy θ̂ → θ* ∈ R^d and n^{−1} H_n(θ̂) → Σ^{−1}, where Σ is a d × d positive definite matrix;

3. ∆(θ) is continuous in θ, constant in n, and there exists ∆ > 0 such that ∆ ≤ ∆(θ).

Assume the base measure, ν, has a bounded, differentiable density g(θ) (with respect to Lebesgue measure) which is strictly positive in a neighborhood of θ*. Then the sanitized value θ̃ drawn from the KNG mechanism with privacy parameter ε is asymptotically K-norm. That is, the density of Z = n(θ̃ − θ̂) converges to a K-norm distribution, with density (with respect to ν) proportional to

f(z) ∝ exp( (−ε / (2∆(θ*))) ‖Σ^{−1} z‖_K ).

The proof of the CLT for the exponential mechanism in Awan et al. [2019], as well as the proof of Theorem 3.2, both rely on a Taylor expansion of the objective function. In both cases, it is assumed that the Hessian converges, when scaled by n, to a positive definite matrix. However, using the original objective function requires two derivatives before the Hessian appears in the Taylor expansion, whereas the use of the gradient only requires one derivative. The consequence of this is that the traditional exponential mechanism results in a quadratic numerator inside the exponent, whereas KNG has a (normed) linear numerator. Asymptotically, this gives O_p(n^{−1/2}) Gaussian noise for the exponential mechanism and O_p(n^{−1}) K-norm noise for KNG.
Geometrically, it seems that the use of an objective function which behaves linearly (in absolute value) near the optimum, rather than quadratically, results in better asymptotic utility. By using the normed gradient, we construct an objective function with this property.

The assumptions in Theorem 3.2 are very similar to the assumptions for the CLT in Awan et al. [2019]. So, whenever these properties hold, we know that KNG results in O_p(n^{−1}) privacy noise whereas the exponential mechanism is O_p(n^{−1/2}). To further emphasize the importance of this result, we note that the magnitude of the noise introduced for privacy can have a substantial impact on the sample complexity. Asymptotically, KNG requires exactly the same sample size as the non-private estimator, whereas the exponential mechanism requires a constant multiple (greater than 1) of the non-private sample size to achieve the same accuracy.

As we see in Section 4, in the problem of quantile regression the assumptions of Theorem 3.2 do not hold, meaning that while we guarantee privacy in that setting, we can't guarantee the utility of the estimator. However, we see in Figure 2 that KNG still introduces asymptotically negligible noise, suggesting that the assumptions of Theorem 3.2 can likely be weakened to accommodate a larger class of objective functions.

Remark 3.3. Based on the discussion in Section 1, a result similar to Theorem 3.2 may hold for objective perturbation as well. The main issue is dealing with the change of variables factor |det H_n(θ)|, which may or may not contribute to the asymptotic form. We suspect that when both KNG and objective perturbation are applicable (e.g. linear regression, see subsection 4.3), they will have similar performance. However, as KNG does not require a second derivative (or convexity), it is applicable in more settings than objective perturbation (e.g.
quantile regression, see subsection 4.5).

4 Examples

4.1 Mean Estimation

Mean estimation is one of the simplest statistical tasks, and one of the first to be solved in DP. Assuming bounds on the data, the mean can be estimated by adding Laplace noise [Dwork et al., 2006]. Recently there has been some work developing statistical tools for the mean under differential privacy, such as confidence intervals in the normal model [Karwa and Vadhan, 2017] and hypothesis tests for Bernoulli data [Awan and Slavković, 2018]. We show that KNG recovers the K-norm mechanism when estimating the mean, a generalization of the Laplace mechanism.

Let x₁, . . . , x_n ∈ R^d, which we assume are drawn from some population with mean θ*. To estimate θ*, we use the sum of squares as our objective function:

ℓ_n(θ; D) = Σ_{i=1}^n ‖x_i − θ‖²₂   and   ∇ℓ_n(θ; D) = −2 Σ_{i=1}^n (x_i − θ) = −2n(x̄ − θ).

Turning to the sensitivity, if we assume that there exists a constant r such that ‖x_i‖_K ≤ r < ∞ for some norm ‖·‖_K, then the sensitivity of the gradient is ‖∇ℓ_n(θ; D) − ∇ℓ_n(θ; D′)‖_K = 2‖x₁ − x′₁‖_K ≤ 4r. Thus the mechanism becomes

f_n(θ) ∝ exp{ −(nε/(4r)) ‖x̄ − θ‖_K },

which is exactly a K-norm mechanism [Hardt and Talwar, 2010]. So θ̃ − x̄ has mean 0 and standard deviation O_p(n^{−1}). Thus, the noise added for privacy is asymptotically negligible compared to the statistical error O_p(n^{−1/2}).

Remark 4.1. Because KNG results in a location family in this case, the integrating constant does not depend on the data.
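In one dimension with the absolute-value norm, the density above is simply a Laplace distribution centred at the sample mean, so it can be sampled directly rather than by MCMC. A minimal sketch (the clipping step and the name `kng_private_mean` are our own):

```python
import numpy as np

def kng_private_mean(x, eps, r, rng=None):
    """KNG for the 1-d mean with the absolute-value norm: the density
    exp(-(n * eps / (4 * r)) * |xbar - theta|) is Laplace, centred at the
    sample mean with scale 4r / (n * eps). Data are clipped to [-r, r] so
    that the sensitivity bound actually holds.
    """
    x = np.clip(np.asarray(x, dtype=float), -r, r)
    n = x.size
    scale = 4.0 * r / (n * eps)
    rng = rng or np.random.default_rng()
    return x.mean() + rng.laplace(0.0, scale)
```

The O(n^{−1}) scale of the added Laplace noise is visible directly in `scale`, compared with the O(n^{−1/2}) sampling error of the mean itself.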
So, we do not need to divide ε by 2 in the density, and may instead draw from

f_n(θ) ∝ exp{ −(nε/(2r)) ‖x̄ − θ‖_K },

which is how the K-norm mechanism is normally stated.

4.2 Linear Regression

There has been a great deal of work developing DP methods for linear regression [Zhang et al., 2012, Song et al., 2013, Dwork and Lei, 2009, Chaudhuri et al., 2011, Kifer et al., 2012, Sheffet, 2017]. In this section, we detail how KNG can be used to estimate the coefficients in a linear regression model. We observe pairs of data (x_i, y_i), where y_i ∈ R and x_i ∈ R^d, which we assume are modeled as y_i = x_i^⊤ θ* + e_i, where the errors are iid with mean zero and are uncorrelated with x. Our goal is to estimate θ*. To implement KNG, we assume that the data has been pre-processed such that −1 ≤ x_i ≤ 1 and −1 ≤ y_i ≤ 1 for all i = 1, . . . , n. We also assume that ‖θ*‖₁ ≤ B. The usual non-private estimator for θ* is least squares, which minimizes the objective function ℓ_n(θ; D) = Σ_{i=1}^n (y_i − x_i^⊤ θ)². KNG requires a bound on the sensitivity of ∇ℓ_n:

‖∇ℓ_n(θ; D) − ∇ℓ_n(θ; D′)‖ ≤ sup_{y₁,x₁,θ} 4‖(y₁ − x₁^⊤ θ) x₁‖ ≤ 4(1 + B) sup_{x₁} ‖x₁‖.

By using the ℓ∞ norm, we get the tightest bound, since ‖x₁‖∞ ≤ 1. KNG samples from the density

f_n(θ) ∝ exp( (−ε / (8(1 + B))) ‖ Σ_{i=1}^n (y_i − x_i^⊤ θ) x_i ‖∞ )   (1)

with respect to the uniform measure on Θ = {θ : ‖θ‖₁ ≤ B}.

Remark 4.2. Alternative sensitivity bounds can be obtained by choosing other bounds on x and y. The bound on θ* can be removed entirely, allowing ∆ to depend on θ.
In that case, a nontrivial base measure will be required, as the resulting density is not integrable with respect to Lebesgue measure. We prefer to use the given sensitivity bound as it allows a fairer comparison against the exponential mechanism and objective perturbation in subsection 4.3.

Figure 1: Simulation comparing the non-private MLE, exponential mechanism, objective perturbation, and KNG for linear regression.

Figure 2: Simulation comparing the non-private estimator, exponential mechanism, and KNG for quantile regression.

4.3 Linear Regression Simulation

In this section, we examine the finite sample performance of the KNG mechanism on linear regression compared to the exponential mechanism and objective perturbation mechanism. KNG samples from the density (1), the exponential mechanism samples from

f_n(θ) ∝ exp( (−ε / (2(1 + B)²)) Σ_{i=1}^n (y_i − x_i^⊤ θ)² ),

and objective perturbation draws a random vector b from the density f(b) ∝ exp( −(ε/(8(1 + B))) ‖b‖∞ ), and then finds the optimum of the modified objective: argmin_{‖θ‖₁≤1} ℓ_n(θ; D) + (γ/2) θ^⊤ θ + θ^⊤ b, where γ = (exp(ε/2) − 1)^{−1}(2d) and d is the dimension of the x_i's. For all three mechanisms we assume the bound on ‖θ*‖₁ is B = 1.
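For least squares the perturbed objective can be minimized in closed form, which gives a compact sketch of objective perturbation. We drop the ‖θ‖₁ ≤ 1 constraint for simplicity, and the Gamma-radius construction for the ℓ∞ noise, along with all names, is our own illustration rather than the paper's code:

```python
import numpy as np

def objective_perturbation_ols(X, Y, eps, B=1.0, rng=None):
    """Objective perturbation for least squares, sketched without the l1
    constraint on theta. Draw b with density exp(-(eps/(8(1+B)))||b||_inf),
    then minimize  sum_i (y_i - x_i'theta)^2 + (gamma/2) theta'theta + theta'b.
    Setting the gradient  -2 X'(Y - X theta) + gamma*theta + b  to zero gives
    the closed-form solution below. For a valid privacy guarantee the data
    must also satisfy the boundedness assumptions of subsection 4.2.
    """
    rng = rng or np.random.default_rng()
    n, d = X.shape
    gamma = 2.0 * d / (np.exp(eps / 2.0) - 1.0)
    # l_inf K-norm noise via radial decomposition: Gamma(d) radius,
    # direction uniform on the surface of the cube [-1, 1]^d.
    r = rng.gamma(shape=d, scale=8.0 * (1.0 + B) / eps)
    u = rng.uniform(-1.0, 1.0, size=d)
    j = rng.integers(d)
    u[j] = rng.choice([-1.0, 1.0])
    b = r * u
    A = 2.0 * X.T @ X + gamma * np.eye(d)
    return np.linalg.solve(A, 2.0 * X.T @ Y - b)
```

As discussed around Remark 3.3, the b term contributes O(n^{−1}) to the estimate here, since it is divided by a matrix that grows linearly in n.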
Details on these mechanisms for linear regression can be found in the Supplementary Materials.

For the simulations the true regression vector θ* ∈ R¹² is θ* = (0, −1, −1 + 2/11, −1 + 4/11, . . . , 1 − 2/11), and so d = 12. For each n in 10², 10³, 10⁴, . . . , 10⁷ we run 100 replicates of Algorithm 1 at ε = 1. For KNG and the exponential mechanism, we draw samples using a one-at-a-time MCMC procedure with 10000 steps.

At the end, we compute the average distance over the 100 replicates for each mechanism and for each sample size n. The results are plotted in Figure 1, taking the base 10 log of both axes. At each n value and for each mechanism, the Monte Carlo standard errors are between 0.01380 and 0.02729, in terms of the log-scale used in the plot. The benefit of plotting in this fashion is that it makes it easier to understand the asymptotic behavior of each estimator.

Since we know that the estimation error of the non-private MLE is error = C n^{−1/2}, taking the log of both sides shows that the convergence should appear as a straight line with slope −1/2: log(error) = −(1/2) log(n) + log(C), which is the black line in Figure 1.

As Awan et al. [2019] showed, the asymptotic estimation error of the exponential mechanism is error = K n^{−1/2}, where K is a constant greater than C. Taking the log of both sides gives another line with slope −1/2, but with a higher intercept: log(error) = −(1/2) log(n) + log(K), which we see in red in Figure 1.

On the other hand, for KNG and objective perturbation (based on Remark 3.3), the asymptotic estimation error is error = C n^{−1/2} + K n^{−1}, which when logged shows that for larger n the curve approaches the line of the non-private estimation error from above: log(error) = −(1/2) log(n) + log(C + K n^{−1/2}), which is also confirmed in Figure 1.

Algorithm 1 Regression Simulation
INPUT: n, ε, d, θ*.
1: Generate X ∈ R^{n×d} such that X_{i,1} = 1 and X_{ij} ~ iid U(−1, 1) for i = 1, . . . , n and j = 2, . . . , d.
2: Generate independent errors e_i ∼ N(0, 1) for i = 1, . . . , n.
3: Compute the responses Y_i = X_i θ* + e_i.
4: Set R = max_i |Y_i|.
5: Set Y′_i = Y_i / R.
6: Use X and Y′ to estimate the regression coefficient via the non-private estimator, and each DP mechanism.
7: Multiply the estimates by R to estimate θ*.
8: Compute the euclidean distance between the estimate and the true θ* for each estimator.
OUTPUT: Average distances of the estimates to the true θ*.

4.4 Median Estimation

Just as in the mean estimation problem, we observe D = (x₁, . . . , x_n), where x_i ∈ R^d, and our goal is to estimate the population median. In the case when d = 1, the median can be estimated using the empirical risk function ℓ_n(θ; D) = Σ_{i=1}^n |x_i − θ|. In general, for d ≥ 1 we are estimating the geometric median [Minsker et al., 2015], which can be expressed as argmin_m E‖X − m‖, and typically the euclidean norm is used. Now, our objective becomes ℓ_n(θ; D) = Σ_{i=1}^n ‖x_i − θ‖.
It may be concerning that this objective is not differentiable everywhere; however, KNG only requires that the gradient exist ν-almost everywhere. The gradient of ‖x_i − θ‖ in our norm's topology is given by d(θ, x_i) := ‖x_i − θ‖^{−1}(x_i − θ), provided that θ ≠ x_i. Notice that this gives a direction in R^d since ‖d(θ, x_i)‖ = 1. Using the triangle inequality, we see that the sensitivity of the gradient is bounded by 2. So the KNG mechanism for the median can be expressed as

f_n(θ) ∝ exp{ −(εn/4) ‖ (1/n) Σ_{i=1}^n d(θ, x_i) ‖ }.

Again, the error introduced is O_p(n^{−1}), which is negligible compared to the statistical error.

4.5 Quantile Regression

For quantile regression, as for linear regression, we observe pairs of data (x_i, y_i), where y_i ∈ R and x_i ∈ R^d. We assume that Q_{Y_i|X_i}(τ) = X_i^⊤ θ*_τ for all i = 1, . . . , n, where Q_{Y|X}(τ) is the conditional quantile function of Y given X for 0 < τ < 1, and θ*_τ ∈ R^d [Hao et al., 2007]. For a given τ, θ*_τ can be estimated as θ̂_τ = argmin_θ Σ_{i=1}^n ρ_τ(y_i − x_i^⊤ θ), where ρ_τ(z) = (τ − 1) z I(z ≤ 0) + τ z I(z > 0) is called the tilted absolute value function [Koenker and Hallock, 2001].
So, our objective function is

ℓ_n(θ; D) = (τ − 1) ∑_{y_i ≤ x_i^⊤ θ} (y_i − x_i^⊤ θ) + τ ∑_{y_i > x_i^⊤ θ} (y_i − x_i^⊤ θ),

with gradient (almost everywhere)

∇ℓ_n(θ; D) = (τ − 1) ∑_{y_i ≤ x_i^⊤ θ} (−x_i) + τ ∑_{y_i > x_i^⊤ θ} (−x_i) = −τ ∑_{i=1}^n x_i + ∑_{y_i ≤ x_i^⊤ θ} x_i.

We bound the sensitivity as Δ = 2(1 − τ)C_X, where sup_{x_1} ‖x_1‖ ≤ C_X. Then KNG samples from

f_n(θ) ∝ exp{ −εn / (4(1 − τ)C_X) ‖ −τ (1/n) ∑_{i=1}^n x_i + (1/n) ∑_{y_i ≤ x_i^⊤ θ} x_i ‖ }.    (2)

We see a few nice benefits of the KNG method in this example. If we were to use ℓ_n directly in the exponential mechanism, then not only would we expect worse asymptotic performance (as demonstrated in subsection 4.5.1), but we also see that the sensitivity calculation for the gradient only requires a bound on X, whereas the sensitivity of ℓ_n requires bounds on Y, X, and θ*. Furthermore, the objective perturbation mechanism cannot be used in this setting, because ℓ is not strongly convex, whereas the proofs for objective perturbation [Chaudhuri and Monteleoni, 2009, Chaudhuri et al., 2011, Kifer et al., 2012, Awan and Slavković, 2018] all require strong convexity. In fact, the Hessian of ℓ_n is zero almost everywhere, making objective perturbation inapplicable.
Finally, note that if we are only interested in estimating the τth quantile of a set of real numbers Y_1, . . . , Y_n, we could set X_i = 1 for all i = 1, . . .
, n, in which case KNG samples from

f_n(θ) ∝ exp{ −εn / (4(1 − τ)) |τ − F̂(θ; Y)| }.    (3)

In fact, this is the Private Quantile algorithm proposed by Smith [2011], who also established strong utility guarantees for the algorithm; this exercise demonstrates that KNG could provide, or at least contribute to, a more unified framework for developing efficient privacy mechanisms.

4.5.1 Quantile Regression Simulation

In this section, we examine the empirical performance of the KNG mechanism on quantile regression compared to the exponential mechanism. KNG samples from the density (2) using the ‖·‖_∞ norm and setting C_X = 1, and the exponential mechanism samples from

f_n(θ) ∝ exp{ −ε / (4 max{τ, 1 − τ}(1 + B)) ℓ_n(θ; D) }.

We assume, as in subsection 4.3, that B = 1. Details on the exponential mechanism can be found in the Supplementary Materials. Note that objective perturbation cannot be used in this setting, as discussed in subsection 4.5.
For the simulations, we use τ = 1/2 and the true regression vector θ*_{1/2} ∈ R^2 is θ*_{1/2} = (0, −1). For each n in 10^1, 10^2, . . . , 10^5 we run 100 replicates of Algorithm 1 at ε = 1. Samples from KNG and the exponential mechanism are obtained using 1000 steps of a one-at-a-time MCMC algorithm. At the end, we compute the average distance over the 100 replicates for each estimator and for each sample size n. The results are plotted in Figure 2, taking the base 10 log of both axes.
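The one-at-a-time MCMC sampler used here can be sketched as a componentwise random-walk Metropolis algorithm targeting the KNG density (2) with the sup norm and C_X = 1 (a simplified sketch; the starting point, proposal scale, seed, and function names are our own choices, not the authors' exact implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def kng_log_density(theta, X, Y, tau, eps):
    """Unnormalized log of the KNG quantile-regression density (2),
    using the sup norm of the gradient and C_X = 1."""
    n = len(Y)
    grad = -tau * X.mean(axis=0) + X[Y <= X @ theta].sum(axis=0) / n
    return -eps * n / (4.0 * (1.0 - tau)) * np.max(np.abs(grad))

def kng_sample(X, Y, tau=0.5, eps=1.0, steps=1000, scale=0.1):
    """One-at-a-time Metropolis: perturb a single coordinate per step
    and accept or reject by the usual log-density ratio."""
    theta = np.zeros(X.shape[1])
    logp = kng_log_density(theta, X, Y, tau, eps)
    for t in range(steps):
        j = t % theta.size          # cycle through coordinates
        prop = theta.copy()
        prop[j] += scale * rng.standard_normal()
        logp_prop = kng_log_density(prop, X, Y, tau, eps)
        if np.log(rng.uniform()) < logp_prop - logp:
            theta, logp = prop, logp_prop
    return theta

# Example on synthetic data with covariates bounded by 1:
X = np.column_stack([np.ones(200), rng.uniform(-1.0, 1.0, 200)])
Y = X @ np.array([0.0, -1.0]) + 0.1 * rng.standard_normal(200)
theta_dp = kng_sample(X, Y)  # one draw targeting density (2)
```

The returned draw is a sample from (2) only up to MCMC convergence, which is the usual caveat with this style of sampler.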
At each n value and for each mechanism, the Monte Carlo standard errors are between 0.04403 and 0.06028 on the log scale.
We see in Figure 2 that the non-private estimate appears as a straight line with slope −1/2, reflecting the fact that its estimation error is O_p(n^{−1/2}). We also see that the exponential mechanism approaches a line with slope −1/2, but with a higher intercept, reflecting that it has increased asymptotic variance. Last, we see that the error of KNG approaches the error line of the non-private estimator, suggesting that KNG has the same asymptotic rate as the non-private estimator.
While the utility guarantees of Theorem 3.2 do not apply in this setting, as the objective function is not strongly convex, the sanitized estimates still achieve ε-DP, and we see from Figure 2 that, empirically, KNG introduces o_p(n^{−1/2}) error in this setting as well. This suggests that the assumptions in Theorem 3.2 can likely be weakened, and that KNG in fact produces efficient mechanisms for an even broader set of problems than Theorem 3.2 prescribes.

5 Conclusions

In this paper we presented a new privacy mechanism, KNG, that maintains much of the flexibility of the exponential mechanism while having substantially better utility guarantees. These guarantees are similar to those provided by objective perturbation, but privacy can be achieved with far fewer structural assumptions. A major drawback of the mechanism is the same as for gradient descent: it can have trouble with local minima or saddle points. Two interesting open questions concern the finite-sample efficiency of KNG versus objective perturbation and whether KNG can be adapted or combined with other methods to better handle multiple minima.
We also believe that KNG has a great deal of potential for handling infinite-dimensional and nonlinear problems.
For example, parameter spaces consisting of Hilbert spaces or Riemannian manifolds have structures that allow for the computation of gradients and might be amenable to KNG. With Riemannian manifolds, the gradient is often viewed as a linear mapping over tangent spaces, while in Hilbert spaces, the gradient is often treated as a linear functional. A major advantage of KNG over other mechanisms is the direct incorporation of a general K-norm. Awan et al. [2019] showed that the exponential mechanism has major problems over function spaces, which are of interest in nonparametric statistics. These issues could potentially be alleviated by KNG with a careful choice of norm. Many interesting challenges remain in data privacy, especially when there is additional complicated structure in the parameters or data.
KNG has strong connections with prior DP mechanisms, especially the exponential mechanism and objective perturbation. Indeed, like nearly every privacy mechanism, KNG can be phrased as a very particular type of exponential mechanism; however, this does not provide insight into why KNG achieves better statistical properties. In particular, a key point is to consider the objective function that motivated the original statistical summary, which, when used with KNG, produces sanitized estimators with better statistical performance than the classic implementation of the exponential mechanism.
One downside of KNG is the issue of sampling, which is similar to the exponential mechanism in that sampling from these distributions is, in general, non-trivial. We show that for mean and quantile estimation, KNG results in distributions that are efficiently sampled. However, for linear and quantile regression, we used a one-at-a-time MCMC procedure (also used for the exponential mechanism).
Just like sampling from a posterior distribution, developing a convenient sampling scheme is case-by-case, but often a simple MCMC procedure works well in practice.

Acknowledgements

This research was supported in part by NSF DMS 1712826, NSF SES 1853209, and NSF SES-153443 to The Pennsylvania State University. The first author is also grateful for the hospitality of the Simons Institute for the Theory of Computing at UC Berkeley.

References

Charu C Aggarwal and S Yu Philip. A general survey of privacy-preserving data mining models and algorithms. In Privacy-preserving data mining, pages 11–52. Springer, 2008.

Jordan Awan and Aleksandra Slavković. Differentially private uniformly most powerful tests for binomial data. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 4208–4218. Curran Associates, Inc., 2018.

Jordan Awan and Aleksandra Slavković. Structure and sensitivity in differential privacy: Comparing k-norm mechanisms. ArXiv e-prints, January 2018. Under review.

Jordan Awan, Ana Kenney, Matthew Reimherr, and Aleksandra Slavković. Benefits and pitfalls of the exponential mechanism with applications to Hilbert spaces and functional PCA. In Proceedings of the 36th International Conference on Machine Learning, ICML'19, pages 374–384. JMLR.org, 2019.

Avrim Blum, Katrina Ligett, and Aaron Roth. A learning theory approach to noninteractive database privacy. Journal of the ACM (JACM), 60(2):12, 2013.

Mark Bun and Thomas Steinke. Concentrated differential privacy: Simplifications, extensions, and lower bounds. In TCC, 2016.

Clément L Canonne, Gautam Kamath, Audra McMillan, Adam Smith, and Jonathan Ullman. The structure of optimal private tests for simple hypotheses.
In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 310–321. ACM, 2019.

Kamalika Chaudhuri and Claire Monteleoni. Privacy-preserving logistic regression. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 289–296. Curran Associates, Inc., 2009.

Kamalika Chaudhuri, Claire Monteleoni, and Anand D. Sarwate. Differentially private empirical risk minimization. Journal of Machine Learning Research, 12:1069–1109, 2011.

Kamalika Chaudhuri, Anand D. Sarwate, and Kaushik Sinha. A near-optimal algorithm for differentially-private principal components. Journal of Machine Learning Research, 14(1):2905–2943, January 2013. ISSN 1532-4435.

John C Duchi, Michael I Jordan, and Martin J Wainwright. Local privacy and statistical minimax rates. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pages 429–438. IEEE, 2013.

Cynthia Dwork and Jing Lei. Differential privacy and robust statistics. In Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing, STOC '09, pages 371–380, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-506-2. doi: 10.1145/1536414.1536466.

Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3–4):211–407, August 2014. ISSN 1551-305X. doi: 10.1561/0400000042.

Cynthia Dwork and Guy N. Rothblum. Concentrated differential privacy. CoRR, abs/1603.01887, 2016.

Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating Noise to Sensitivity in Private Data Analysis, pages 265–284. Springer Berlin Heidelberg, Berlin, Heidelberg, 2006. ISBN 978-3-540-32732-5. doi: 10.1007/11681878_14.

Cynthia Dwork, Adam Smith, Thomas Steinke, and Jonathan Ullman. Exposed! A survey of attacks on private data.
Annual Review of Statistics and Its Application, 4(1):61–84, 2017. doi: 10.1146/annurev-statistics-060116-054123.

Arik Friedman and Assaf Schuster. Data mining with differential privacy. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 493–502. ACM, 2010.

Marco Gori and Alberto Tesi. On the problem of local minima in backpropagation. IEEE Transactions on Pattern Analysis & Machine Intelligence, (1):76–86, 1992.

Lingxin Hao and Daniel Q Naiman. Quantile regression. Number 149. Sage, 2007.

Moritz Hardt and Kunal Talwar. On the geometry of differential privacy. In Proceedings of the Forty-second ACM Symposium on Theory of Computing, STOC '10, pages 705–714, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0050-6. doi: 10.1145/1806689.1806786.

Vishesh Karwa and Salil P. Vadhan. Finite sample differentially private confidence intervals. CoRR, abs/1711.03908, 2017.

Assimakis Kattis and Aleksandar Nikolov. Lower bounds for differential privacy from Gaussian width. In Boris Aronov and Matthew J. Katz, editors, 33rd International Symposium on Computational Geometry (SoCG 2017), volume 77 of Leibniz International Proceedings in Informatics (LIPIcs), pages 45:1–45:16, Dagstuhl, Germany, 2017. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. ISBN 978-3-95977-038-5. doi: 10.4230/LIPIcs.SoCG.2017.45.

D Kifer, A Smith, and A Thakurta. Private convex empirical risk minimization and high-dimensional regression. Journal of Machine Learning Research, 1:1–41, 2012.

Daniel Kifer and Bing-Rong Lin. An axiomatic view of statistical privacy and utility. Journal of Privacy and Confidentiality, 4(1):5–49, 2012.

Roger Koenker and Kevin F Hallock. Quantile regression. Journal of Economic Perspectives, 15(4):143–156, 2001.

Julia Lane, Victoria Stodden, Stefan Bender, and Helen Nissenbaum.
Privacy, big data, and the public good: Frameworks for engagement. Cambridge University Press, 2014.

Ashwin Machanavajjhala and Daniel Kifer. Designing statistical privacy for your data. Commun. ACM, 58:58–67, 2015.

Frank McSherry and Kunal Talwar. Mechanism design via differential privacy. In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science, FOCS '07, pages 94–103, Washington, DC, USA, 2007. IEEE Computer Society. ISBN 0-7695-3010-9. doi: 10.1109/FOCS.2007.41.

Kentaro Minami, Hiromi Arai, Issei Sato, and Hiroshi Nakagawa. Differential privacy without sensitivity. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems 29, pages 956–964. Curran Associates, Inc., 2016.

Stanislav Minsker et al. Geometric median and robust estimation in Banach spaces. Bernoulli, 21(4):2308–2335, 2015.

Or Sheffet. Differentially private ordinary least squares. In Doina Precup and Yee Whye Teh, editors, Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 3105–3114, International Convention Centre, Sydney, Australia, 06–11 Aug 2017. PMLR.

Adam Smith. Privacy-preserving statistical estimation with optimal convergence rates. In Proceedings of the Forty-third Annual ACM Symposium on Theory of Computing, STOC '11, pages 813–822, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0691-1. doi: 10.1145/1993636.1993743.

Shuang Song, Kamalika Chaudhuri, and Anand D. Sarwate. Stochastic gradient descent with differentially private updates. In Proceedings of the Global Conference on Signal and Information Processing, pages 245–248. IEEE, 2013.

A.W. van der Vaart. Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2000.
ISBN 9781107268449.

Yu-Xiang Wang, Stephen E. Fienberg, and Alexander J. Smola. Privacy for free: Posterior sampling and stochastic gradient Monte Carlo. In Proceedings of the 32nd International Conference on Machine Learning, ICML'15, pages 2493–2502. JMLR.org, 2015.

Larry Wasserman and Shuheng Zhou. A statistical framework for differential privacy. JASA, 105(489):375–389, 2010.

Jun Zhang, Zhenjie Zhang, Xiaokui Xiao, Yin Yang, and Marianne Winslett. Functional mechanism: Regression analysis under differential privacy. Proc. VLDB Endow., 5(11):1364–1375, July 2012. ISSN 2150-8097. doi: 10.14778/2350229.2350253.