NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Reviewer 1
The authors studied the sample complexity of semi-private learning of VC classes. The paper is well written and clear. The first main result is that any VC class can be (agnostically) learned with roughly VC(H)/α^2 private samples and VC(H)/α public samples. The algorithm is the same as in [BNS13], which studied the realizable case, but the analysis in this paper is novel. The second main result is that when H has infinite Littlestone dimension, any semi-private learner must have public sample complexity at least 1/α (up to constants), so the upper bound is nearly tight. The proof is based on a result from a recent STOC paper together with a new "public data reduction lemma". As a consequence of the lemma, the authors showed a dichotomy for pure semi-private learning. Overall, this paper provides solid and near-optimal theoretical results on semi-private learning. I suggest acceptance.
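For concreteness, the bounds described above can be restated in asymptotic notation, writing n_priv and n_pub for the private and public sample sizes (this is a paraphrase of the review, with constants and the dependence on the privacy and confidence parameters suppressed, not the paper's exact theorem statements):

\[
  n_{\mathrm{priv}} = O\!\left(\frac{\mathrm{VC}(H)}{\alpha^{2}}\right),
  \qquad
  n_{\mathrm{pub}} = O\!\left(\frac{\mathrm{VC}(H)}{\alpha}\right),
\]

and, when H has infinite Littlestone dimension, any semi-private learner requires

\[
  n_{\mathrm{pub}} = \Omega\!\left(\frac{1}{\alpha}\right).
\]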
Reviewer 2
Originality: The methods used in this work are simple and, for the most part, standard in the learning theory literature. The idea of learning α-coverings is nice and, to the best of my knowledge, new.
Quality: The statements of the theorems and lemmas are careful and make sense. Every proof I checked seems correct. I would prefer, though, that the authors state more clearly that their results are significant only in the agnostic setting; in the realizable setting, I think a sample of size VC(H)/α is sufficient and the private sample can be ignored.
Clarity: I consider the paper very well written and the exposition sufficiently clear.
Significance: The paper seems significant, and I can see future research citing this work and using its techniques.
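For reference, one standard way to formalize the α-covering notion mentioned above (the paper's own definition may differ in details, e.g., it may be taken with respect to the empirical distribution over the public sample):

\[
  H' \text{ is an } \alpha\text{-cover of } H \text{ with respect to } \mathcal{D}
  \quad\Longleftrightarrow\quad
  \forall h \in H \;\; \exists h' \in H' :\;
  \Pr_{x \sim \mathcal{D}}\bigl[\, h(x) \neq h'(x) \,\bigr] \le \alpha .
\]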
Reviewer 3
[Upper bound] Every hypothesis class H can be learned up to excess error α by a pure semi-private algorithm whose private sample complexity is (roughly) VC(H)/α^2 and whose public sample complexity is (roughly) VC(H)/α. For agnostic learning, VC(H)/α^2 examples are necessary even without privacy constraints. The idea of the algorithm is to use the public data to construct a finite class H′ that forms a "good approximation" of the original class H, and then reduce the problem to DP learning of a finite class; this approximation is captured via the notion of an α-covering (a code sketch of this two-stage approach is included after this review).

[Lower bound] Assume H has infinite Littlestone dimension. Then any approximate semi-private learner for H must have public sample complexity Ω(1/α). The lower bound boils down to a public-data-reduction lemma, which shows that a semi-private learner whose public sample complexity is << 1/α can be transformed into a completely private learner; since a class of infinite Littlestone dimension admits no completely private learner, the bound follows.

Overall, this paper is clear, well written, and makes a good contribution. In my view, it is around the conference's acceptance threshold.
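For concreteness, here is a minimal sketch of the two-stage approach described in the upper bound, under stated assumptions: build_cover is a hypothetical placeholder standing in for the paper's construction of a finite α-cover from the public sample, and the private stage is shown as an exponential-mechanism selection over the cover (a standard pure-DP learner for finite classes; the paper's actual mechanism may differ).

import numpy as np

def semi_private_learn(public_xs, private_xs, private_ys,
                       hypotheses, alpha, epsilon, rng=None):
    """Sketch: public data builds a finite alpha-cover of the class; private
    data selects a low-error hypothesis from the cover via the exponential
    mechanism."""
    rng = rng if rng is not None else np.random.default_rng()

    # Stage 1 (public, no privacy needed): build a finite alpha-cover H' of H.
    # build_cover is a hypothetical helper standing in for the paper's construction.
    cover = build_cover(public_xs, hypotheses, alpha)

    # Stage 2 (private): exponential mechanism over the finite cover, with
    # utility equal to minus the empirical error on the private sample.
    n = len(private_xs)
    errors = np.array([
        np.mean([h(x) != y for x, y in zip(private_xs, private_ys)])
        for h in cover
    ])
    # The empirical error has sensitivity 1/n, so scores are scaled by epsilon * n / 2.
    scores = -errors * epsilon * n / 2.0
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return cover[rng.choice(len(cover), p=probs)]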