NeurIPS 2019
Sun Dec 8th – Sat Dec 14th, 2019, Vancouver Convention Center
Reviewer 1
Originality: Although other papers might have noted the undesirability of randomness, this seems to be the first paper to formulate the question by providing a metric for how close a deterministic classifier is to a stochastic classifier.

Quality: The proofs are generally written very clearly, and even when they are not in the main body, enough explanation is given to make the results quite intuitive. Overall, both the approaches themselves and the presentation of the results are very clean.

Clarity: The general flow of the paper is smooth: starting with motivating reasons, sketching the extent to which the problem can be solved, providing different methods that complement each other, working through the underlying tension in the problem, and so on. The presentation of the experiments (e.g., the figures) was also very easy to read.

Significance: As described above, the paper studies an interesting, well-motivated problem and provides clean approaches to it. Along with theoretical guarantees, the experimental results validate the usefulness of the methods.

Typos:
- 137: there seems to be an extra parenthesis in the equation.
- 490 (Theorem 4 statement): instead of Pr(...], it should be Pr(...) or Pr[...], right?

**** POST REBUTTAL ****
The authors have clarified the issues, and I think the paper is still interesting, so I will keep my evaluation the same.
Reviewer 2
This is a well-written and thoughtful paper that introduces the formal study of how best to turn stochastic classifiers into deterministic classifiers. In addition to providing and justifying relevant definitions, it establishes important initial theoretical results and presents new algorithms. This is an excellent paper on an important topic.
Reviewer 3
The paper is easy to follow. The authors motivate why deterministic classifiers might be preferable to stochastic classifiers: they are easier to debug, 'seem' more fair, and are not susceptible to failures when repeatedly used to classify the same point. The authors discuss the orderliness of classifiers, i.e., classifiers classifying similar points similarly, and try to relate it to group fairness vs. individual fairness. They claim that less orderly classifiers are likely to achieve better group fairness metrics, while orderly classifiers are better for individual fairness. I agree with the second part of the statement, but the first half does not seem necessarily true to me, and the authors do not motivate why, or whether, this tradeoff actually exists, i.e., why can't orderly classifiers (in a continuous sense) be better at group fairness?

In the experiment section, they compare their hashing-based and binning-based methods with a thresholding method on a fairness task of achieving similar ROC curves for the whole population and the protected class, approximating a stochastic classifier obtained using techniques from Cotter et al. (ALT 2019, JMLR 2019). They show that the deterministic classifiers obtained perform close to the stochastic one and are better than the threshold classifier. They then show that, on the task of histogram matching, as the number of bins increases, the group fairness metrics improve but orderliness decreases. I feel this does not necessarily imply that less orderliness is necessary for group fairness.

Overall, the paper gives new and original ideas about using hashing to make deterministic classifiers from stochastic classifiers, but I feel the discussion about fairness and orderliness fails to motivate the significance of the result. Also, it is not easy to interpret how the guarantees compare with the lower bound, as no explicit discussion is provided by the authors.
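[Editor's note] To make the hashing idea discussed in this review concrete, below is a minimal Python sketch, assuming a binary stochastic classifier that outputs a positive-class probability p(x). This illustrates the general technique the review describes, not the paper's exact construction; the function name derandomize and the use of SHA-256 are illustrative choices, not from the paper.

    import hashlib

    def derandomize(p, x):
        # Derive a fixed pseudo-random value u in [0, 1) from the input
        # itself, so the same point always receives the same label.
        digest = hashlib.sha256(repr(x).encode("utf-8")).digest()
        u = int.from_bytes(digest[:8], "big") / 2.0**64
        # Label positive exactly when the hashed value falls below the
        # stochastic classifier's positive-class probability p(x).
        return 1 if u < p(x) else 0

    # Toy usage: a stochastic classifier that is 70% positive everywhere.
    p = lambda x: 0.7
    labels = [derandomize(p, (i,)) for i in range(10000)]
    print(sum(labels) / len(labels))  # roughly 0.7 across distinct points

Because u is a deterministic function of x, repeated queries on the same point always agree, while across a population of distinct points the fraction of positive labels approximately tracks the stochastic classifier's expected rate. The binning-based method mentioned in the review is a different construction and is not sketched here.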