{"title": "Fairness Behind a Veil of Ignorance: A Welfare Analysis for Automated Decision Making", "book": "Advances in Neural Information Processing Systems", "page_first": 1265, "page_last": 1276, "abstract": "We draw attention to an important, yet largely overlooked aspect of evaluating fairness for automated decision making systems---namely risk and welfare considerations. Our proposed family of measures corresponds to the long-established formulations of cardinal social welfare in economics, and is justified by the Rawlsian conception of fairness behind a veil of ignorance. The convex formulation of our welfare-based measures of fairness allows us to integrate them as a constraint into any convex loss minimization pipeline. Our empirical analysis reveals interesting trade-offs between our proposal and (a) prediction accuracy, (b) group discrimination, and (c) Dwork et al's notion of individual fairness. Furthermore and perhaps most importantly, our work provides both heuristic justification and empirical evidence suggesting that a lower-bound on our measures often leads to bounded inequality in algorithmic outcomes; hence presenting the first computationally feasible mechanism for bounding individual-level inequality.", "full_text": "Fairness Behind a Veil of Ignorance:\n\nA Welfare Analysis for Automated Decision Making\n\nHoda Heidari\nETH Z\u00fcrich\n\nhheidari@inf.ethz.ch\n\nClaudio Ferrari\n\nETH Z\u00fcrich\n\nferraric@ethz.ch\n\nKrishna P. Gummadi\n\nMPI-SWS\n\ngummadi@mpi-sws.org\n\nAndreas Krause\n\nETH Z\u00fcrich\n\nkrausea@ethz.ch\n\nAbstract\n\nWe draw attention to an important, yet largely overlooked aspect of evaluating\nfairness for automated decision making systems\u2014namely risk and welfare con-\nsiderations. Our proposed family of measures corresponds to the long-established\nformulations of cardinal social welfare in economics, and is justi\ufb01ed by the Rawl-\nsian conception of fairness behind a veil of ignorance. 
The convex formulation\nof our welfare-based measures of fairness allows us to integrate them as a con-\nstraint into any convex loss minimization pipeline. Our empirical analysis reveals\ninteresting trade-offs between our proposal and (a) prediction accuracy, (b) group\ndiscrimination, and (c) Dwork et al.\u2019s notion of individual fairness. Furthermore\nand perhaps most importantly, our work provides both heuristic justi\ufb01cation and\nempirical evidence suggesting that a lower-bound on our measures often leads to\nbounded inequality in algorithmic outcomes; hence presenting the \ufb01rst computa-\ntionally feasible mechanism for bounding individual-level inequality.\n\n1\n\nIntroduction\n\nTraditionally, data-driven decision making systems have been designed with the sole purpose of\nmaximizing some system-wide measure of performance, such as accuracy or revenue. Today, these\nsystems are increasingly employed to make consequential decisions for human subjects\u2014examples\ninclude employment [Miller, 2015], credit lending [Petrasic et al., 2017], policing [Rudin, 2013],\nand criminal justice [Barry-Jester et al., 2015]. Decisions made in this fashion have long-lasting\nimpact on people\u2019s lives and\u2014absent a careful ethical analysis\u2014may affect certain individuals or\nsocial groups negatively [Sweeney, 2013; Angwin et al., 2016; Levin, 2016]. This realization has\nrecently spawned an active area of research into quantifying and guaranteeing fairness for machine\nlearning [Dwork et al., 2012; Kleinberg et al., 2017; Hardt et al., 2016].\nVirtually all existing formulations of algorithmic fairness focus on guaranteeing equality of some\nnotion of bene\ufb01t across different individuals or socially salient groups. For instance, demographic\nparity [Kamiran and Calders, 2009; Kamishima et al., 2011; Feldman et al., 2015] seeks to equalize\nthe percentage of people receiving a particular outcome across different groups. 
Equality of op-\nportunity [Hardt et al., 2016] requires the equality of false positive/false negative rates. Individual\nfairness [Dwork et al., 2012] demands that people who are equal with respect to the task at hand\nreceive equal outcomes. In essence, the debate so far has mostly revolved around identifying the right\nnotion of bene\ufb01t and a tractable mathematical formulation for equalizing it.\n\n32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montr\u00e9al, Canada.\n\n\fFigure 1: Predictive model A assigns the same bene\ufb01t of 0.8 to everyone; model C assigns the\nsame bene\ufb01t of 1 to everyone; model B results in bene\ufb01ts (0.5, 0.6, 0.8, 0.9, 1.2), and model D,\n(0.78, 0.9, 0.92, 1.1, 1.3). Our proposed measures prefer A to B, C to D, and D to A.\n\nThe view of fairness as some form of equality is indeed an important perspective in the moral\nevaluation of algorithmic decision making systems\u2014decision subjects often compare their outcomes\nwith other similarly situated individuals, and these interpersonal comparisons play a key role in\nshaping their judgment of the system. We argue, however, that equality is not the only factor at\nplay: we draw attention to two important, yet largely overlooked aspects of evaluating fairness of\nautomated decision making systems\u2014namely risk and welfare1 considerations. The importance of\nthese factors is perhaps best illustrated via a simple example.\nExample 1 Suppose we have four decision making models A, B, C, D each resulting in a different\nbene\ufb01t distribution across 5 groups/individuals i1, i2, i3, i4, i5 (we will precisely de\ufb01ne in Section 2\nhow bene\ufb01ts are computed, but for the time being and as a concrete example, suppose bene\ufb01ts are\nequivalent to salary predictions made through different regression models). Figure 1 illustrates the\nsetting. 
Suppose a decision maker is tasked with determining which one of these alternatives is\nethically more desirable. From an inequality-minimizing perspective, A is clearly more desirable\nthan B: note that both A, B result in the same total benefit of 4, and A distributes it equally across i1,\n..., i5. With a similar reasoning, C is preferred to D. Notice, however, that by focusing on equality\nalone, A would be deemed more desirable than D, but there is an issue with this conclusion: almost\neveryone\u2014except for i1, who sees a negligible drop of 2.5% in their benefit\u2014is significantly\nbetter off under D compared to A.2 In other words, even though D results in unequal benefits and it\ndoes not Pareto-dominate A, collectively it results in higher welfare and lower risk, and therefore,\nboth intuitively and from a rational point of view, it should be considered more desirable. With a\nsimilar reasoning, the decision maker should conclude C is more desirable than A, even though both\nprovide benefits equally to all individuals.\n\nIn light of this example and inspired by the long line of research on distributive justice in economics, in\nthis paper we propose a natural family of measures for evaluating algorithmic fairness corresponding\nto the well-studied notions of cardinal social welfare in economics [Harsanyi, 1953, 1955]. Our\nproposed measures indeed prefer A to B, C to D, and D to A.\nThe interpretation of social welfare as a measure of fairness is justified by the concept of veil of\nignorance (see [Freeman, 2016] for the philosophical background). Rawls [2009] proposes \u201cveil\nof ignorance\u201d as the ideal condition/mental state under which a policy maker can select the fairest\namong a number of political alternatives. 
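The preference ordering claimed for Example 1 can be checked numerically. A minimal sketch, assuming the additive risk-averse welfare W_α(b) = Σ_i b_i^α that the paper introduces in Section 2, with an illustrative α = 0.5 (other values in (0, 1) behave similarly on this example):

```python
# Risk-averse (CRRA) social welfare of a benefit profile: W_alpha(b) = sum_i b_i^alpha.
def welfare(benefits, alpha=0.5):
    return sum(b ** alpha for b in benefits)

A = [0.8] * 5                      # equal benefits, mean 0.8
B = [0.5, 0.6, 0.8, 0.9, 1.2]      # unequal, same mean as A
C = [1.0] * 5                      # equal benefits, mean 1.0
D = [0.78, 0.9, 0.92, 1.1, 1.3]    # unequal, same mean as C

# The measures prefer A to B, C to D, and D to A, as claimed.
print(welfare(A) > welfare(B), welfare(C) > welfare(D), welfare(D) > welfare(A))
```

The first two comparisons reflect aversion to pure inequality (equal means); the third shows welfare overriding equality when means differ.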
He suggests that the policy maker performs the following\nthought experiment: imagine him/herself as an individual who knows nothing about the particular\nposition they will be born into within the society, and is tasked with selecting the most just among a set\nof alternatives. According to the utilitarian doctrine, in this hypothetical original/ex-ante position, if\nthe individual is rational, they would aim to minimize risk and insure against unlucky events in which\nthey turn out to assume the position of a low-benefit individual. Note that decision making behind a\n1We define welfare precisely in Sec. 2, but for now it can be taken as the sum of benefits across all subjects.\n2In political philosophy, this problem is sometimes referred to as the \u201cleveling down objection to equality\u201d.\n\n2\n\n\fveil of ignorance is a purely imaginary condition: the decision maker can never in actuality be in\nthis position; nonetheless, the thought experiment is useful in detaching him/her from the needs and\nwishes of a particular person/group, and consequently making a fair judgment. Our main conceptual\ncontribution is to measure fairness in the context of algorithmic decision making by evaluating it\nfrom behind a veil of ignorance: our proposal is for the ML expert wishing to train a fair decision\nmaking model (e.g. to decide whether salary predictions are to be made using a neural network or a\ndecision tree) to perform the aforementioned thought experiment: he/she should evaluate the fairness\nof each alternative by taking the perspective of the algorithmic decision making subjects\u2014but not\nany particular one of them: he/she must imagine themselves in a hypothetical setting where they\nknow they will be born as one of the subjects, but don\u2019t know in advance which one. 
We consider the\nalternative he/she deems best behind this veil of ignorance to be the fairest.\nTo formalize the above, our core idea consists of comparing the expected utility a randomly chosen,\nrisk-averse subject of algorithmic decision making receives under different predictive models. In\nthe example above, if one is to choose between models A, D without knowing which one of the 5\nindividuals they will be, then the risk associated with alternative D is much less than that of A\u2014under\nA the individual is going to receive a (relatively low) benefit of 0.8 with certainty, whereas under D\nwith high probability (i.e. 4/5) they obtain a (relatively large) benefit of 0.9 or more, and with low\nprobability (1/5) they receive a benefit of 0.78, roughly the same as the level of benefit they would\nattain under A. Such considerations of risk are precisely what our proposal seeks to quantify. We\nremark that in comparing two benefit distributions of the same mean (e.g. A, B or C, D in our earlier\nexample), our risk-averse measures always prefer the more equal one (A is preferred to B and C\nis preferred to D). See Proposition 2 for the formal statement. Thus, our measures are inherently\nequality preferring. However, the key advantage of our measures of social welfare over those focusing\non inequality manifests when, as we saw in the above example, comparing two benefit distributions\nof different means. In such conditions, inequality-based measures are insufficient and may result in\nmisleading conclusions, while risk-averse measures of social welfare are better suited to identify the\nfairest alternative. When comparing two benefit distributions of the same mean, social welfare and\ninequality always yield identical conclusions.\nFurthermore, from a computational perspective, our welfare-based measures of fairness are more\nconvenient to work with due to their convex formulation. 
This allows us to integrate them as a\nconstraint into any convex loss minimization pipeline, and solve the resulting problem efficiently\nand exactly. Our empirical analysis reveals interesting trade-offs between our proposal and (a)\nprediction accuracy, (b) group discrimination, and (c) Dwork et al.\u2019s notion of individual fairness.\nIn particular, we show how loss in accuracy increases with the degree of risk aversion, α, and as\nthe lower bound on social welfare, τ, becomes more demanding. We observe that the difference\nbetween false positive/negative rates across different social groups consistently decreases with τ. The\nimpact of our constraints on demographic parity and Dwork et al.\u2019s notion of individual fairness is\nslightly more nuanced and depends on the type of learning task at hand (regression vs. classification).\nLast but not least, we provide empirical evidence suggesting that a lower bound on social welfare\noften leads to bounded inequality in algorithmic outcomes; hence presenting the first computationally\nfeasible mechanism for bounding individual-level inequality.\n\n1.1 Related Work\n\nMuch of the existing work on algorithmic fairness has been devoted to the study of discrimination\n(also called statistical- or group-level fairness). Statistical notions require that, given a classifier, a\ncertain fairness metric is equal across all protected groups (see e.g. [Kleinberg et al., 2017; Zafar et\nal., 2017b,a]). Statistical notions of fairness fail to guarantee fairness at the individual level. Dwork et\nal. [2012] first formalized the notion of individual fairness for classification learning tasks, requiring\nthat two individuals who are similar with respect to the task at hand receive similar classification\noutcomes. 
The formulation relies on the existence of a suitable similarity metric between individuals,\nand as pointed out by Speicher et al., it does not take into account the variation in social desirability of\nvarious outcomes and people\u2019s merit for different decisions. Speicher et al. [2018] recently proposed a\nnew measure for quantifying individual unfairness, utilizing income inequality indices from economics\nand applying them to algorithmic benefit distributions. Both existing formulations of individual-level\nfairness focus solely on the inter-personal comparisons of algorithmic outcomes/benefits across\nindividuals and do not account for risk and welfare considerations. Furthermore, we are not aware of\ncomputationally efficient mechanisms for bounding either of these notions.\n\n3\n\n\fWe consider our family of measures to belong to the individual category: our welfare-based measures\ndo not require knowledge of individuals\u2019 membership in protected groups, and compose the individual-\nlevel utilities through summation. Note that Dwork et al. [2012] propose a stronger notion of\nindividual fairness\u2014one that requires a certain (minimum) condition to hold for every individual.\nAs we will see shortly, a limiting case of our proposal (the limit α → −∞) provides a similar\nguarantee in terms of benefits. While our main focus in this work is on individual-level fairness, our\nproposal can be readily extended to measure and constrain group-level unfairness.\nZafar et al. [2017c] recently proposed two preference-based notions of fairness at the group level,\ncalled preferred treatment and preferred impact. A group-conditional classifier satisfies preferred\ntreatment if no group collectively prefers another group\u2019s classifier to their own (in terms of average\nmisclassification rate). 
This definition is based on the notion of envy-freeness [Varian, 1974] in\neconomics and applies to group-conditional classifiers only. A classifier satisfies preferred impact\nif it Pareto-dominates an existing impact parity classifier (i.e. every group is better off using the\nformer classifier compared to the latter). Pareto-dominance (to be defined precisely in Section 2)\nleads to a partial ordering among alternatives and usually, in practice, does not have much bite (recall,\nfor instance, the comparison between models A, D in our earlier example). Similar to [Zafar et al.,\n2017c], our work can be thought of as a preference-based notion of fairness, but unlike their proposal,\nour measures lead to a total ordering among all alternatives, and can be utilized to measure both\nindividual and group-level (un)fairness.\nFurther discussion of related work can be found in Appendix A.\n\n2 Our Proposed Family of Measures\n\nWe consider the standard supervised learning setting: A learning algorithm receives the training\ndata set D = {(x_i, y_i)}_{i=1}^n consisting of n instances, where x_i ∈ X specifies the feature vector for\nindividual i and y_i ∈ Y, the ground truth label for him/her. The training data is sampled i.i.d. from\na distribution P on X × Y. Unless specified otherwise, we assume X ⊆ R^k, where k denotes the\nnumber of features. To avoid introducing extra notation for an intercept, we assume feature vectors\nare in homogeneous form, i.e. the kth feature value is 1 for every instance. The goal of a learning\nalgorithm is to use the training data to fit a model (or hypothesis) h : X → Y that accurately predicts\nthe label for new instances. Let H be the hypothesis class consisting of all the models the learning\nalgorithm can choose from. A learning algorithm receives D as the input; it then utilizes the data to\nselect a model h ∈ H that minimizes some notion of loss, L_D(h). 
When h is clear from the context,\nwe use ŷ_i to refer to h(x_i).\nWe assume there exists a benefit function b : Y × Y → R that quantifies the benefit an individual\nwith ground truth label y receives, if the model predicts label ŷ for them.3 The benefit function\nis meant to capture the signed discrepancy between an individual\u2019s predicted outcome and their\ntrue/deserved outcome. Throughout, for simplicity we assume higher values of ŷ correspond to more\ndesirable outcomes (e.g. loan or salary amount). With this assumption in place, a benefit function\nmust assign a high value to an individual if their predicted label is greater (better) than their deserved\nlabel, and a low value if an individual receives a predicted label less (worse) than their deserved\nlabel. The following are a few examples of benefit functions that satisfy this: b(y, ŷ) = ŷ − y;\nb(y, ŷ) = log(1 + e^{ŷ−y}); b(y, ŷ) = ŷ/y.\nIn order to maintain the convexity of our fairness constraints, throughout this work we will focus\non benefit functions that are positive and linear in ŷ. In general (e.g. when the prediction task\nis regression or multi-class classification) this limits the benefit landscape that can be expressed,\nbut in the important special case of binary classification, the following proposition establishes that\nthis restriction is without loss of generality4. That is, we can attach an arbitrary combination of\nbenefit values to the four possible (y, ŷ)-pairs (i.e. false positives, false negatives, true positives, true\nnegatives).\n\n3Our formulation allows the benefit function to depend on x and other available information about the\nindividual. As long as the formulation is linear in the predicted label ŷ, our approach remains computationally\nefficient. 
For simplicity and ease of interpretation, however, we focus on benefit functions that depend on y and\nŷ only.\n\n4All proofs can be found in Appendix B.\n\n4\n\n\fProposition 1 For y, ŷ ∈ {0, 1}, let b̄_{y,ŷ} ∈ R be arbitrary constants specifying the benefit an\nindividual with ground truth label y receives when their predicted label is ŷ. There exists a linear\nbenefit function of the form c_y ŷ + d_y such that for all y, ŷ ∈ {0, 1}, b(y, ŷ) = b̄_{y,ŷ}.\nIn order for the b̄\u2019s in the above proposition to reflect the signed discrepancy between y and ŷ, it must\nhold that b̄_{1,0} < b̄_{0,0} ≤ b̄_{1,1} < b̄_{0,1}. Given a model h, we can compute its corresponding benefit\nprofile b = (b_1, ..., b_n), where b_i denotes individual i\u2019s benefit: b_i = b(y_i, ŷ_i). A benefit profile b\nPareto-dominates b′ (or in short b ≻ b′) if for all i = 1, ..., n, b_i ≥ b′_i.\nFollowing the economic models of risk attitude, we assume the existence of a utility function\nu : R → R, where u(b) represents the utility derived from algorithmic benefit b. We will focus on\nConstant Relative Risk Aversion (CRRA) utility functions. In particular, we take u(b) = b^α, where\nα = 1 corresponds to risk-neutral, α > 1 corresponds to risk-seeking, and 0 ≤ α < 1 corresponds\nto risk-averse preferences. Our main focus in this work is on values of 0 < α < 1: the larger\none\u2019s initial benefit is, the smaller the added utility he/she derives from an increase in his/her benefit.\nWhile in principle our model can allow for different risk parameters for different individuals (α_i for\nindividual i), for simplicity throughout we assume all individuals have the same risk parameter. 
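Proposition 1's construction is easy to make concrete: matching c_y ŷ + d_y to the four target values forces d_y = b̄_{y,0} and c_y = b̄_{y,1} − b̄_{y,0}. A minimal sketch (the target benefit values are illustrative, chosen to satisfy b̄_{1,0} < b̄_{0,0} ≤ b̄_{1,1} < b̄_{0,1}):

```python
# Recover a linear benefit b(y, y_hat) = c_y * y_hat + d_y from four target values.
targets = {  # (y, y_hat) -> desired benefit; illustrative numbers only
    (1, 0): 0.0,   # false negative
    (0, 0): 1.0,   # true negative
    (1, 1): 1.0,   # true positive
    (0, 1): 1.5,   # false positive
}
c = {y: targets[(y, 1)] - targets[(y, 0)] for y in (0, 1)}  # slope per true label
d = {y: targets[(y, 0)] for y in (0, 1)}                    # offset per true label

# The linear form reproduces all four target benefits exactly.
assert all(c[y] * yh + d[y] == targets[(y, yh)] for y in (0, 1) for yh in (0, 1))
print(c, d)
```

Since b is linear in ŷ for each fixed y, this is exactly the form the convexity argument requires.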
Our\nmeasures assess the fairness of a decision making model via the expected utility a randomly chosen,\nrisk-averse individual receives as the result of being subject to decision making through that model.\nFormally, our measure is defined as follows: U_P(h) = E_{(x_i,y_i)∼P}[u(b(y_i, h(x_i)))]. We estimate this\nexpectation by U_D(h) = (1/n) Σ_{i=1}^n u(b(y_i, h(x_i))).\n\nConnection to Cardinal Welfare Our proposed family of measures corresponds to a particular\nsubset of cardinal social welfare functions. At a high level, a cardinal social welfare function is meant\nto rank different distributions of welfare across individuals as more or less desirable in terms of\ndistributive justice [Moulin, 2004]. More precisely, let W be a welfare function defined over benefit\nvectors, such that given any two benefit vectors b and b′, b is considered more desirable than b′ if\nand only if W(b) ≥ W(b′). The rich body of work on welfare economics offers several axioms to\ncharacterize the set of all welfare functions that pertain to collective rationality or fairness. Any such\nfunction, W, must satisfy the following axioms [Sen, 1977; Roberts, 1980]:\n\n1. Monotonicity: If b′ > b (component-wise), then W(b′) > W(b). That is, if everyone is better off under b′,\nthen W should strictly prefer it to b.\n2. Symmetry: W(b_1, . . . , b_n) = W(b_σ(1), ..., b_σ(n)) for any permutation σ. That is, W does not depend on the\nidentity of the individuals, but only their benefit levels.\n3. Independence of unconcerned agents: W should be independent of individuals whose\nbenefits remain at the same level. Formally, let (b|_i a) be a benefit vector that is identical to\nb, except for the ith component, which has been replaced by a. The property requires that\nfor all b, b′, a, c: W(b|_i a) ≤ W(b′|_i a) ⇔ W(b|_i c) ≤ W(b′|_i c).\n\nIt has been shown that every continuous5 social welfare function W with properties 1\u20133 is additive\nand can be represented as Σ_{i=1}^n w(b_i). 
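The empirical estimator U_D(h) above is straightforward to compute from a model's predictions. A minimal sketch, using the linear benefit b(y, ŷ) = ŷ − y + 1 from the earlier examples and an illustrative α = 0.8:

```python
def empirical_welfare(y_true, y_pred, alpha=0.8):
    """U_D(h) = (1/n) * sum_i u(b(y_i, h(x_i))) with CRRA utility u(b) = b^alpha
    and the linear benefit b(y, y_hat) = y_hat - y + 1 (positive for labels in [0, 1])."""
    n = len(y_true)
    return sum((yh - y + 1.0) ** alpha for y, yh in zip(y_true, y_pred)) / n

# A perfect predictor gives every individual benefit 1, hence U_D = 1 for any alpha.
print(empirical_welfare([0.2, 0.5, 0.9], [0.2, 0.5, 0.9]))  # 1.0
```

Note that a uniform over-prediction raises every benefit above 1 and hence U_D above 1, which is what the lower-bound constraint of Section 2.1 exploits.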
According to the Debreu-Gorman theorem [Debreu, 1959;\nGorman, 1968], if in addition to 1\u20133, W satisfies:\n\n4. Independence of common scale: For any c > 0, W(b) ≥ W(b′) ⇔ W(cb) ≥ W(cb′).\nThe simultaneous rescaling of every individual benefit should not affect the relative order\nof b, b′.\n\nthen it belongs to the following one-parameter family: W_α(b_1, . . . , b_n) = Σ_{i=1}^n w_α(b_i), where (a)\nfor α > 0, w_α(b) = b^α; (b) for α = 0, w_α(b) = ln(b); and (c) for α < 0, w_α(b) = −b^α. Note that\nthe limiting case of α → −∞ is equivalent to the leximin ordering (or Rawlsian max-min welfare).\nOur focus in this work is on 0 < α < 1. In this setting, our measures exhibit aversion to pure\ninequality. More precisely, they satisfy the following important property:\n\n5. Pigou-Dalton transfer principle [Pigou, 1912; Dalton, 1920]: Transferring benefit from\na high-benefit to a low-benefit individual must increase social welfare; that is, for any 1 ≤\ni < j ≤ n and 0 < δ < (b_(j) − b_(i))/2, W(b_(1), ..., b_(i) + δ, ..., b_(j) − δ, ..., b_(n)) > W(b).\n\n5That is, for every vector b, the set of vectors weakly better than b (i.e. {b′ : b′ ⪰ b}) and the set of vectors\nweakly worse than b (i.e. {b′ : b ⪰ b′}) are closed sets.\n\n5\n\n\f2.1 Our In-processing Method to Guarantee Fairness\n\nTo guarantee fairness, we propose minimizing loss subject to a lower bound on our measure:\n\nmin_{h∈H} L_D(h)\ns.t. U_D(h) ≥ τ\n\nwhere the parameter τ specifies a lower bound that must be picked carefully to achieve the right\ntradeoff between accuracy and fairness. 
As a concrete example, when the learning task is linear\nregression, b(y, ŷ) = ŷ − y + 1, and the degree of risk aversion is α, this optimization amounts to:\n\nmin_θ Σ_{i=1}^n (θ·x_i − y_i)^2\ns.t. Σ_{i=1}^n (θ·x_i − y_i + 1)^α ≥ τn\n\n(1)\n\nNote that both the objective function and the constraint in (1) are convex in θ; therefore, the\noptimization can be solved efficiently and exactly.\n\nConnection to Inequality Measures Speicher et al. [2018] recently proposed quantifying\nindividual-level unfairness utilizing a particular inequality index, called generalized entropy. This\nmeasure satisfies four important axioms: symmetry, population invariance, 0-normalization6, and\nthe Pigou\u2013Dalton transfer principle. Our measures satisfy all the aforementioned axioms, except\nfor 0-normalization. Additionally, and in contrast with measures of inequality\u2014where the goal\nis to capture interpersonal comparison of benefits\u2014our measure is monotone and independent of\nunconcerned agents. The latter two are the fundamental properties that set our proposal apart from\nmeasures of inequality.\nDespite these fundamental differences, we will shortly observe in Section 3 that lower-bounding our\nmeasures often in practice leads to low inequality. Proposition 2 provides a heuristic explanation\nfor this: Imposing a lower bound on social welfare is equivalent to imposing an upper bound on\ninequality if we restrict attention to the region where benefit vectors are all of the same mean. More\nprecisely, for a fixed mean benefit value, our proposed measure of fairness results in the same total\nordering as Atkinson\u2019s index [Atkinson, 1970]. The index is defined as follows:\n\nA_ε(b_1, . . . , b_n) = 1 − (1/μ) ((1/n) Σ_{i=1}^n b_i^{1−ε})^{1/(1−ε)}   for 0 ≤ ε, ε ≠ 1\nA_ε(b_1, . . . , b_n) = 1 − (1/μ) (Π_{i=1}^n b_i)^{1/n}                  for ε = 1\n\nwhere μ = (1/n) Σ_{i=1}^n b_i is the mean benefit. 
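The constrained program (1) can be handed to any off-the-shelf convex solver. The sketch below instead illustrates its effect with a deliberately crude one-dimensional heuristic of ours (not the paper's solver): starting from the unconstrained least-squares fit, raise only the intercept until the welfare constraint Σ_i (θ·x_i − y_i + 1)^α ≥ τn holds. The data, α = 0.8, and τ = 1.2 are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(0, 1, 50), np.ones(50)])  # last column: intercept
y = 0.5 * X[:, 0] + rng.normal(0, 0.1, 50)

theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)  # unconstrained least squares

alpha, tau = 0.8, 1.2  # risk aversion and welfare lower bound (illustrative)

def welfare(theta):
    b = X @ theta - y + 1.0                        # benefits b_i = theta.x_i - y_i + 1
    return np.sum(np.clip(b, 0.0, None) ** alpha) / len(y)

# Crude feasibility heuristic: shift the intercept upward until U_D >= tau.
theta = theta_ls.copy()
while welfare(theta) < tau:
    theta[-1] += 0.01

print(round(theta[-1] - theta_ls[-1], 2), welfare(theta) >= tau)
```

Besides demonstrating feasibility, this previews the empirical observation in Section 3 that the fitted intercept rises as τ grows.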
Atkinson\u2019s inequality index is a welfare-based measure\nof inequality: The measure compares the actual average benefit individuals receive under benefit\ndistribution b (i.e. μ) with its Equally Distributed Equivalent (EDE)\u2014the level of benefit that, if\nobtained by every individual, would result in the same level of welfare as that of b (i.e. ((1/n) Σ_{i=1}^n b_i^{1−ε})^{1/(1−ε)}).\nIt is easy to verify that for 0 < α < 1, the generalized entropy and Atkinson index result in the same\ntotal ordering among benefit distributions (see Proposition 3 in Appendix B). Furthermore, for a\nfixed mean benefit μ, our measure results in the same indifference curves and total ordering as the\nAtkinson index with ε = 1 − α.\nProposition 2 Consider two benefit vectors b, b′ ≥ 0 with equal means (μ = μ′). For 0 < α < 1,\nA_{1−α}(b) ≥ A_{1−α}(b′) if and only if W_α(b) ≤ W_α(b′).\nTradeoffs Among Different Notions of Fairness We end this section by establishing the existence\nof multilateral tradeoffs among social welfare, accuracy, individual, and statistical notions of fairness.\nWe illustrate this by finding the predictive model that optimizes each of these quantities. In Table 1\nwe compare these optimal predictors in two different cases: 1) In the realizable case, we assume\nthe existence of a hypothesis h* ∈ H such that y = h*(x), i.e., h* achieves perfect prediction\naccuracy. 2) In the unrealizable case, we assume the existence of a hypothesis h* ∈ H such\nthat h*(x) = E[y|x], i.e., h* is the Bayes Optimal Predictor. 
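Proposition 2's equivalence is easy to exercise numerically. A minimal sketch (the benefit vectors and α = 0.5 are illustrative; both vectors have mean 1, as the proposition requires):

```python
def atkinson(benefits, eps):
    """Atkinson index A_eps for 0 <= eps < 1 (the eps != 1 branch above)."""
    n = len(benefits)
    mu = sum(benefits) / n
    ede = (sum(b ** (1 - eps) for b in benefits) / n) ** (1 / (1 - eps))
    return 1 - ede / mu  # 0 for a perfectly equal distribution

def welfare(benefits, alpha):
    return sum(b ** alpha for b in benefits)

alpha = 0.5
b1 = [1.0, 1.0, 1.0]   # perfectly equal, mean 1
b2 = [0.5, 1.0, 1.5]   # unequal, same mean 1

# With eps = 1 - alpha and equal means: lower inequality <=> higher welfare.
print(atkinson(b1, 1 - alpha) < atkinson(b2, 1 - alpha),
      welfare(b1, alpha) > welfare(b2, alpha))
```

For vectors of different means the two criteria can disagree, which is exactly the gap the welfare-based measures are meant to close.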
We use the following notation:\n\n6\n\n\fTable 1: Optimal predictions with respect to different fairness notions.\n\n                          Regression                                   Classification\n                          Realizable               Unrealizable        Realizable               Unrealizable\nSocial welfare            ŷ ≡ ymax                 ŷ ≡ ymax            ŷ ≡ 1                    ŷ ≡ 1\nAtkinson index            ŷ = h*(x)                ŷ ≡ ymax            ŷ = h*(x)                ŷ ≡ 1\nDwork et al.\u2019s notion     ŷ ≡ c                    ŷ ≡ c               ŷ ≡ 0 or 1               ŷ ≡ 1 or 0\nMean difference           ŷ ≡ c                    ŷ ≡ c               ŷ ≡ 0 or 1               ŷ ≡ 1 or 0\nPositive residual diff.   ŷ ≡ ymin or ŷ = h*(x)    ŷ ≡ ymin            ŷ ≡ 0 or ŷ = h*(x)       ŷ ≡ 0\nNegative residual diff.   ŷ ≡ ymax or ŷ = h*(x)    ŷ ≡ ymax            ŷ ≡ 1 or ŷ = h*(x)       ŷ ≡ 1\n\nHere ymax = max_{h∈H, x∈X} h(x) and ymin = min_{h∈H, x∈X} h(x). The precise definition of each notion in\nTable 1 can be found in Appendix C.\nAs illustrated in Table 1, there is no unique predictor that simultaneously optimizes social welfare,\naccuracy, individual, and statistical notions of fairness. Take unrealizable classification as an\nexample. Optimizing for accuracy requires the predictions to follow the Bayes optimal classifier.\nA lower bound on social welfare requires the model to predict the desirable outcome (i.e. 1) for a\nlarge fraction of the population. To guarantee low positive residual difference, all individuals must be\npredicted to belong to the negative class. In the next section, we will investigate these tradeoffs in\nmore detail and through experiments on two real-world datasets.\n\n3 Experiments\n\nIn this section, we empirically illustrate our proposal, and investigate the tradeoff between our family\nof measures and accuracy, as well as existing definitions of group discrimination and individual\nfairness. 
We ran our experiments on a classification data set (Propublica\u2019s COMPAS dataset [Larson\net al., 2016]), as well as a regression dataset (Crime and Communities data set [Lichman, 2013]).7\nFor regression, we defined the benefit function as follows: b(y, ŷ) = ŷ − y + 1. On the Crime data\nset this results in benefit levels between 0 and 2. For classification, we defined the benefit function as\nfollows: b(y, ŷ) = c_y ŷ + d_y where y ∈ {−1, 1}, c_1 = 0.5, d_1 = 0.5, and c_{−1} = 0.25, d_{−1} = 1.25.\nThis results in benefit levels of 0 for false negatives, 1 for true positives and true negatives, and 1.5 for\nfalse positives.\n\nWelfare as a Measure of Fairness Our proposed family of measures is relative by design: It allows\nfor meaningful comparison among different unfair alternatives. Furthermore, there is no unique\nvalue of our measures that always corresponds to perfect fairness. This is in contrast with previously\nproposed, absolute notions of fairness, which characterize the condition of perfect fairness\u2014as\nopposed to measuring the degree of unfairness of various unfair alternatives. We start our empirical\nanalysis by illustrating that our proposed measures can compare and rank different predictive models.\nWe trained the following models on the COMPAS dataset: a multi-layered perceptron, fully connected\nwith one hidden layer with 100 units (NN), the AdaBoost classifier (Ada), Logistic Regression (LR),\na decision tree classifier (Tree), and a nearest neighbor classifier (KNN). Figure 2 illustrates how these\nlearning models compare with one another according to accuracy, Atkinson index, and social welfare.\nAll values were computed using 20-fold cross validation. The confidence intervals are formed\nassuming samples come from Student\u2019s t distribution. As shown in Figure 2, the rankings obtained\nfrom the Atkinson index and social welfare are identical. 
Note that this is consistent with Proposition 2: given that all models result in similar mean benefits, we expect the rankings to be consistent.

[6] 0-normalization requires the inequality index to be 0 if and only if the distribution is perfectly equal/uniform.
[7] A more detailed description of the datasets and our preprocessing steps can be found in Appendix C.

Figure 2: Comparison of different learning models according to accuracy, social welfare (α = 0.8), and Atkinson index (ε = 0.2). The mean benefits are 0.97 for LogReg, 0.96 for NN, 0.96 for AdaBoost, 0.89 for KNN, and 0.89 for Tree. Note that for the Atkinson measure, smaller values correspond to fairer outcomes, whereas for social welfare larger values reflect greater fairness.

Figure 3: (a) Changes in weights (θ in linear and logistic regression) as a function of τ. Note the continuous rise of the intercept with τ. (b) Atkinson index as a function of the threshold τ. Note the consistent decline in inequality as τ increases. (c) Average violation of Dwork et al.'s constraints as a function of τ. Trends differ for regression and classification.

Impact on Model Parameters  Next, we study the impact of changing τ on the trained model parameters (see Figure 3a). We observe that as τ increases, the intercept continually rises to guarantee high levels of benefit and social welfare. On the COMPAS dataset, we notice an interesting trend for the binary feature sex (0 is female, 1 is male): initially, being male has a negative weight and thus a negative impact on the classification outcome, but as τ increases, the sign changes to positive to ensure men also receive high benefits. The trade-offs between our proposed measure and prediction accuracy can be found in Figure 5 in Appendix C.
As one may expect, imposing more restrictive fairness constraints (larger τ and smaller α) results in a higher loss of accuracy.

Next, we empirically investigate the trade-off between our family of measures and existing definitions of group discrimination and individual fairness. Note that since our proposed family of measures is relative, we believe it is more suitable to focus on trade-offs as opposed to impossibility results. (Existing impossibility results, e.g. [Kleinberg et al., 2017], establish that a number of absolute notions of fairness cannot hold simultaneously.)

Trade-offs with Individual Notions  Figures 3b and 3c illustrate the impact of bounding our measure on existing individual measures of fairness. As expected, we observe that higher values of τ (i.e., higher social welfare) consistently result in lower inequality. Note that for classification, τ cannot be arbitrarily large (due to the infeasibility of achieving arbitrarily large social welfare levels). Also as expected, smaller α values (i.e., higher degrees of risk aversion) lead to a faster drop in inequality. The impact of our mechanism on the average violation of Dwork et al.'s constraints is slightly more nuanced: as τ increases, the average violation of Dwork et al.'s pairwise constraints initially goes down. For classification, the decline continues until the measure reaches 0, which is what we expect once almost every individual receives the positive label. For regression, in contrast, the initial decline is followed by a phase in which the measure quickly climbs back up to its initial (high) value. The reason is that for larger values of τ, the high level of social welfare is achieved mainly by adding a large intercept to the unconstrained model's predictions (see Figure 3a). Due to its translation-invariance property, the addition of an intercept cannot limit the average violation of Dwork et al.'s constraints.

Figure 4: Group discrimination as a function of τ for different values of α. (a) Negative residual difference decreases with τ and approaches 0. (b) Positive residual difference monotonically approaches a certain asymptote. (c) Note the striking similarity of the patterns for the average violation of Dwork et al.'s constraints and mean difference.

Trade-offs with Statistical Notions  Next, we illustrate the impact of bounding our measure on statistical measures of fairness. For the Communities and Crime dataset, we assumed a neighborhood belongs to the protected group if and only if the majority of its residents are non-Caucasian, that is, the combined percentage of African-American, Hispanic, and Asian residents of the neighborhood is above 50%. For the COMPAS dataset, we took race as the sensitive feature. Figure 4a shows the impact of τ and α on false negative rate difference and its continuous counterpart, negative residual difference. As expected, both quantities decrease with τ until they reach 0, at which point everyone receives a label at least as large as their ground truth. The trends are similar for false positive rate difference and its continuous counterpart, positive residual difference (Figure 4b). Note that in contrast to classification, on our regression dataset, even though positive residual difference decreases with τ, it never reaches 0. Figure 4c shows the impact of τ and α on demographic parity and its continuous counterpart, mean difference. Note the striking similarity between this plot and Figure 3c. Here again, for large values of τ, guaranteeing high social welfare requires adding a large intercept to the unconstrained model's predictions.
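The intercept effect can be verified directly: adding a constant to every prediction raises each individual's benefit, and hence welfare, while any translation-invariant quantity such as the between-group mean difference is unchanged. Below is a minimal sketch on hypothetical data; the regression benefit b(y, ŷ) = ŷ − y + 1 and the welfare form (1/n) Σ_i b_i^α are assumed.

```python
def benefits(ys, preds):
    # Regression benefit b(y, y_hat) = y_hat - y + 1
    return [p - y + 1 for y, p in zip(ys, preds)]

def welfare(bs, alpha=0.8):
    # Average of the concave utility b**alpha (assumed welfare form)
    return sum(b ** alpha for b in bs) / len(bs)

def mean_difference(preds, protected):
    # Absolute difference of mean predictions between the two groups
    g1 = [p for p, s in zip(preds, protected) if s]
    g0 = [p for p, s in zip(preds, protected) if not s]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))

ys = [0.2, 0.4, 0.1, 0.6]            # hypothetical ground truth
preds = [0.3, 0.2, 0.4, 0.5]         # hypothetical unconstrained predictions
protected = [True, True, False, False]

shifted = [p + 0.5 for p in preds]   # add a large intercept

# Welfare rises, but the translation-invariant mean difference does not move.
assert welfare(benefits(ys, shifted)) > welfare(benefits(ys, preds))
assert abs(mean_difference(shifted, protected)
           - mean_difference(preds, protected)) < 1e-9
```

This is the mechanism behind the rebound in Figures 3c and 4c: a constant shift buys welfare without ever tightening a translation-invariant fairness measure.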
See Proposition 4 in Appendix B, where we formally prove this point for the special case of linear predictors. The addition of an intercept in this fashion cannot put an upper bound on a translation-invariant measure like mean difference.

4 Summary and Future Directions

Our work makes an important connection between the growing literature on fairness for machine learning and the long-established formulations of cardinal social welfare in economics. Thanks to their convexity, our measures can be bounded as part of any convex loss minimization program. We provided evidence suggesting that constraining our measures often leads to bounded inequality in algorithmic outcomes. Our focus in this work was on a normative theory of how rational individuals should compare different algorithmic alternatives. We plan to extend our framework to descriptive behavioural theories, such as prospect theory [Kahneman and Tversky, 2013], to explore human perceptions of fairness and contrast them with normative prescriptions.

Acknowledgments

H. Heidari and A. Krause acknowledge support from CTI grant no. 27248.1 PFES-ES. Krishna P. Gummadi was supported in part by a European Research Council (ERC) Advanced Grant "Foundations for Fair Social Computing" (No. 789373).

References

Yoram Amiel and Frank A. Cowell. Inequality, welfare and monotonicity. In Inequality, Welfare and Poverty: Theory and Measurement, pages 35–46. Emerald Group Publishing Limited, 2003.

Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. Machine bias. ProPublica, 2016.

Anthony B. Atkinson. On the measurement of inequality. Journal of Economic Theory, 2(3):244–263, 1970.

Anna Barry-Jester, Ben Casselman, and Dana Goldstein. The new science of sentencing. The Marshall Project, August 2015.

Toon Calders, Asim Karim, Faisal Kamiran, Wasif Ali, and Xiangliang Zhang. Controlling attribute effect in linear regression.
In Proceedings of the International Conference on Data Mining, pages 71–80. IEEE, 2013.

Fredrik Carlsson, Dinky Daruvala, and Olof Johansson-Stenman. Are people inequality-averse, or just risk-averse? Economica, 72(287):375–396, 2005.

Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq. Algorithmic decision making and the cost of fairness. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 797–806. ACM, 2017.

Frank A. Cowell and Erik Schokkaert. Risk perceptions and distributional judgments. European Economic Review, 45(4-6):941–952, 2001.

Camilo Dagum. On the relationship between income inequality measures and social welfare functions. Journal of Econometrics, 43(1-2):91–102, 1990.

Hugh Dalton. The measurement of the inequality of incomes. The Economic Journal, 30(119):348–361, 1920.

Gerard Debreu. Topological methods in cardinal utility theory. Technical report, Cowles Foundation for Research in Economics, Yale University, 1959.

Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. Fairness through awareness. In Proceedings of the Innovations in Theoretical Computer Science Conference, pages 214–226. ACM, 2012.

Michael Feldman, Sorelle A. Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. Certifying and removing disparate impact. In Proceedings of the International Conference on Knowledge Discovery and Data Mining, pages 259–268. ACM, 2015.

Samuel Freeman. Original position. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, winter 2016 edition, 2016.

William M. Gorman. The structure of utility functions. The Review of Economic Studies, 35(4):367–390, 1968.

Moritz Hardt, Eric Price, and Nati Srebro.
Equality of opportunity in supervised learning. In Proceedings of Advances in Neural Information Processing Systems, pages 3315–3323, 2016.

John C. Harsanyi. Cardinal utility in welfare economics and in the theory of risk-taking. Journal of Political Economy, 61(5):434–435, 1953.

John C. Harsanyi. Cardinal welfare, individualistic ethics, and interpersonal comparisons of utility. Journal of Political Economy, 63(4):309–321, 1955.

Daniel Kahneman and Amos Tversky. Prospect theory: An analysis of decision under risk. In Handbook of the Fundamentals of Financial Decision Making: Part I, pages 99–127. World Scientific, 2013.

Faisal Kamiran and Toon Calders. Classifying without discriminating. In Proceedings of the 2nd International Conference on Computer, Control and Communication, pages 1–6. IEEE, 2009.

Toshihiro Kamishima, Shotaro Akaho, and Jun Sakuma. Fairness-aware learning through regularization approach. In Proceedings of the International Conference on Data Mining Workshops, pages 643–650. IEEE, 2011.

Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan. Inherent trade-offs in the fair determination of risk scores. In Proceedings of the 8th Innovations in Theoretical Computer Science Conference, 2017.

Jeff Larson, Surya Mattu, Lauren Kirchner, and Julia Angwin. Data and analysis for 'How we analyzed the COMPAS recidivism algorithm'. https://github.com/propublica/compas-analysis, 2016.

Sam Levin. A beauty contest was judged by AI and the robots didn't like dark skin. The Guardian, 2016.

M. Lichman. UCI machine learning repository: Communities and Crime data set. http://archive.ics.uci.edu/ml/datasets/Communities+and+Crime, 2013.

Clair Miller. Can an algorithm hire better than a human? The New York Times, June 25, 2015. Retrieved 4/28/2016.

Hervé Moulin. Fair division and collective welfare.
MIT Press, 2004.

Kevin Petrasic, Benjamin Saul, James Greig, and Matthew Bornfreund. Algorithms and bias: What lenders need to know. White & Case, 2017.

Arthur Cecil Pigou. Wealth and welfare. Macmillan and Company, Limited, 1912.

John Rawls. A theory of justice. Harvard University Press, 2009.

Kevin W. S. Roberts. Interpersonal comparability and social choice theory. The Review of Economic Studies, pages 421–439, 1980.

Cynthia Rudin. Predictive policing using machine learning to detect patterns of crime. Wired Magazine, August 2013. Retrieved 4/28/2016.

Joseph Schwartz and Christopher Winship. The welfare approach to measuring inequality. Sociological Methodology, 11:1–36, 1980.

Amartya Sen. On weights and measures: informational constraints in social welfare analysis. Econometrica: Journal of the Econometric Society, pages 1539–1572, 1977.

Till Speicher, Hoda Heidari, Nina Grgic-Hlaca, Krishna P. Gummadi, Adish Singla, Adrian Weller, and Muhammad Bilal Zafar. A unified approach to quantifying algorithmic unfairness: Measuring individual and group unfairness via inequality indices. In Proceedings of the International Conference on Knowledge Discovery and Data Mining, 2018.

Latanya Sweeney. Discrimination in online ad delivery. Queue, 11(3):10, 2013.

Hal R. Varian. Equity, envy, and efficiency. Journal of Economic Theory, 9(1):63–91, 1974.

Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, and Krishna P Gummadi. Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In Proceedings of the 26th International Conference on World Wide Web, pages 1171–1180, 2017.

Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, and Krishna P. Gummadi. Fairness constraints: Mechanisms for fair classification.
In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017.

Muhammad Bilal Zafar, Isabel Valera, Manuel Rodriguez, Krishna Gummadi, and Adrian Weller. From parity to preference-based notions of fairness in classification. In Proceedings of Advances in Neural Information Processing Systems, pages 228–238, 2017.