{"title": "Fair Algorithms for Clustering", "book": "Advances in Neural Information Processing Systems", "page_first": 4954, "page_last": 4965, "abstract": "We study the problem of finding low-cost {\\em fair clusterings} in data where each data point may belong to many protected groups. Our work significantly generalizes the seminal work of Chierichetti \\etal (NIPS 2017) as follows.\n - We allow the user to specify the parameters that define fair representation. More precisely, these parameters define the maximum over- and minimum under-representation of any group in any cluster.\n - Our clustering algorithm works on any $\\ell_p$-norm objective (e.g. $k$-means, $k$-median, and $k$-center). Indeed, our algorithm transforms any vanilla clustering solution into a fair one incurring only a slight loss in quality.\n - Our algorithm also allows individuals to lie in multiple protected groups. \n In other words, we do not need the protected groups to partition the data and we can maintain fairness across different groups simultaneously.\n\nOur experiments show that on established data sets, our algorithm performs much better in practice than what our theoretical results suggest.", "full_text": "Fair Algorithms for Clustering\n\nSuman K. Bera\nUC Santa Cruz\n\nSanta Cruz, CA 95064\n\nsbera@ucsc.edu\n\nNicolas J. Flores\nDartmouth College\nHanover, NH 03755\n\nDeeparnab Chakrabarty\n\nDartmouth College\nHanover, NH 03755\n\ndeeparnab@dartmouth.edu\n\nMaryam Negahbani\nDartmouth College\nHanover, NH 03755\n\nnicolasflores.19@dartmouth.edu\n\nmaryam@cs.dartmouth.edu\n\nAbstract\n\nWe study the problem of \ufb01nding low-cost fair clusterings in data where each data\npoint may belong to many protected groups. Our work signi\ufb01cantly generalizes the\nseminal work of Chierichetti et al. 
(NIPS 2017) as follows.\n\n• We allow the user to specify the parameters that define fair representation. More precisely, these parameters define the maximum over- and minimum under-representation of any group in any cluster.\n\n• Our clustering algorithm works on any ℓp-norm objective (e.g. k-means, k-median, and k-center). Indeed, our algorithm transforms any vanilla clustering solution into a fair one incurring only a slight loss in quality.\n\n• Our algorithm also allows individuals to lie in multiple protected groups. In other words, we do not need the protected groups to partition the data and we can maintain fairness across different groups simultaneously.\n\nOur experiments show that on established data sets, our algorithm performs much better in practice than what our theoretical results suggest.\n\n1 Introduction\n\nMany important decisions today are made by machine learning algorithms. These range from showing advertisements to customers [49, 23], to awarding home loans [38, 46], to predicting recidivism [6, 24, 21]. It is important to ensure that such algorithms are fair and are not biased towards or against some specific groups in the population. A considerable amount of work [37, 57, 19, 36, 15, 56, 55] addressing this issue has emerged in recent years.\nOur paper considers fair algorithms for clustering. Clustering is a fundamental unsupervised learning problem where one wants to partition a given data set. In machine learning, clustering is often used for feature generation and enhancement as well. It is thus important to consider bias and unfairness issues when inspecting the quality of clusters. The question of fairness in clustering was first asked in the beautiful paper of Chierichetti et al. [19], with subsequent generalizations by Rösner and Schmidt [50].\n\nIn this paper, we give a much more general and tunable notion of fairness in clustering than that in [19, 50]. 
Our main result is that any solution for a wide suite of vanilla clustering objectives can be transformed into a fair solution under our notion, with only a slight loss in quality, by a simple algorithm.\n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.\n\nMany works in fairness [15, 19, 50, 14] work within the disparate impact (DI) doctrine [28]. Broadly speaking, the doctrine posits that any “protected class” must have approximately equal representation in the decisions taken (by an algorithm). Although the DI doctrine is a law [1, 27] in the United States, violating the DI doctrine is by itself not illegal [3]; it is illegal only if the violation cannot be justified by the decision maker. In the clustering setting, this translates to the following algorithmic question: what is the loss in quality of the clustering when all protected classes are required to have approximately equal representation in the clusters returned?\nMotivated thus, Chierichetti et al. [19], and later Rösner and Schmidt [50], model the set of points as partitioned into ℓ colors, and the color proportion of each returned cluster should be similar to that in the original data. There are three shortcomings of these papers: (a) the fairness constraint was too stringent and brittle, (b) good algorithms were given only for the k-center objective, and (c) the color classes weren’t allowed to overlap. We remark that the last restriction is limiting since an individual can lie in multiple protected classes (consider an African-American senior woman). In our work we address all these concerns: we allow the user to specify the fairness constraints, we give simple algorithms with provable theoretical guarantees for a large suite of objective functions, and we allow overlapping protected classes.\nOur fairness notion. 
We propose a model which extends the model of [19] to have ℓ ≥ 2 groups of people which are allowed to overlap. For each group i, we have two parameters βi, αi ∈ [0, 1]. Motivated by the DI doctrine, we deem a clustering solution fair if each cluster satisfies two properties: (a) restricted dominance (RD), which asserts that the fraction of people from group i in any cluster is at most αi, and (b) minority protection (MP), which asserts that the fraction of people from group i in any cluster is at least βi. Note that we allow the βi's and αi's to be arbitrary parameters, and furthermore, they can differ across different groups. This allows our model to provide a lot of flexibility to users. For instance, our model easily captures the notions defined by [19] and [50].\nWe allow our protected groups to overlap. Nevertheless, the quality of our solutions depends on the amount of overlap. We define Δ (similar to [15]) to be the maximum number of groups a single individual can be a part of. This parameter, as we argued above, is usually not 1, but can be assumed to be a small constant depending on the application.\nOur results. Despite the generality of our model, we show that, in a black-box fashion, we can get fair algorithms for any ℓp-norm objective (this includes k-center, k-median, and the widely used k-means objective) if we allow for very small additive violations to the fairness constraint. We show that given any ρ-approximation algorithm A for a given objective, which could be returning wildly unfair clusters, we can return a solution which is a (ρ + 2)-approximation to the best clustering which satisfies the fairness constraints (Theorem 1). Our solution, however, can violate both the RD and MP properties additively by 4Δ + 3. This is negligible if the clusters are large, and our empirical results show this almost never exceeds 3. 
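The (RD)/(MP) conditions and the additive violation just described are easy to audit directly. The following is a minimal sketch of such a check for possibly overlapping groups (the function name and interface are our own illustration, not the paper's code):

```python
from collections import defaultdict

def max_additive_violation(assignment, groups, alpha, beta):
    """Largest additive amount by which any cluster breaks (RD) or (MP).

    assignment: dict mapping point -> cluster id (the map phi)
    groups:     dict mapping group id -> set of points (groups may overlap)
    alpha/beta: dicts mapping group id -> upper/lower representation fractions
    Returns 0.0 for a clustering that is fair with no additive violation.
    """
    clusters = defaultdict(set)
    for v, f in assignment.items():
        clusters[f].add(v)

    worst = 0.0
    for members in clusters.values():
        for i, group_i in groups.items():
            in_cluster = len(members & group_i)
            # (RD): |group_i in cluster| <= alpha_i * |cluster|
            worst = max(worst, in_cluster - alpha[i] * len(members))
            # (MP): |group_i in cluster| >= beta_i * |cluster|
            worst = max(worst, beta[i] * len(members) - in_cluster)
    return worst
```

For instance, with two disjoint groups {0, 1} and {2, 3} and αi = βi = 1/2, putting all four points in one cluster is exactly fair, while splitting along group lines incurs an additive violation of 1.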
Further, in our experiments, our cost is at most 15% more than the optimum, which is a much better factor than (ρ + 2).\nThe black-box feature of our result is also useful in comparing the performance of any particular algorithm A. This helps if one wishes to justify the properties of an algorithm one might already be using. Our results can be interpreted as giving a way to convert any clustering algorithm to its fair version. Indeed, our method is very simple – we use the solution returned by A to define a fair assignment problem and show that this problem has a good optimal solution. The fair assignment problem is then solved via iterative rounding, which leads to the small additive violations. In the case of Δ = 1 (disjoint groups), we can get a simpler, one-iteration rounding algorithm.\nComparison with recent works. In a very recent independent and concurrent work, Schmidt et al. [51] consider the fair k-means problem in the streaming model with a notion of fairness similar to ours. However, their results crucially assume that the underlying metric space is Euclidean. Their main contributions are defining “fair coresets” and showing how to compute them in a streaming setting, resulting in a significant reduction of the input size. Although their coreset construction algorithm works with an arbitrary number of groups, their fair k-means algorithms assume there are only two disjoint groups of equal size. Even for this, Schmidt et al. [51] give a (5.5ρ + 1)-approximation, given any ρ-approximation for the vanilla k-means problem; the reader should compare with our (ρ + 2)-approximation. Backurs et al. [9] consider the problem of designing a scalable algorithm for the fair k-median problem in Euclidean space. Their notion of fairness is balance, as defined by Chierichetti et al. [19], and hence works only for two disjoint groups. 
Their approximation ratio is O_{r,b}(d log n), where r and b are fairness parameters and d is the dimension of the Euclidean space. In contrast, our fair k-means and k-median algorithms work in any metric space, with an arbitrary number of overlapping groups. Independently and concurrently, Ahmadian et al. [4] study the fair k-center problem with only the restricted dominance constraints, and Bercea et al. [10] consider a variety of clustering objectives in a fairness model that is similar to ours. Both of these works give similar, but arguably more complicated algorithms with theoretical guarantees similar to ours. In comparison, we emphasize a simple, yet powerful unifying framework that can handle any ℓp-norm objective. None of the above works handles overlapping groups.\n\n1.1 Other related works\n\nFairness in algorithm design has received a lot of attention lately [12, 45, 26, 28, 37, 57, 19, 36, 15, 56, 55, 14, 22, 40, 29]. Our work falls in the category of designing fair algorithms, and as mentioned, we concentrate on the notion of disparate impact. Feldman et al. [28] and Zafar et al. [56] study the fair classification problem under this notion. Celis et al. in [15], Celis et al. in [14], and Chierichetti et al. in [20] study, respectively, the fair ranking problem, the multiwinner voting problem, and the matroid optimization problem; all of these works model fairness through disparate impact. Chierichetti et al. [19] first addressed disparate impact for clustering problems in the presence of two groups; Rösner and Schmidt [50] generalized it to more than two groups.\nChen et al. [18] define a notion of proportionally fair clustering where all possible groups of reasonably large size are entitled to choose a center for themselves. This work builds on the assumption that sometimes the task of identifying the protected groups itself is untenable. Kleindessner et al. 
in [41] study the problem of enforcing fair representation in the data points chosen as cluster centers. This problem can also be posed as a matroid center problem. Kleindessner et al. in [42] extend the fairness notion to graph spectral clustering problems. Celis et al. in [13] propose a meta-algorithm for the classification problem under a large class of fairness constraints with respect to multiple non-disjoint protected groups.\nClustering is a ubiquitous problem and has been extensively studied in diverse communities (see [2] for a recent survey). We focus on the work done in the algorithms and optimization community for clustering problems under ℓp norms. The p ∈ {1, 2, ∞} norms, that is, the k-median, k-means, and k-center problems respectively, have been extensively studied. The k-center problem has a 2-approximation [31, 30] and it is NP-hard to do better [32]. A suite of algorithms [17, 35, 16, 8, 44] for the k-median problem has culminated in a 2.676-approximation [11], and it is still an active area of research. For k-means, the best algorithm is a (9 + ε)-approximation due to Ahmadian et al. [5]. For the general p-norm, most of the k-median algorithms imply a constant approximation.\n\n2 Preliminaries\n\nLet C be a set of points (whom we also call “clients”) we want to cluster. Let these points be embedded in a metric space (X, d). We let F ⊆ X be the set of possible cluster center locations (whom we also call “facilities”). Note F and C needn’t be disjoint, and indeed F could be equal to C. For a set S ⊆ X and a point x ∈ X, we use d(x, S) to denote min_{y∈S} d(x, y). For an integer n, we use [n] to denote the set {1, 2, . . . 
, n}.\nGiven the metric space (X, d) and an integer parameter k, in the VANILLA (k, p)-CLUSTERING problem the objective is to (a) “open” a subset S ⊆ F of at most k facilities, and (b) find an assignment φ : C → S of clients to open facilities so as to minimize Lp(S; φ) := (Σ_{v∈C} d(v, φ(v))^p)^{1/p}. Indeed, in this vanilla version with no fairness considerations, every point v ∈ C would be assigned to the closest center in S. The cases of p ∈ {1, 2, ∞}, the k-median, k-means, and k-center problems respectively, have been extensively studied in the literature [31, 30, 17, 35, 16, 8, 44, 11, 5]. Given an instance I of the VANILLA (k, p)-CLUSTERING problem, we use OPTvnll(I) to denote its optimal value.\nThe next definition formalizes the fair clustering problem which is the main focus of this paper.\nDefinition 1 (FAIR (k, p)-CLUSTERING Problem). In the fair version of the clustering problem, one is additionally given ℓ many (not necessarily disjoint) groups of C, namely C1, C2, . . . , Cℓ. We use Δ to denote the maximum number of groups a single client v ∈ C can belong to; so if the Cj’s were disjoint we would have Δ = 1. 
One is also given two fairness vectors ~α, ~β ∈ [0, 1]^ℓ.\nThe objective is to (a) open a subset of facilities S ⊆ F of at most k facilities, and (b) find an assignment φ : C → S of clients to the open facilities so as to minimize Lp(S; φ), where φ satisfies the following fairness constraints:\n\n|{v ∈ Ci : φ(v) = f}| ≤ αi · |{v ∈ C : φ(v) = f}|, ∀f ∈ S, ∀i ∈ [ℓ],   (RD)\n|{v ∈ Ci : φ(v) = f}| ≥ βi · |{v ∈ C : φ(v) = f}|, ∀f ∈ S, ∀i ∈ [ℓ].   (MP)\n\nThe assignment φ defines a cluster {v : φ(v) = f} around every open facility f ∈ S. As explained in the Introduction, eq. (RD) is the restricted dominance property which upper bounds the ratio of any group’s participation in a cluster, and eq. (MP) is the minority protection property which lower bounds this ratio to protect against under-representation. Due to these fairness constraints, we can no longer assume φ(v) is the nearest open facility in S to v. Indeed, we use the tuple (S, φ) to denote a fair-clustering solution.\nWe use OPTfair(I) to denote the optimal value of any instance I of the FAIR (k, p)-CLUSTERING problem. Since I is also an instance of the vanilla problem, and since every fair solution is also a vanilla solution (but not necessarily vice versa), we get OPTvnll(I) ≤ OPTfair(I) for any I.\n\nA fair clustering solution (S, φ) has λ-additive violation if the eq. (RD) and eq. (MP) constraints are satisfied up to ±λ violation. 
More precisely, for any f ∈ S and for any group i ∈ [ℓ], we have\n\nβi · |{v ∈ C : φ(v) = f}| − λ ≤ |{v ∈ Ci : φ(v) = f}| ≤ αi · |{v ∈ C : φ(v) = f}| + λ.   (V)\n\nOur main result is the following.\nTheorem 1. Given a ρ-approximate algorithm A for the VANILLA (k, p)-CLUSTERING problem, we can return a (ρ + 2)-approximate solution (S, φ) with (4Δ + 3)-additive violation for the FAIR (k, p)-CLUSTERING problem.\n\nIn particular, we get O(1)-factor approximations to the FAIR (k, p)-CLUSTERING problem with O(Δ) additive violation, for any ℓp norm. Furthermore, for the important special case of Δ = 1, our additive violation is at most +3.\n\n3 Algorithm for the FAIR (k, p)-CLUSTERING problem\n\nOur algorithm is a simple two-step procedure. First, we solve the VANILLA (k, p)-CLUSTERING problem using some algorithm A, and fix the centers S opened by A. Then, we solve a fair reassignment problem, called the FAIR p-ASSIGNMENT problem, on the same set of facilities to get an assignment φ. We return (S, φ) as our fair solution.\nDefinition 2 (FAIR p-ASSIGNMENT Problem). In this problem, we are given the original set of clients C and a set S ⊆ F with |S| = k. The objective is to find the assignment φ : C → S such that (a) the constraints eq. (RD) and eq. (MP) are satisfied, and (b) Lp(S; φ) is minimized among all such satisfying assignments.\n\nGiven an instance J of the FAIR p-ASSIGNMENT problem, we let OPTasgn(J) denote its optimum value. Clearly, given any instance I of the FAIR (k, p)-CLUSTERING problem, if S* is the optimal subset for I and J is the instance of FAIR p-ASSIGNMENT defined by S*, then OPTfair(I) = OPTasgn(J). 
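Schematically, the two-step procedure above reads as follows; `vanilla_solver` (algorithm A) and `fair_assignment` (algorithm B) are hypothetical black-box interfaces of our own illustration, not the authors' code:

```python
def fair_clustering(points, k, p, groups, alpha, beta,
                    vanilla_solver, fair_assignment):
    """Two-step reduction: fix centers with a vanilla solver, then reassign fairly.

    vanilla_solver(points, k, p) -> (centers S, assignment phi); only S is kept.
    fair_assignment(points, S, p, groups, alpha, beta) -> assignment phi_hat.
    If the vanilla solver is rho-approximate and the assignment step is
    lambda-violating, (S, phi_hat) is (rho + 2)-approximate with
    lambda-additive violation of the fairness constraints (Theorems 1 and 2).
    """
    S, _ = vanilla_solver(points, k, p)   # step 1: keep only the opened centers
    phi_hat = fair_assignment(points, S, p, groups, alpha, beta)  # step 2
    return S, phi_hat
```

Any clustering routine can be plugged in as `vanilla_solver`, which is what makes the reduction a way to convert an existing (possibly unfair) algorithm into its fair version.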
A λ-violating algorithm for the FAIR p-ASSIGNMENT problem is allowed to incur λ-additive violation of the fairness constraints.\n\n3.1 Reducing FAIR (k, p)-CLUSTERING to FAIR p-ASSIGNMENT\n\nIn this section we present a simple reduction from the FAIR (k, p)-CLUSTERING problem to the FAIR p-ASSIGNMENT problem that uses a VANILLA (k, p)-CLUSTERING solver as a black box.\nTheorem 2. Given a ρ-approximate algorithm A for the VANILLA (k, p)-CLUSTERING problem and a λ-violating algorithm B for the FAIR p-ASSIGNMENT problem, there is a (ρ + 2)-approximation algorithm for the FAIR (k, p)-CLUSTERING problem with λ-additive violation.\n\nProof. Given an instance I of the FAIR (k, p)-CLUSTERING problem, we run A on I to get a (not necessarily fair) solution (S, φ). We are guaranteed Lp(S; φ) ≤ ρ · OPTvnll(I) ≤ ρ · OPTfair(I). Let J be the instance of FAIR p-ASSIGNMENT obtained by taking S as the set of facilities. We run algorithm B on J to get a λ-violating solution φ̂. We return (S, φ̂).\nBy definition of λ-violating solutions, we get that (S, φ̂) satisfies eq. (V) and that Lp(S, φ̂) ≤ OPTasgn(J). The proof of the theorem follows from the lemma below.\n\nLemma 3. OPTasgn(J) ≤ (ρ + 2) · OPTfair(I).\n\nProof. Suppose the optimal solution of I is (S*, φ*) with Lp(S*; φ*) = OPTfair(I). Recall (S, φ) is the solution returned by the ρ-approximate algorithm A. We show the existence of an assignment φ′ : C → S such that φ′ satisfies eq. (RD) and eq. (MP), and Lp(S; φ′) ≤ (ρ + 2) · OPTfair(I). Since φ′ is a feasible solution of J, the lemma follows. For every f* ∈ S*, define nrst(f*) := arg min_{f∈S} d(f, f*) to be the closest facility in S to f*. 
For every client v ∈ C, define φ′(v) := nrst(φ*(v)). See Figure 6 in the supplementary material for an illustrative example. The following two claims prove the lemma.\n\nClaim 4. φ′ satisfies eq. (RD) and eq. (MP).\n\nProof. See Appendix B.\n\nClaim 5. Lp(S; φ′) ≤ (ρ + 2) OPTfair(I).\n\nProof. Fix a client v ∈ C. For the sake of brevity, let f = φ(v), f′ = φ′(v), and f* = φ*(v). We have\n\nd(v, f′) = d(v, nrst(f*)) ≤ d(v, f*) + d(f*, nrst(f*)) ≤ d(v, f*) + d(f*, f) ≤ 2d(v, f*) + d(v, f).\n\nThe first and third inequalities follow from the triangle inequality, while the second follows from the definition of nrst. Therefore, if we define the assignment cost vectors corresponding to φ, φ′, and φ* as ~d = {d(v, φ(v)) : v ∈ C}, ~d′ = {d(v, φ′(v)) : v ∈ C}, and ~d* = {d(v, φ*(v)) : v ∈ C} respectively, the above inequality implies ~d′ ≤ 2~d* + ~d. Now note that Lp is a monotone norm on these vectors, and therefore,\n\nLp(S; φ′) = Lp(~d′) ≤ 2Lp(~d*) + Lp(~d) = 2Lp(S*; φ*) + Lp(S; φ).\n\nThe proof is complete by noting Lp(S*; φ*) = OPTfair(I) and Lp(S; φ) ≤ ρ · OPTfair(I).\n\n3.2 Algorithm for the FAIR p-ASSIGNMENT problem\n\nTo complete the proof of Theorem 1, we need to give an algorithm for the FAIR p-ASSIGNMENT problem. We present this in Algorithm 1. The following theorem then establishes our main result.\nTheorem 6. There exists a (4Δ + 3)-violating algorithm for the FAIR p-ASSIGNMENT problem.\n\nProof. Fix an instance J of the problem. 
We start by writing a natural LP relaxation.¹\n\nLP := min Σ_{v∈C, f∈S} d(v, f)^p · x_{v,f}   (LP)\nsubject to\nβi · Σ_{v∈C} x_{v,f} ≤ Σ_{v∈Ci} x_{v,f} ≤ αi · Σ_{v∈C} x_{v,f}, ∀f ∈ S, ∀i ∈ [ℓ]   (1a)\nΣ_{f∈S} x_{v,f} = 1, ∀v ∈ C   (1b)\nx_{v,f} ∈ [0, 1], ∀v ∈ C, f ∈ S\n\nClaim 7. LP ≤ OPTasgn(J)^p.\n\n¹This makes sense only for finite p. See Remark 1.\n\nLet x* be an optimum solution to the above LP. Note that x* could have many fractional coordinates. In Algorithm 1, we iteratively round x* to an integral solution with the same or better value, but which violates the fairness constraints by at most 4Δ + 3. Our algorithm effectively simulates an algorithm for the minimum degree-bounded matroid basis problem (MBDMB henceforth) due to Király et al. [39]. In this problem one is given a matroid M = (X, I), costs on elements in X, a hypergraph H = (X, E), and functions f : E → R and g : E → R such that f(e) ≤ g(e) for all e ∈ E. The objective is to find the minimum cost basis B ⊆ X such that for all e ∈ E, f(e) ≤ |B ∩ e| ≤ g(e). We now state the main result of Király et al. [39].\n\nTheorem 8 (Paraphrasing of Theorem 1 in [39]). There exists a polynomial time algorithm that outputs a basis B of cost at most OPT, such that f(e) − 2ΔH + 1 ≤ |B ∩ e| ≤ g(e) + 2ΔH − 1 for each edge e ∈ E of the hypergraph, where ΔH = max_{v∈X} |{e ∈ E : v ∈ e}| is the maximum degree of a vertex in the hypergraph H, and OPT is the cost of the natural LP relaxation.\n\nAlgorithm 1 Algorithm for the FAIR p-ASSIGNMENT problem\n1: procedure FAIRASSIGNMENT((X, d), S, C = ∪_{i=1}^ℓ Ci, ~α, ~β ∈ [0, 1]^ℓ)\n2:   φ̂(v) = ∅ for all v ∈ C\n3:   solve the LP given in eq. (1), let x* be an optimal solution\n4:   for each x*_{v,f} = 1, set φ̂(v) = f and remove v from C (and the relevant Ci's)\n5:   let Tf := Σ_{v∈C} x*_{v,f} for all f ∈ S\n6:   let Tf,i := Σ_{v∈Ci} x*_{v,f} for all i ∈ [ℓ] and f ∈ S\n7:   construct LP2 as given in eq. (2), only with variables x_{v,f} such that x*_{v,f} > 0\n8:   while there exists a v ∈ C such that φ̂(v) = ∅ do\n9:     solve LP2, let x* be an optimal solution\n10:    for each x*_{v,f} = 0, delete the variable x_{v,f} from LP2\n11:    for each x*_{v,f} = 1, set φ̂(v) = f and remove v from C (and the relevant Ci's); reduce Tf and the relevant Tf,i's by 1\n12:    for every i ∈ [ℓ] and f ∈ S, if |{x_{v,f} : 0 < x*_{v,f} < 1, v ∈ Ci}| ≤ 2(Δ + 1), remove the respective constraint in eq. (2c)\n13:    for every f ∈ S, if |{x_{v,f} : 0 < x*_{v,f} < 1, v ∈ C}| ≤ 2(Δ + 1), remove the respective constraint in eq. (2b)\n\nLP2 := min Σ_{v∈C, f∈S} d(v, f)^p · x_{v,f}\nsubject to\nx_{v,f} ∈ [0, 1], ∀v ∈ C, f ∈ S   (2a)\n⌊Tf⌋ ≤ Σ_{v∈C} x_{v,f} ≤ ⌈Tf⌉, ∀f ∈ S   (2b)\n⌊Tf,i⌋ ≤ Σ_{v∈Ci} x_{v,f} ≤ ⌈Tf,i⌉, ∀f ∈ S, ∀i ∈ [ℓ]   (2c)\nΣ_{f∈S} x_{v,f} = 1, ∀v ∈ C   (2d)\n\nHowever, rather than posing our problem as an MBDMB instance, we write a natural LP relaxation more suitable to the task; this is given in eq. (2). The proof of Theorem 6 is completed by drawing a parallel to the Király et al. [39] analysis for the MBDMB problem; the details are in Appendix B.\n\nRemark 1. For the case of p = ∞, the objective function of eq. (LP) doesn't make sense. Instead, one proceeds as follows. 
We begin with a guess G of OPTasgn(J); we set x_{v,f} = 0 for all pairs with d(v, f) > G. We then check if eqs. (1a) and (1b) have a feasible solution. If they do not, then our guess G is infeasible (too small). If they do, then the proof given above returns an assignment which violates eqs. (RD) and (MP) by additive 4Δ + 3, and satisfies d(v, φ(v)) ≤ G for all v ∈ C.\nRemark 2. When Δ = 1, that is, the Ci's are disjoint, we can get an improved +3 additive violation (instead of +7). Instead of using Theorem 8, we use the generalized assignment problem (GAP) rounding technique by Shmoys and Tardos [52] to achieve this.\n\nRemark 3. Is having a bicriteria approximation necessary? We do not know. The nub is the FAIR p-ASSIGNMENT problem. It is not hard to show that deciding whether a λ-violating solution exists with λ = 0 under the given definition is NP-hard.² However, an algorithm with λ = 0 and cost within a constant factor of OPTasgn(J) is not ruled out. This is an interesting open question.\n\n4 Experiments\n\nIn this section, we perform an empirical evaluation of our algorithm. We implement our algorithm in Python 3.6 and run all our experiments on a MacBook Air with a 1.8 GHz Intel Core i5 processor and 8 GB of 1600 MHz DDR3 memory. We use CPLEX [34] for solving LPs. Based on our experiments, we report four key findings: (1) Vanilla clustering algorithms are quite unfair even when measured against relaxed settings of α and β. In contrast, our algorithm's additive violation is almost always less than 3, even with Δ = 2, across a wide range of parameters (see fig. 1 and fig. 2). (2) The cost of our fair clustering is close to the (unfair) vanilla cost for k ≤ 10 as in fig. 3. In fact, we see (in fig. 7 in Appendix C.3) that our algorithm's cost is very close to the absolute best fair clustering that allows additive violations! 
Furthermore, our results for k-median significantly improve over the costs reported in Chierichetti et al. [19] and Backurs et al. [9] (see Table 1). (3) For the case of overlapping protected groups (Δ > 1), enforcing fairness with respect to one sensitive attribute (say gender) can lead to unfairness with respect to another (say race). This empirical evidence stresses the importance of considering Δ > 1 (see fig. 2 in Appendix C.4). (4) Finally, we study how the cost of our fair clustering algorithm changes with the strictness of the fairness conditions. This enables the user to figure out the trade-offs between fairness and utility and make an informed decision about which threshold to choose (see Appendix C.5).\nDatasets. We use five datasets from the UCI repository [25]:³ (1) bank [54] with 4,521 points, corresponding to phone calls from a marketing campaign by a Portuguese banking institution. (2) census [43] with 32,561 points, representing information about individuals extracted from the 1994 US census. (3) diabetes [53] with 101,766 points, extracted from diabetes patient records. (4) creditcard [33] with 30,000 points, related to information on credit card holders of a certain credit card in Taiwan. (5) census1990 [47] with 2,458,285 points, taken from the 1990 US census, which we use for run time analysis. For each of the datasets, we select a set of numerical attributes to represent the records in the Euclidean space. We also choose two sensitive attributes for each dataset (e.g. sex and race for census) and create protected groups based on their values. Appendix C.1 contains a more detailed description of the datasets and our features.\nMeasurements. For any clustering, we mainly focus on two metrics. One is the cost of fairness, that is, the ratio of the objective value of the fair clustering over that of the vanilla clustering. The other is balance, the measure of unfairness. 
To define balance, we generalize the notion found in Chierichetti et al. [19]. We define two intermediate values: ri, the representation of group i in the dataset, and ri(f), the representation of group i in cluster f, as ri := |Ci|/|C| and ri(f) := |Ci(f)|/|C(f)|. Using these two values, balance is defined as balance(f) := min_{i∈[ℓ]} min{ri/ri(f), ri(f)/ri}. Although in theory the values of α, β for a given group i can be set arbitrarily, in practice they are best set with respect to ri, the ratio of the group in the dataset. Furthermore, to reduce the degrees of freedom, we parameterize β and α by a single variable δ such that βi = ri(1 − δ) and αi = ri/(1 − δ). Thus, we can interpret δ as how loose our fairness condition is. This is because δ = 0 corresponds to each group in each cluster having exactly the same ratio as that group in the dataset, and δ = 1 corresponds to no fairness constraints at all. For all of the experiments, we set δ = 0.2 (corresponding to the common interpretation of the 80% rule of the DI doctrine), and use Δ = 2, unless otherwise specified.\nAlgorithms. For vanilla k-center, we use a 2-approximation algorithm due to Gonzalez [30]. For vanilla k-median, we use the single-swap 5-approximation algorithm by Arya et al. [8], augment it with the D²-sampling procedure by [7] for initial center selection, and take the best out of 5 trials. For k-means, we use the k-means++ implementation of [48].\nFairness comparison with vanilla clustering. In fig. 1 we motivate our discussion of fairness by demonstrating the unfairness of vanilla clustering and the fairness of our algorithm. 
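The balance metric and the δ-parameterization of (βi, αi) above can be sketched as follows (the helper names are our own, not from the paper's implementation):

```python
def fairness_params(r_i, delta):
    """Map a group's dataset ratio r_i and looseness delta in [0, 1) to
    (beta_i, alpha_i) = (r_i * (1 - delta), r_i / (1 - delta))."""
    return r_i * (1 - delta), r_i / (1 - delta)

def balance(cluster, dataset, groups):
    """min over groups i of min{r_i / r_i(f), r_i(f) / r_i}: 1 means every
    group's ratio in the cluster matches its dataset ratio; 0 is maximally unfair."""
    b = 1.0
    for group_i in groups:
        r = len(group_i & dataset) / len(dataset)    # r_i
        r_f = len(group_i & cluster) / len(cluster)  # r_i(f)
        if r == 0 or r_f == 0:
            return 0.0
        b = min(b, r / r_f, r_f / r)
    return b
```

With δ = 0.2 and a group occupying half the dataset, `fairness_params(0.5, 0.2)` yields (βi, αi) = (0.4, 0.625), matching the 80%-rule reading of δ.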
On the x-axis, we compare three solutions: (1) our algorithm (labelled “ALG”), (2) the fractional solution to the FAIR p-ASSIGNMENT LP in Equation (1) (labelled “Partial”), and (3) vanilla k-means (labelled “VC”).\n\n²A simple reduction from the 3D-matching problem.\n³https://archive.ics.uci.edu/ml/datasets/\n\nBelow these labels, we record the cost of fairness. We set δ = 0.2 and k = 4. Along the y-axis, we plot the balance metric defined above for the three largest clusters of each of these clusterings. The dotted line at 0.8 is the goal balance for δ = 0.2. The lowest balance for any cluster for our algorithm is 0.75 (for census), whereas vanilla can be as bad as 0 (for bank); “Partial” is, of course, always fair (at least 0.8). We observe that the maximum additive violation of our algorithm is only 3 (much better than our theoretical bound of 4Δ + 3), for a large range of values of δ and k, whereas vanilla k-means can be unfair by quite a large margin (see fig. 2 below and Table 3 in Appendix C.2).\n\nFigure 1: Comparison of our algorithm (ALG) versus vanilla clustering (VC) in terms of balance for the k-means objective.\n\nFigure 2: Comparison of the maximum additive violation (for δ = 0.2 and Δ = 2) over all clusters and all groups between our algorithm (ALG) and vanilla (VC), using the k-means objective.\n\nCost analysis. We evaluate the cost of our algorithm for the k-means objective with respect to the vanilla clustering cost. Figure 3 shows that the cost of our algorithm for k ≤ 10 is at most 15% more than the vanilla cost on bank, census, and creditcard. 
Interestingly, for creditcard, even though the vanilla solution is extremely unfair as demonstrated earlier, the cost of fairness is at most 6%, which indicates that the vanilla centers are already in the "right place".

Figure 3: Our algorithm's cost (ALG) versus the vanilla clustering cost (VC) for the k-means objective.

Our results in Table 1 confirm that we outperform [19] and [9] in terms of cost. To match [19] and [9], we sub-sample bank, census, and diabetes to 1000, 600, and 1000 points respectively, declare only one sensitive attribute for each (i.e. marital for bank, sex for census, and gender for diabetes), and tune the fairness parameters to enforce a balance of 0.5. The data in Table 1 for [9] is the output of their code, and the numbers for [19] are drawn from their plots.

Table 1: Comparison of our clustering cost with [9] and [19] for k-median with varying k.

k                      3      4      5      6      7      8      9     10
census (cost ×10⁻⁶)
  Ours             19.55  16.63  14.35  11.75   9.86   8.87   7.75   7.32
  [9]              28.29  28.57  26.31  22.21  24.81  26.94  20.80  23.60
  [19]             40     39     38.5   38     37.8   37.75  37.6   37.5
bank (cost ×10⁻⁵)
  Ours              6.81   5.64   4.95   4.49   4.05   3.79   3.53   3.44
  [9]               8.05   7.78   7.65   6.63   6.33   6.68   5.42   6.70
  [19]              5.9    5.8    5.77   5.75   5.7    5.65   5.62   5.6
diabetes (cost)
  Ours              6675   5491   3890   3371   3194   2939   2700   2380
  [9]               7756   6412   5526   4746   4850   4765   4203   4337
  [19]             11500  10300  10250  10200  10175  10150  10125  10100

Run time analysis. In this paper, we focus on providing a framework and do not emphasize run-time optimization. Nevertheless, we note that our algorithm for the k-means objective finds a fair solution for the census1990 dataset with 500K points and 13 features in less than 30 minutes (see Table 2). Even though our approach is based on the iterative rounding method, in practice the CPLEX solution to the LP (eq.
(1)) is more than 99% integral in each of our experiments. Hence, we never have to solve more than two or three LPs. Moreover, the number of variables in subsequent LPs is significantly smaller. In contrast, if we attempt to frame LP (eq. (1)) as an integer program instead, the CPLEX solver fails to find a solution in under an hour even with 40K points.

Table 2: Runtime of our algorithm on subsampled data from census1990 for k-means (k = 3).

Number of sampled points    10K    50K    100K     200K     300K      400K      500K
Time (sec)                 4.04  33.35   91.15   248.11   714.73   1202.89   1776.51

References

[1] Supreme Court of the United States. Griggs v. Duke Power Co. 401 U.S. 424, March 8, 1971.

[2] Charu C. Aggarwal and Chandan K. Reddy. Data Clustering: Algorithms and Applications. CRC Press, 2013.

[3] Herman Aguinis and Wayne F. Cascio. Applied Psychology in Human Resource Management (6th Edition). Prentice Hall, 2005.

[4] Sara Ahmadian, Alessandro Epasto, Ravi Kumar, and Mohammad Mahdian. Clustering without over-representation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 267–275, 2019.

[5] Sara Ahmadian, Ashkan Norouzi-Fard, Ola Svensson, and Justin Ward. Better guarantees for k-means and Euclidean k-median by primal-dual algorithms. In Annual IEEE Symposium on Foundations of Computer Science, 2017.

[6] J. Angwin, J. Larson, S. Mattu, and L. Kirchner. Machine bias: There's software used across the country to predict future criminals. And it's biased against blacks. ProPublica, May 23, 2016.

[7] David Arthur and Sergei Vassilvitskii. k-means++: The advantages of careful seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2007.

[8] Vijay Arya, Naveen Garg, Rohit Khandekar, Adam Meyerson, Kamesh Munagala, and Vinayaka Pandit. Local search heuristics for k-median and facility location problems.
SIAM Journal on Computing, 33(3):544–562, 2004.

[9] Arturs Backurs, Piotr Indyk, Krzysztof Onak, Baruch Schieber, Ali Vakilian, and Tal Wagner. Scalable fair clustering. In Proc. 36th International Conference on Machine Learning (ICML), pages 405–413, 2019.

[10] Ioana O. Bercea, Martin Groß, Samir Khuller, Aounon Kumar, Clemens Rösner, Daniel R. Schmidt, and Melanie Schmidt. On the cost of essentially fair clusterings. CoRR, abs/1811.10319, 2018.

[11] Jarosław Byrka, Thomas Pensyl, Bartosz Rybicki, Aravind Srinivasan, and Khoa Trinh. An improved approximation for k-median, and positive correlation in budgeted optimization. In Annual ACM-SIAM Symposium on Discrete Algorithms, 2014.

[12] Toon Calders and Sicco Verwer. Three naive Bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery, 21(2):277–292, 2010.

[13] L. Elisa Celis, Lingxiao Huang, Vijay Keswani, and Nisheeth K. Vishnoi. Classification with fairness constraints: A meta-algorithm with provable guarantees. In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* '19, 2019.

[14] L. Elisa Celis, Lingxiao Huang, and Nisheeth K. Vishnoi. Multiwinner voting with fairness constraints. In IJCAI, pages 144–151, 2018.

[15] L. Elisa Celis, Damian Straszak, and Nisheeth K. Vishnoi. Ranking with fairness constraints. In Proc. 45th International Colloquium on Automata, Languages and Programming, pages 28:1–28:15, 2018.

[16] Moses Charikar and Sudipto Guha. Improved combinatorial algorithms for the facility location and k-median problems. In Annual IEEE Symposium on Foundations of Computer Science, 1999.

[17] Moses Charikar, Sudipto Guha, Éva Tardos, and David B. Shmoys. A constant-factor approximation algorithm for the k-median problem. J. Comput. Syst.
Sci., 65(1):129–149, 2002.

[18] Xingyu Chen, Brandon Fain, Charles Lyu, and Kamesh Munagala. Proportionally fair clustering. In Proc. 36th International Conference on Machine Learning (ICML), June 2019.

[19] Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, and Sergei Vassilvitskii. Fair clustering through fairlets. In Proc. 31st Conference on Neural Information Processing Systems, pages 5029–5037, 2017.

[20] Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, and Sergei Vassilvitskii. Matroids, matchings, and fairness. In Proceedings of Machine Learning Research, volume 89, 2019.

[21] Alexandra Chouldechova. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, 5(2):153–163, 2017.

[22] Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq. Algorithmic decision making and the cost of fairness. In Proc. 23rd Annual SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 797–806. ACM, 2017.

[23] Jerry Dischler. Putting machine learning into the hands of every advertiser. https://www.blog.google/technology/ads/machine-learning-hands-advertisers/, July 10, 2018.

[24] Julia Dressel and Hany Farid. The accuracy, fairness, and limits of predicting recidivism. Science Advances, 4, 2018.

[25] Dheeru Dua and Casey Graff. UCI machine learning repository, 2017.

[26] Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. Fairness through awareness. In Proc. 3rd Conference on Innovations in Theoretical Computer Science, pages 214–226. ACM, 2012.

[27] The U.S. EEOC. Uniform guidelines on employee selection procedures, March 2, 1979.

[28] Michael Feldman, Sorelle A. Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. Certifying and removing disparate impact. In Proc.
21st Annual SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 259–268, 2015.

[29] Sorelle A. Friedler, Carlos Scheidegger, and Suresh Venkatasubramanian. On the (im)possibility of fairness. CoRR, abs/1609.07236, 2016.

[30] Teofilo F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theor. Comput. Sci., 38:293–306, 1985.

[31] Dorit S. Hochbaum and David B. Shmoys. A best possible heuristic for the k-center problem. Math. Oper. Res., 10(2):180–184, 1985.

[32] Wen-Lian Hsu and George L. Nemhauser. Easy and hard bottleneck location problems. Discrete Applied Mathematics, 1(3):209–215, 1979.

[33] I-Cheng Yeh and Che-hui Lien. The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients. Expert Systems with Applications, 2009.

[34] IBM. IBM ILOG CPLEX 12.9. 2019.

[35] Kamal Jain and Vijay V. Vazirani. Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and Lagrangian relaxation. J. ACM, 48(2):274–296, 2001.

[36] Matthew Joseph, Michael Kearns, Jamie H. Morgenstern, and Aaron Roth. Fairness in learning: Classic and contextual bandits. In Conference on Neural Information Processing Systems, pages 325–333, 2016.

[37] Toshihiro Kamishima, Shotaro Akaho, Hideki Asoh, and Jun Sakuma. Fairness-aware classifier with prejudice remover regularizer. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 35–50, 2012.

[38] Amir E. Khandani, Adlar J. Kim, and Andrew W. Lo. Consumer credit-risk models via machine-learning algorithms. Journal of Banking & Finance, 34, 2010.

[39] Tamás Király, Lap Chi Lau, and Mohit Singh. Degree bounded matroids and submodular flows. Combinatorica, 32(6):703–720, 2012.

[40] Jon M. Kleinberg, Sendhil Mullainathan, and Manish Raghavan.
Inherent trade-offs in the fair determination of risk scores. In Proc. 8th Conference on Innovations in Theoretical Computer Science, pages 43:1–43:23, 2017.

[41] Matthäus Kleindessner, Pranjal Awasthi, and Jamie Morgenstern. Fair k-center clustering for data summarization. In Proc. 36th International Conference on Machine Learning (ICML), June 2019.

[42] Matthäus Kleindessner, Samira Samadi, Pranjal Awasthi, and Jamie Morgenstern. Guarantees for spectral clustering with fairness constraints. In Proc. 36th International Conference on Machine Learning (ICML), June 2019.

[43] Ron Kohavi. Scaling up the accuracy of naive-Bayes classifiers: A decision-tree hybrid. In Annual SIGKDD International Conference on Knowledge Discovery and Data Mining, 1996.

[44] Shi Li and Ola Svensson. Approximating k-median via pseudo-approximation. SIAM J. Comput., 45(2):530–547, 2016.

[45] Binh Thanh Luong, Salvatore Ruggieri, and Franco Turini. k-NN as an implementation of situation testing for discrimination discovery and prevention. In Proc. 17th Annual SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 502–510, 2011.

[46] Rashmi Malhotra and Davinder K. Malhotra. Evaluating consumer loans using neural networks. Omega, 31, 2003.

[47] Christopher Meek, Bo Thiesson, and David Heckerman. The learning-curve sampling method applied to model-based clustering. Journal of Machine Learning Research, 2:397, 2002.

[48] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 2011.

[49] Claudia Perlich, Brian Dalessandro, Troy Raeder, Ori Stitelman, and Foster Provost.
Machine learning for targeted display advertising: Transfer learning in action. Machine Learning, 95, 2014.

[50] Clemens Rösner and Melanie Schmidt. Privacy preserving clustering with constraints. In Proc. 45th International Colloquium on Automata, Languages and Programming, pages 96:1–96:14, 2018.

[51] Melanie Schmidt, Chris Schwiegelshohn, and Christian Sohler. Fair coresets and streaming algorithms for fair k-means clustering. arXiv preprint arXiv:1812.10854, 2018.

[52] David B. Shmoys and Éva Tardos. An approximation algorithm for the generalized assignment problem. Mathematical Programming, 62(1-3):461–474, 1993.

[53] Beata Strack, Jonathan P. DeShazo, Chris Gennings, Juan L. Olmo, Sebastian Ventura, Krzysztof J. Cios, and John N. Clore. Impact of HbA1c measurement on hospital readmission rates: Analysis of 70,000 clinical database patient records. BioMed Research International, 2014, 2014.

[54] Sérgio Moro, Paulo Cortez, and Paulo Rita. A data-driven approach to predict the success of bank telemarketing. Decision Support Systems, 2014.

[55] Ke Yang and Julia Stoyanovich. Measuring fairness in ranked outputs. In Proc. 29th International Conference on Scientific and Statistical Database Management, page 22. ACM, 2017.

[56] Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez-Rodriguez, and Krishna P. Gummadi. Fairness constraints: Mechanisms for fair classification. In Proc. 20th International Conference on Artificial Intelligence and Statistics (AISTATS), pages 962–970, 2017.

[57] Rich Zemel, Yu Wu, Kevin Swersky, Toni Pitassi, and Cynthia Dwork. Learning fair representations.
In International Conference on Machine Learning (ICML), pages 325–333, 2013.