Blossom Metevier, Stephen Giguere, Sarah Brockman, Ari Kobren, Yuriy Brun, Emma Brunskill, Philip S. Thomas
We present RobinHood, an ofﬂine contextual bandit algorithm designed to satisfy a broad family of fairness constraints. Our algorithm accepts multiple fairness deﬁnitions and allows users to construct their own unique fairness deﬁnitions for the problem at hand. We provide a theoretical analysis of RobinHood, which includes a proof that it will not return an unfair solution with probability greater than a user-speciﬁed threshold. We validate our algorithm on three applications: a tutoring system in which we conduct a user study and consider multiple unique fairness deﬁnitions; a loan approval setting (using the Statlog German credit data set) in which well-known fairness deﬁnitions are applied; and criminal recidivism (using data released by ProPublica). In each setting, our algorithm is able to produce fair policies that achieve performance competitive with other ofﬂine and online contextual bandit algorithms.