{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# The monotonicity axiom implies the symmetry axiom for Shapley values\n",
    "\n",
    "The <a href=\"https://en.wikipedia.org/wiki/Shapley_value\">Shapley values</a> are a well known way in coalitional game theory of assigning credit for an outcome among a set of players. Since any arbitrary mapping between player sets and outcomes is possible, how to assign individual credit can be unclear. The Shapley values are one such way of assigning credit and, importantly, they are the only way that satifies some very basic and desirable properties. Typically these properties are given as four axioms, where the Shapley values are the only credit assignment method that satisfies all four axioms:\n",
    "\n",
    "1. **Efficiency**\n",
    "\\begin{equation}\n",
    "f(x) = \\sum_{i=0}^M \\phi_i\n",
    "\\end{equation}\n",
    "This assumption forces the model to correctly capture the original predicted value.\n",
    "\n",
    "2. **Symmetry**.  Let $1_S \\in \\{0,1\\}^M$ be an indicator vector equal to $1$ for indexes $i \\in S$, and $0$ elsewhere, and let $f_x(S) = f(h_x^{-1}(1_S))$. If for all subsets $S$ that do not contain $i$ or $j$\n",
    "\\begin{equation}\n",
    "f_x(S \\cup \\{i\\}) = f_x(S \\cup \\{j\\})\n",
    "\\end{equation}\n",
    "then $\\phi_i(f,x) = \\phi_j(f,x)$. This states that if two features  contribute equally to the model then their effects must be the same.\n",
    "\n",
    "3. **Null effects**.  If for all subsets $S$ that do not contain $i$\n",
    "\\begin{equation}\n",
    "f_x(S \\cup \\{i\\}) = f_x(S)\n",
    "\\end{equation}\n",
    "then $\\phi_i(f,x) = 0$. A feature ignored by the model must have an effect of $0$.\n",
    "\n",
    "4. **Linearity**.  For any two models $f$ and $f'$\n",
    "\\begin{equation}\n",
    "\\phi_i(f+f',x) = \\phi_i(f,x) + \\phi_i(f',x).\n",
    "\\end{equation}\n",
    "This states that the effect a feature has on the sum of two functions is the effect is has on one function plus the effect it has on the other.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Young showed in 1985 that linearity and null effects can be eliminated using a monotonicity axiom\n",
    "\n",
    "**Monotonicity** For any two model functions $f$ and $f'$ if for all subsets $S$ of the simplified input features $Z$ that do not contain $i$\n",
    "\\begin{equation}\n",
    "f_x(S \\cup \\{i\\}) - f_x(S) \\ge f'_x(S \\cup \\{i\\}) - f'_x(S)\n",
    "\\end{equation}\n",
    "then\n",
    "\\begin{equation}\n",
    "\\phi_i(f,x) \\ge \\phi_i(f',x)\n",
    "\\end{equation}\n",
    "\n",
    "## Here we show how the symmetry axiom is also implied by the monotonicity axiom for models\n",
    "\n",
    "Assume that $f'$ is the same as $f$ except the inputs $i$ and $j$ are swapped. The means for all subsets $S$ that do not contain $i$ or $j$ that $f'_x(S \\cup \\{i\\}) = f_x(S \\cup \\{j\\})$ and $f'_x(S) = f_x(S)$. If $S$ does contain $j$ then $f'_x(S\\setminus \\{j\\} \\cup \\{i,j\\}) = f_x(S\\setminus \\{j\\} \\cup \\{j,i\\})$, which is garunteed to hold so we can ignore it in the implication below. Starting with the monotonicity axiom\n",
    "\\begin{equation}\n",
    "\\forall_{S \\subset Z \\setminus \\{i\\}} ~~~ f_x(S \\cup \\{i\\}) - f_x(S) \\ge f'_x(S \\cup \\{i\\}) - f'_x(S) \\implies \\phi_i(f,x) \\ge \\phi_i(f',x)\n",
    "\\end{equation}\n",
    "can be transformed into \n",
    "\\begin{equation}\n",
    "\\forall_{S \\subset Z \\setminus \\{i,j\\}} ~~~ f_x(S \\cup \\{i\\}) \\ge f_x(S \\cup \\{j\\}) \\implies \\phi_i(f,x) \\ge \\phi_i(f',x)\n",
    "\\end{equation}\n",
    "by using $f'_x(S \\cup \\{i\\}) = f_x(S \\cup \\{j\\})$, $f'_x(S) = f_x(S)$, and ignoring the terms that include $j$.\n",
    "Swapping $i$ and $j$ and then repreating the process shows that\n",
    "\\begin{equation}\n",
    "\\forall_{S \\subset Z \\setminus \\{i,j\\}} ~~~ f_x(S \\cup \\{i\\}) = f_x(S \\cup \\{j\\}) \\implies \\phi_i(f,x) = \\phi_j(f,x)\n",
    "\\end{equation}\n",
    "which is the symmetry axiom. This shows that we only need efficiency and monotonicity to uniquely constrain ourselves to using the Shapley values."
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Julia 0.5.0",
   "language": "julia",
   "name": "julia-0.5"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "0.5.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
