{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Evaluating Attacks and Defenses with `mister_ed`\n",
    "This file will contain code snippets for how to quickly iterate through effectiveness of attacks against (trained) networks. It's highly recommended that you have walked through tutorials 1 and 2 prior to this one. \n",
    "\n",
    "As usual, the first thing we'll want to do is import everything."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# EXTERNAL LIBRARY IMPORTS\n",
    "import numpy as np \n",
    "import scipy \n",
    "\n",
    "import torch # Need torch version >=0.3\n",
    "import torch.nn as nn \n",
    "import torch.optim as optim \n",
    "assert float(torch.__version__[:3]) >= 0.3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# MISTER ED SPECIFIC IMPORT BLOCK\n",
    "# (here we do things so relative imports work )\n",
    "# Universal import block \n",
    "# Block to get the relative imports working \n",
    "import os\n",
    "import sys \n",
    "module_path = os.path.abspath(os.path.join('..'))\n",
    "if module_path not in sys.path:\n",
    "    sys.path.append(module_path)\n",
    "\n",
    "\n",
    "import config\n",
    "import prebuilt_loss_functions as plf\n",
    "import loss_functions as lf \n",
    "import utils.pytorch_utils as utils\n",
    "import utils.image_utils as img_utils\n",
    "import cifar10.cifar_loader as cifar_loader\n",
    "import cifar10.cifar_resnets as cifar_resnets\n",
    "import adversarial_training as advtrain\n",
    "import adversarial_evaluation as adveval\n",
    "import utils.checkpoints as checkpoints\n",
    "import adversarial_perturbations as ap \n",
    "import adversarial_attacks as aa\n",
    "import spatial_transformers as st\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this file we'll be looking at the techniques we'll use to evaluate both attacks and defenses. In general, the task we want to solve is this: we have a classifier trained on a dataset and wish to evaluate its accuracy against unperturbed inputs as well as various properties of an adversarial attack that has gradient access to this classifier. \n",
    "\n",
    "Recall that an adversarial attack here has many degrees of freedom we can choose:\n",
    "- Threat model: $\\ell_p$-bounded noise, rotations, translations, flow, any combination of the above\n",
    "- Bounds for the threat model\n",
    "- Attack technique: PGD, FGSM, Carlini-Wagner\n",
    "- Attack parameters: number of iterations, step size, loss functions, etc\n",
    "\n",
    "And we can choose to evaluate several properties of each attack on a network: \n",
    "- Top-k accuracy \n",
    "- Average loss value of successful attacks (i.e. average loss value for examples in which the attack causes the index of the maximum logit to change)\n",
    "- The generated adversarial images \n",
    "- Average distance (say according to a custom function) of generated adversarial images to their originals\n",
    "\n",
    "All we'll be doing in this file is walking through an example of how to build objects to perform evaluations of (some of) these properties on a medley of attacks. \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Building an AdversarialEvaluationObject"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<img src=\"images/adversarial_evaluation.png\",width=60,height=60>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%html\n",
    "<img src=\"images/adversarial_evaluation.png\",width=60,height=60>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The above image describes the general workflow: \n",
    "First we initialize an `AdversarialEvaluation` instance which keeps track of which classifier we're evaluating against, as well as the normalizer (which recall just performs some operations on raw-data to make it classifier-friendly). This instance will have an `evaluate_ensemble` method which needs as arguments a DataLoader and a dictionary, called the `attack_ensemble`, that contains the attacks (which are wrapped up in `EvaluationResult` instances). This method will output a dictionary that points to the same `EvaluationResult` objects which now have the result data stored in them. Unless otherwise specified, we'll also evaluate the ground accuracy of the classifier and include that in the return-value as well.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's go ahead and build up everything except the `EvaluationResult` objects and proceed from there.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Files already downloaded and verified\n"
     ]
    }
   ],
   "source": [
    "# Load the trained model and normalizer\n",
    "model, normalizer = cifar_loader.load_pretrained_cifar_resnet(flavor=20, return_normalizer=True) \n",
    "\n",
    "# Load the evaluation dataset \n",
    "cifar_valset = cifar_loader.load_cifar_data('val') \n",
    "\n",
    "# Put this into the AdversarialEvaluation object\n",
    "adv_eval_object = adveval.AdversarialEvaluation(model, normalizer)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Building an Attack Ensemble"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Recall in tutorial_1 we built `AdversarialAttack` objects and used their `.attack(...)` methods to generate adversarial perturbations, where the keyword arguments to `.attack(...)` described the parameters of the attack.\n",
    "\n",
    "And then in tutorial_2 we build `AdversarialAttackParameters` objects which is a wrapper to hold an `AdversarialAttack` object and the kwargs that described the parameters of the attack. We used this to generate attacks inside the training loop to perform adversarial training.\n",
    "\n",
    "And finally, in this tutorial we'll build `EvaluationResult` objects which hold an `AdversarialAttackParameters` object and a dictionary storing some information about what we'll evaluate.\n",
    "\n",
    "The following image summarizes the data structures we've built (the bullet points refer to the arguments needed upon construction)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<img src=\"images/evaluationResult_ds.png\",width=60,height=60>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%html \n",
    "<img src=\"images/evaluationResult_ds.png\",width=60,height=60>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this worked example, we'll build 3 different evaluation results and evaluate them simultaneously:\n",
    "- **FGSM8**: An additive noise attack, with $\\ell_\\infty$ bound of 8.0, attacked using FGSM \n",
    "- **PGD4**: An additive noise attack, with $\\ell_\\infty$ bound of 4.0, attacked using PGD \n",
    "- **PGD8**: An additive noise attack, with $\\ell_\\infty$ bound of 8.0, attacked using PGD"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# First let's build the attack parameters for each.\n",
    "# Note: we're not doing anything new yet. These constructions are covered in the first two tutorials\n",
    "\n",
    "# we'll reuse the loss function:\n",
    "attack_loss = plf.VanillaXentropy(model, normalizer)\n",
    "linf_8_threat = ap.ThreatModel(ap.DeltaAddition, {'lp_style': 'inf', \n",
    "                                                 'lp_bound': 8.0 / 255.0})\n",
    "linf_4_threat = ap.ThreatModel(ap.DeltaAddition, {'lp_style': 'inf', \n",
    "                                                  'lp_bound': 4.0 / 255.0})\n",
    "\n",
    "\n",
    "#------ FGSM8 Block \n",
    "fgsm8_threat = ap.ThreatModel(ap.DeltaAddition, {'lp_style': 'inf', \n",
    "                                                 'lp_bound': 8.0/ 255.0})\n",
    "fgsm8_attack = aa.FGSM(model, normalizer, linf_8_threat, attack_loss)\n",
    "fgsm8_attack_kwargs = {'step_size': 0.05, \n",
    "                       'verbose': False}\n",
    "fgsm8_attack_params = advtrain.AdversarialAttackParameters(fgsm8_attack,\n",
    "                                                           attack_specific_params=\n",
    "                                                           {'attack_kwargs': fgsm8_attack_kwargs})\n",
    "\n",
    "\n",
    "# ------ PGD4 Block \n",
    "pgd4_attack = aa.PGD(model, normalizer, linf_4_threat, attack_loss)\n",
    "pgd4_attack_kwargs = {'step_size': 1.0 / 255.0, \n",
    "                      'num_iterations': 20, \n",
    "                      'keep_best': True,\n",
    "                      'verbose': False}\n",
    "pgd4_attack_params = advtrain.AdversarialAttackParameters(pgd4_attack, \n",
    "                                                          attack_specific_params=\n",
    "                                                          {'attack_kwargs': pgd4_attack_kwargs})\n",
    "\n",
    "# ------ PGD4 Block \n",
    "pgd8_attack = aa.PGD(model, normalizer, linf_8_threat, attack_loss)\n",
    "pgd8_attack_kwargs = {'step_size': 1.0 / 255.0, \n",
    "                      'num_iterations': 20, \n",
    "                      'keep_best': True,\n",
    "                      'verbose': False}\n",
    "pgd8_attack_params = advtrain.AdversarialAttackParameters(pgd4_attack, \n",
    "                                                          attack_specific_params=\n",
    "                                                          {'attack_kwargs': pgd8_attack_kwargs})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "'''\n",
    "Next we'll build the EvaluationResult objects that wrap these. \n",
    "And let's say we'll evaluate the:\n",
    "- top1 accuracy \n",
    "- average loss \n",
    "- average SSIM distance of successful perturbations [don't worry too much about this]\n",
    "\n",
    "The 'to_eval' dict as passed in the constructor has structure \n",
    " {key : <shorthand fxn>}\n",
    "where key is just a human-readable handle for what's being evaluated\n",
    "and shorthand_fxn is either a string for prebuilt evaluators, or you can pass in a general function to evaluate\n",
    "'''\n",
    "\n",
    "to_eval_dict = {'top1': 'top1', \n",
    "                'avg_loss_value': 'avg_loss_value', \n",
    "                'avg_successful_ssim': 'avg_successful_ssim'}\n",
    "\n",
    "fgsm8_eval = adveval.EvaluationResult(fgsm8_attack_params, \n",
    "                                      to_eval=to_eval_dict)\n",
    "\n",
    "\n",
    "pgd4_eval = adveval.EvaluationResult(pgd4_attack_params, \n",
    "                                     to_eval=to_eval_dict)\n",
    "\n",
    "pgd8_eval = adveval.EvaluationResult(pgd8_attack_params, \n",
    "                                     to_eval=to_eval_dict)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With our `EvaluationResult` objects built, all that remains is to collect all these into a dictionary and pass them to our `AdversarialEvaluation` object and interpret the result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Starting minibatch 0...\n",
      "\t (mb: 0) evaluating fgsm8...\n",
      "\t (mb: 0) evaluating pgd4...\n",
      "\t (mb: 0) evaluating pgd8...\n",
      "\t (mb: 0) evaluating ground...\n"
     ]
    }
   ],
   "source": [
    "attack_ensemble = {'fgsm8': fgsm8_eval, \n",
    "                   'pgd4' : pgd4_eval, \n",
    "                   'pgd8' : pgd8_eval\n",
    "                  }\n",
    "ensemble_out = adv_eval_object.evaluate_ensemble(cifar_valset, attack_ensemble, \n",
    "                                                 verbose=True, \n",
    "                                                 num_minibatches=1)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's take a look at the evaluation results. First notice that the key `'ground'` has been added to the ensemble output. This stores the top1 accuracy of unperturbed inputs (and thus the accuracy of the classifier).\n",
    "\n",
    "In general, the results of the evaluations will be stored in the `EvaluationResult.results` dictionary, with the keys being the same as the evaluation types desired. These generally will point to an `AverageMeter` object, which is a simple little object to keep track of average values. You can query its `.avg` value:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "dict_keys(['fgsm8', 'pgd4', 'pgd8', 'ground'])\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'top1': <utils.pytorch_utils.AverageMeter at 0x7f1ce5301048>,\n",
       " 'avg_loss_value': <utils.pytorch_utils.AverageMeter at 0x7f1ce5301128>,\n",
       " 'avg_successful_ssim': <utils.pytorch_utils.AverageMeter at 0x7f1ce5301160>}"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# First notice the keys of ensemble_out include ground:\n",
    "print(attack_ensemble.keys())\n",
    "\n",
    "attack_ensemble['pgd8'].results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Now let's build a little helper to print things out cleanly:\n",
    "\n",
    "sort_order = {'ground': 1, 'fgsm8': 2, 'pgd4': 3, 'pgd8': 4}\n",
    "def pretty_printer(eval_ensemble, result_type):\n",
    "    print('~' * 10, result_type, '~' * 10)\n",
    "    for key in sorted(list(eval_ensemble.keys()), key=lambda k: sort_order[k]):\n",
    "        eval_result = eval_ensemble[key]\n",
    "        pad = 6 - len(key)\n",
    "        if result_type not in eval_result.results:\n",
    "            continue \n",
    "        avg_result = eval_result.results[result_type].avg\n",
    "        print(key, pad* ' ', ': ', avg_result)\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "~~~~~~~~~~ top1 ~~~~~~~~~~\n",
      "ground  :  0.90625\n",
      "fgsm8   :  0.15625\n",
      "pgd4    :  0.0\n",
      "pgd8    :  0.0\n"
     ]
    }
   ],
   "source": [
    "'''And then we can print out and look at the results:\n",
    "This prints the accuracy. \n",
    "Ground is the unperturbed accuracy. \n",
    "If everything is done right, we should see that PGD with an l_inf bound of 4 is a stronger attack \n",
    "against undefended networks than FGSM with an l_inf bound of 8\n",
    "'''\n",
    "pretty_printer(ensemble_out, 'top1')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "~~~~~~~~~~ avg_loss_value ~~~~~~~~~~\n",
      "ground  :  0.3605864644050598\n",
      "fgsm8   :  9.063404083251953\n",
      "pgd4    :  42.08419418334961\n",
      "pgd8    :  42.08419418334961\n"
     ]
    }
   ],
   "source": [
    "# We can examine the loss (noting that we seek to 'maximize' loss in the adversarial example domain)\n",
    "pretty_printer(ensemble_out, 'avg_loss_value')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "~~~~~~~~~~ avg_successful_ssim ~~~~~~~~~~\n",
      "fgsm8   :  0.043325912643597204\n",
      "pgd4    :  0.007804851700970894\n",
      "pgd8    :  0.007804851700970894\n"
     ]
    }
   ],
   "source": [
    "# This is actually 1-SSIM, which can serve as a makeshift 'similarity index', \n",
    "# which essentially gives a meterstick for how similar the perturbed images are to the originals\n",
    "pretty_printer(ensemble_out, 'avg_successful_ssim')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# (Advanced): Custom Evaluation Techniques\n",
    "For most use cases, the predefined evalutions (accuracy, loss, etc) should be fine. Should one want to extend this, however, it's not too hard to do. We'll walk through an example where we evaluate the average l_inf distance of **successful** attacks. \n",
    "\n",
    "First we'll need to build a function that takes in an `EvaluationResult` object, a label and the tuple that is generated from the output of `AdversarialAttackParameters.attack(...)`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "def avg_successful_linf(self, eval_label, attack_out):\n",
    "    \n",
    "    # First set up the averageMeter to hold these results\n",
    "    if self.results[eval_label] is None:\n",
    "        self.results[eval_label] = utils.AverageMeter() \n",
    "    result = self.results[eval_label]\n",
    "    \n",
    "    # Collect the successful attacks only: \n",
    "    successful_pert, successful_orig = self._get_successful_attacks(attack_out)\n",
    "    \n",
    "    # Handle the degenerate case \n",
    "    if successful_pert is None or successful_pert.numel() == 0:\n",
    "        return \n",
    "    \n",
    "    # Compute the l_inf dist per example\n",
    "    batched_norms = utils.batchwise_norm(torch.abs(successful_pert - successful_orig), \n",
    "                                         'inf', dim=0)\n",
    "    # Update the result (and multiply by 255 for ease in exposition)\n",
    "    batch_avg = float(torch.sum(batched_norms)) / successful_pert.shape[0]\n",
    "    \n",
    "    result.update(batch_avg * 255, n=successful_pert.shape[0])\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Starting minibatch 0...\n",
      "\t (mb: 0) evaluating fgsm8...\n",
      "\t (mb: 0) evaluating pgd4...\n",
      "\t (mb: 0) evaluating pgd8...\n",
      "\t (mb: 0) evaluating ground...\n"
     ]
    }
   ],
   "source": [
    "# And now let's incorporate this into our to_eval_dict\n",
    "new_to_eval_dict = {'avg_successful_linf': avg_successful_linf}\n",
    "\n",
    "# And make some new EvaluationResult objects\n",
    "new_fgsm8_eval = adveval.EvaluationResult(fgsm8_attack_params, \n",
    "                                          to_eval=new_to_eval_dict)\n",
    "\n",
    "new_pgd4_eval = adveval.EvaluationResult(pgd4_attack_params, \n",
    "                                         to_eval=new_to_eval_dict)\n",
    "\n",
    "new_pgd8_eval = adveval.EvaluationResult(pgd8_attack_params, \n",
    "                                         to_eval=new_to_eval_dict)\n",
    "\n",
    "new_ensemble_in = {'fgsm8': new_fgsm8_eval, \n",
    "                   'pgd4': new_pgd4_eval, \n",
    "                   'pgd8': new_pgd8_eval}\n",
    "\n",
    "# And run through the evaluation \n",
    "new_ensemble_out = adv_eval_object.evaluate_ensemble(cifar_valset, new_ensemble_in,\n",
    "                                                     verbose=True,\n",
    "                                                     num_minibatches=1)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "~~~~~~~~~~ avg_successful_linf ~~~~~~~~~~\n",
      "fgsm8   :  8.00000585615635\n",
      "pgd4    :  4.000007935932705\n",
      "pgd8    :  4.000007935932705\n"
     ]
    }
   ],
   "source": [
    "# Finally we can take a look at the evaluation that we've monkeypatched in\n",
    "pretty_printer(new_ensemble_out, 'avg_successful_linf')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This concludes the tutorials for `mister_ed`. If there's anything that's confusing, or any features that you want supported that aren't ready out of the box, please feel free to open an issue on the main github repo and I'll do my best to catering to user requests. \n",
    "\n",
    "(also let me know about any bugs!)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
