{"title": "A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks", "book": "Advances in Neural Information Processing Systems", "page_first": 7167, "page_last": 7177, "abstract": "Detecting test samples drawn sufficiently far away from the training distribution statistically or adversarially is a fundamental requirement for deploying a good classifier in many real-world machine learning applications. However, deep neural networks with the softmax classifier are known to produce highly overconfident posterior distributions even for such abnormal samples. In this paper, we propose a simple yet effective method for detecting any abnormal samples, which is applicable to any pre-trained softmax neural classifier. We obtain the class conditional Gaussian distributions with respect to (low- and upper-level) features of the deep models under Gaussian discriminant analysis, which result in a confidence score based on the Mahalanobis distance. While most prior methods have been evaluated for detecting either out-of-distribution or adversarial samples, but not both, the proposed method achieves the state-of-the-art performances for both cases in our experiments. Moreover, we found that our proposed method is more robust in harsh cases, e.g., when the training dataset has noisy labels or small number of samples. 
Finally, we show that the proposed method enjoys broader usage by applying it to class-incremental learning: whenever out-of-distribution samples are detected, our classification rule can incorporate new classes well without further training deep models.", "full_text": "A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks\n\nKimin Lee1, Kibok Lee2, Honglak Lee3,2, Jinwoo Shin1,4\n1Korea Advanced Institute of Science and Technology (KAIST)\n2University of Michigan\n3Google Brain\n4AItrics\n\nAbstract\n\nDetecting test samples drawn sufficiently far away from the training distribution statistically or adversarially is a fundamental requirement for deploying a good classifier in many real-world machine learning applications. However, deep neural networks with the softmax classifier are known to produce highly overconfident posterior distributions even for such abnormal samples. In this paper, we propose a simple yet effective method for detecting any abnormal samples, which is applicable to any pre-trained softmax neural classifier. We obtain the class-conditional Gaussian distributions with respect to (low- and upper-level) features of the deep models under Gaussian discriminant analysis, which result in a confidence score based on the Mahalanobis distance. While most prior methods have been evaluated for detecting either out-of-distribution or adversarial samples, but not both, the proposed method achieves state-of-the-art performance in both cases in our experiments. Moreover, we found that our proposed method is more robust in harsh cases, e.g., when the training dataset has noisy labels or a small number of samples. 
Finally, we show that the proposed method enjoys broader usage by applying it to class-incremental learning: whenever out-of-distribution samples are detected, our classification rule can incorporate new classes well without further training deep models.\n\n1 Introduction\n\nDeep neural networks (DNNs) have achieved high accuracy on many classification tasks, e.g., speech recognition [1], object detection [9] and image classification [12]. However, measuring the predictive uncertainty still remains a challenging problem [20, 21]. Obtaining well-calibrated predictive uncertainty is indispensable since it could be useful in many machine learning applications (e.g., active learning [8] and novelty detection [18]) as well as when deploying DNNs in real-world systems [2], e.g., self-driving cars and secure authentication systems [6, 30].\n\nThe predictive uncertainty of DNNs is closely related to the problem of detecting abnormal samples that are drawn far away from the in-distribution (i.e., the distribution of training samples) statistically or adversarially. For detecting out-of-distribution (OOD) samples, recent works have utilized the confidence from the posterior distribution [13, 21]. For example, Hendrycks & Gimpel [13] proposed the maximum value of the posterior distribution from the classifier as a baseline method, and it was improved by processing the input and output of DNNs [21]. For detecting adversarial samples, confidence scores were proposed based on density estimators to characterize them in the feature spaces of DNNs [7]. More recently, Ma et al. [22] proposed the local intrinsic dimensionality (LID) and empirically showed that the characteristics of test samples can be estimated effectively using the LID. However, most prior works along this line typically do not evaluate both OOD and adversarial samples.\n\n32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada. 
To the best of our knowledge, no universal detector is known to work well on both tasks.\n\nContribution. In this paper, we propose a simple yet effective method, which is applicable to any pre-trained softmax neural classifier (without re-training), for detecting abnormal test samples including OOD and adversarial ones. Our high-level idea is to measure the probability density of a test sample in the feature spaces of DNNs utilizing the concept of a “generative” (distance-based) classifier. Specifically, we assume that pre-trained features can be fitted well by a class-conditional Gaussian distribution since its posterior distribution can be shown to be equivalent to the softmax classifier under Gaussian discriminant analysis (see Section 2.1 for our justification). Under this assumption, we define the confidence score using the Mahalanobis distance with respect to the closest class-conditional distribution, where its parameters are chosen as the empirical class means and tied empirical covariance of training samples. Contrary to conventional beliefs, we found that using the corresponding generative classifier does not sacrifice the softmax classification accuracy. Perhaps surprisingly, its confidence score outperforms softmax-based ones very strongly across multiple other tasks: detecting OOD samples, detecting adversarial samples and class-incremental learning.\n\nWe demonstrate the effectiveness of the proposed method using deep convolutional neural networks, such as DenseNet [14] and ResNet [12], trained for image classification tasks on various datasets including CIFAR [15], SVHN [28], ImageNet [5] and LSUN [32]. First, for the problem of detecting OOD samples, the proposed method outperforms the current state-of-the-art method, ODIN [21], in all tested cases. 
In particular, compared to ODIN, our method improves the true negative rate (TNR), i.e., the fraction of detected OOD (e.g., LSUN) samples, from 45.6% to 90.9% on ResNet when 95% of in-distribution (e.g., CIFAR-100) samples are correctly detected. Next, for the problem of detecting adversarial samples, e.g., generated by four attack methods such as FGSM [10], BIM [16], DeepFool [26] and CW [3], our method outperforms the state-of-the-art detection measure, LID [22]. In particular, compared to LID, ours improves the TNR of CW from 82.9% to 95.8% on ResNet when 95% of normal CIFAR-10 samples are correctly detected.\n\nWe also found that our proposed method is more robust in the choice of its hyperparameters as well as against extreme scenarios, e.g., when the training dataset has some noisy, random labels or a small number of data samples. In particular, Liang et al. [21] tune the hyperparameters of ODIN using validation sets of OOD samples, which is often impossible since knowledge about OOD samples is not accessible a priori. We show that the hyperparameters of the proposed method can be tuned using only in-distribution (training) samples, while maintaining its performance. We further show that the proposed method tuned on a simple attack, i.e., FGSM, can be used to detect other more complex attacks such as BIM, DeepFool and CW.\n\nFinally, we apply our method to class-incremental learning [29]: new classes are added progressively to a pre-trained classifier. Since the new class samples are drawn from an out-of-training distribution, it is natural to expect that one can classify them using our proposed metric without re-training the deep models. Motivated by this, we present a simple method which accommodates a new class at any time by simply computing the class mean of the new class and updating the tied covariance of all classes. 
We show that the proposed method outperforms other baseline methods, such as the Euclidean distance-based classifier and a re-trained softmax classifier. This evidences that our approach has the potential to be applied to many other related machine learning tasks, such as active learning [8], ensemble learning [19] and few-shot learning [31].\n\n2 Mahalanobis distance-based score from generative classifier\n\nGiven deep neural networks (DNNs) with the softmax classifier, we propose a simple yet effective method for detecting abnormal samples such as out-of-distribution (OOD) and adversarial ones. We first present the proposed confidence score based on an induced generative classifier under Gaussian discriminant analysis (GDA), and then introduce additional techniques to improve its performance. We also discuss how the confidence score is applicable to incremental learning.\n\n2.1 Why Mahalanobis distance-based score?\n\nDerivation of generative classifiers from softmax ones. Let x ∈ X be an input and y ∈ Y = {1, ..., C} be its label. Suppose that a pre-trained softmax neural classifier is given:\n\n[Figure 1: (a) Visualization by t-SNE, (b) Classification accuracy, (c) ROC curve]\n\nFigure 1: Experimental results under the ResNet with 34 layers. (a) Visualization of final features from ResNet trained on CIFAR-10 by t-SNE, where the colors of points indicate the classes of the corresponding objects. 
(b) Classification test set accuracy of ResNet on the CIFAR-10, CIFAR-100 and SVHN datasets. (c) Receiver operating characteristic (ROC) curves: the x-axis and y-axis represent the false positive rate (FPR) and true positive rate (TPR), respectively.\n\nP(y = c | x) = exp(w_cᵀ f(x) + b_c) / ∑_c′ exp(w_c′ᵀ f(x) + b_c′), where w_c and b_c are the weight and the bias of the softmax classifier for class c, and f(·) denotes the output of the penultimate layer of DNNs. Then, without any modification of the pre-trained softmax neural classifier, we obtain a generative classifier assuming that the class-conditional distribution follows the multivariate Gaussian distribution. Specifically, we define C class-conditional Gaussian distributions with a tied covariance Σ: P(f(x) | y = c) = N(f(x) | µ_c, Σ), where µ_c is the mean of the multivariate Gaussian distribution of class c ∈ {1, ..., C}. Here, our approach is based on a simple theoretical connection between GDA and the softmax classifier: the posterior distribution defined by the generative classifier under GDA with the tied covariance assumption is equivalent to the softmax classifier (see the supplementary material for more details). Therefore, the pre-trained features of the softmax neural classifier f(x) might also follow the class-conditional Gaussian distribution.\n\nTo estimate the parameters of the generative classifier from the pre-trained softmax neural classifier, we compute the empirical class means and covariance of the training samples {(x_1, y_1), ..., (x_N, y_N)}:\n\nµ_c = (1/N_c) ∑_{i: y_i = c} f(x_i),   Σ = (1/N) ∑_c ∑_{i: y_i = c} (f(x_i) − µ_c)(f(x_i) − µ_c)ᵀ,   (1)\n\nwhere N_c is the number of training samples with label c. 
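The estimation in (1) is a few lines of linear algebra. The snippet below is a minimal numpy sketch, not the authors' implementation; `feats`, `labels` and `fit_gaussians` are our own illustrative stand-ins for the penultimate-layer features f(x_i), the labels y_i, and the fitting step:

```python
import numpy as np

# Sketch of Eq. (1): per-class empirical means and a single tied covariance.
# `feats` and `labels` are hypothetical toy data standing in for features
# extracted from a pre-trained network.
rng = np.random.default_rng(0)
C, D = 3, 4                                # number of classes, feature dim
feats = rng.normal(size=(300, D))
labels = rng.integers(0, C, size=300)

def fit_gaussians(feats, labels, C):
    """Return per-class means (C, D) and the tied covariance (D, D)."""
    N, _ = feats.shape
    means = np.stack([feats[labels == c].mean(axis=0) for c in range(C)])
    centered = feats - means[labels]       # subtract each sample's class mean
    sigma = centered.T @ centered / N      # tied (shared) covariance, MLE
    return means, sigma

means, sigma = fit_gaussians(feats, labels, C)
```

Note that a single covariance matrix is shared across all classes (the "tied" assumption), which is exactly what makes the induced posterior equivalent to a softmax classifier.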
This is equivalent to fitting the class-conditional Gaussian distributions with a tied covariance to the training samples under the maximum likelihood estimator.\n\nMahalanobis distance-based confidence score. Using the above induced class-conditional Gaussian distributions, we define the confidence score M(x) using the Mahalanobis distance between a test sample x and the closest class-conditional Gaussian distribution, i.e.,\n\nM(x) = max_c −(f(x) − µ_c)ᵀ Σ⁻¹ (f(x) − µ_c).   (2)\n\nNote that this metric corresponds to measuring the log of the probability density of the test sample. Here, we remark that abnormal samples can be characterized better in the representation space of DNNs than in the “label-overfitted” output space of the softmax-based posterior distribution used in prior works [13, 21] for detecting them. This is because a confidence measure obtained from the posterior distribution can show high confidence even for abnormal samples that lie far away from the softmax decision boundary. Feinman et al. [7] and Ma et al. [22] process the DNN features for detecting adversarial samples in a sense, but do not utilize the Mahalanobis distance-based metric, i.e., they only utilize the Euclidean distance in their scores. In this paper, we show that the Mahalanobis distance is significantly more effective than the Euclidean distance in various tasks.\n\nExperimental supports for generative classifiers. To evaluate our hypothesis that the trained features of DNNs support the assumption of GDA, we measure the classification accuracy as follows:\n\nŷ(x) = arg min_c (f(x) − µ_c)ᵀ Σ⁻¹ (f(x) − µ_c).   (3)\n\nAlgorithm 1 Computing the Mahalanobis distance-based confidence score.\nInput: test sample x, weights of the logistic regression detector α_ℓ, noise ε, and parameters of the Gaussian distributions {µ_{ℓ,c}, Σ_ℓ : ∀ℓ, c}\nInitialize score vector: M(x) = [M_ℓ : ∀ℓ]\nfor each layer ℓ ∈ {1, ..., L} do\n  Find the closest class: ĉ = arg min_c (f_ℓ(x) − µ_{ℓ,c})ᵀ Σ_ℓ⁻¹ (f_ℓ(x) − µ_{ℓ,c})\n  Add small noise to the test sample: x̂ = x − ε·sign(∇_x (f_ℓ(x) − µ_{ℓ,ĉ})ᵀ Σ_ℓ⁻¹ (f_ℓ(x) − µ_{ℓ,ĉ}))\n  Compute the confidence score: M_ℓ = max_c −(f_ℓ(x̂) − µ_{ℓ,c})ᵀ Σ_ℓ⁻¹ (f_ℓ(x̂) − µ_{ℓ,c})\nend for\nreturn Confidence score for the test sample: ∑_ℓ α_ℓ M_ℓ\n\n[Figure 2: AUROC (%) across basic blocks of DenseNet for (a) TinyImageNet, (b) LSUN, (c) SVHN, (d) DeepFool]\n\nFigure 2: AUROC (%) of the threshold-based detector using the confidence score in (2) computed at different basic blocks of DenseNet trained on the CIFAR-10 dataset. We measure the detection performance using (a) TinyImageNet, (b) LSUN, (c) SVHN and (d) adversarial (DeepFool) samples.\n\nWe remark that this corresponds to predicting a class label using the posterior distribution from the generative classifier with a uniform class prior. Interestingly, we found that the softmax accuracy (red bar) is also achieved by the Mahalanobis distance-based classifier (blue bar), while conventional wisdom is that a generative classifier trained from scratch typically performs much worse than a discriminative classifier such as softmax. For visual interpretation, Figure 1(a) presents embeddings of final features from CIFAR-10 test samples constructed by t-SNE [23], where the colors of points indicate the classes of the corresponding objects. 
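The score in (2) and the classification rule in (3) share the same squared Mahalanobis distances and differ only in the final reduction (negated max versus argmin). A minimal numpy sketch under toy means and covariance (our own illustrative values, not fitted features):

```python
import numpy as np

# Sketch of Eqs. (2)-(3): Mahalanobis confidence score and the induced
# generative classifier. `means`/`sigma` would come from Eq. (1); hypothetical
# two-class toy values are used here.
means = np.array([[0.0, 0.0], [4.0, 4.0]])
sigma = np.array([[1.0, 0.2], [0.2, 1.0]])
prec = np.linalg.inv(sigma)                # shared precision matrix

def sq_distances(f_x, means, prec):
    """Squared Mahalanobis distance of f(x) to each class mean."""
    diffs = means - f_x                    # (C, D)
    return np.einsum('cd,de,ce->c', diffs, prec, diffs)

def mahalanobis_score(f_x, means, prec):
    """Eq. (2): M(x) = max_c -(f(x)-mu_c)^T Sigma^{-1} (f(x)-mu_c)."""
    return -sq_distances(f_x, means, prec).min()

def predict(f_x, means, prec):
    """Eq. (3): label of the closest class under Mahalanobis distance."""
    return int(sq_distances(f_x, means, prec).argmin())

score_in = mahalanobis_score(np.array([0.1, -0.2]), means, prec)   # near a mean
score_out = mahalanobis_score(np.array([10.0, -8.0]), means, prec) # far away
```

Samples close to some class mean receive a high (less negative) score, while samples far from every mean receive a low score, which is what the threshold-based detector exploits.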
One can observe that all ten classes are clearly separated in the embedding space, which supports our intuition. In addition, we also show that the Mahalanobis distance-based metric can be very useful in detecting out-of-distribution samples. For evaluation, we obtain the receiver operating characteristic (ROC) curve using a simple threshold-based detector: we compute the confidence score M(x) on a test sample x and declare it positive (i.e., in-distribution) if M(x) is above some threshold. The Euclidean distance, which only utilizes the empirical class means, is considered for comparison. We train ResNet on CIFAR-10, and the TinyImageNet dataset [5] is used as out-of-distribution. As shown in Figure 1(c), the Mahalanobis distance-based metric (blue curve) performs better than the Euclidean one (green curve) and the maximum value of the softmax distribution (red curve).\n\n2.2 Calibration techniques\n\nInput pre-processing. To make in- and out-of-distribution samples more separable, we consider adding a small controlled noise to a test sample. Specifically, for each test sample x, we calculate the pre-processed sample x̂ by adding a small perturbation as follows:\n\nx̂ = x + ε·sign(∇_x M(x)) = x − ε·sign(∇_x (f(x) − µ_ĉ)ᵀ Σ⁻¹ (f(x) − µ_ĉ)),   (4)\n\nwhere ε is the magnitude of the noise and ĉ is the index of the closest class. Next, we measure the confidence score using the pre-processed sample. We remark that the noise is generated to increase the proposed confidence score (2), unlike adversarial attacks [10]. In our experiments, such a perturbation can have a stronger effect on separating the in- and out-of-distribution samples. We remark that similar input pre-processing was studied in [21], where the perturbations are added to increase the softmax score of the predicted label. 
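The perturbation in (4) can be sketched as follows. To keep the example self-contained we take f(x) = x (identity features), so the gradient of the squared distance is the closed form 2 Σ⁻¹ (x − µ_ĉ); with a real network this gradient would come from backpropagation. All names and values here are our own illustrative assumptions:

```python
import numpy as np

# Sketch of Eq. (4): perturb the input in the direction that *increases* the
# Mahalanobis score, i.e. against the gradient of the squared distance.
# With f(x) = x, grad of (x-mu)^T Sigma^{-1} (x-mu) is 2 Sigma^{-1} (x-mu).
mu = np.array([1.0, -1.0])                 # closest class mean (assumed found)
prec = np.linalg.inv(np.array([[2.0, 0.5], [0.5, 1.0]]))
eps = 0.01                                 # noise magnitude

def preprocess(x, mu, prec, eps):
    grad_distance = 2.0 * prec @ (x - mu)  # gradient of the squared distance
    # x_hat = x + eps * sign(grad M(x)) = x - eps * sign(grad distance)
    return x - eps * np.sign(grad_distance)

x = np.array([3.0, 2.0])
d_before = (x - mu) @ prec @ (x - mu)
x_hat = preprocess(x, mu, prec, eps)
d_after = (x_hat - mu) @ prec @ (x_hat - mu)
```

For a small ε the step provably decreases the distance (the directional derivative along −sign(∇) is negative), i.e. it increases M(x); the point of the technique is that this increase is larger for in-distribution samples than for OOD ones.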
However, our method is different in that the noise is generated to increase the proposed metric.\n\nAlgorithm 2 Updating the Mahalanobis distance-based classifier for class-incremental learning.\nInput: set of samples from a new class {x_i : ∀i = 1, ..., N_{C+1}}, mean and covariance of observed classes {µ_c : ∀c = 1, ..., C}, Σ\nCompute the new class mean: µ_{C+1} ← (1/N_{C+1}) ∑_i f(x_i)\nCompute the covariance of the new class: Σ_{C+1} ← (1/N_{C+1}) ∑_i (f(x_i) − µ_{C+1})(f(x_i) − µ_{C+1})ᵀ\nUpdate the shared covariance: Σ ← (C/(C+1)) Σ + (1/(C+1)) Σ_{C+1}\nreturn Mean and covariance of all classes {µ_c : ∀c = 1, ..., C+1}, Σ\n\nFeature ensemble. To further improve the performance, we consider measuring and combining the confidence scores from not only the final features but also the other low-level features in DNNs. Formally, given training data, we extract the ℓ-th hidden features of DNNs, denoted by f_ℓ(x), and compute their empirical class means and tied covariances, i.e., µ_{ℓ,c} and Σ_ℓ. Then, for each test sample x, we measure the confidence score from the ℓ-th layer using the formula in (2). One can expect that this simple but natural scheme can bring an extra gain in obtaining a better calibrated score by extracting more input-specific information from the low-level features. We measure the area under the ROC (AUROC) curve of the threshold-based detector using the confidence score in (2) computed at different basic blocks of DenseNet [14] trained on the CIFAR-10 dataset, where the overall trends on ResNet are similar. Figure 2 shows the performance on various OOD samples such as SVHN [28], LSUN [32], TinyImageNet and adversarial samples generated by DeepFool [26], where the dimensions of the intermediate features are reduced using average pooling (see Section 3 for more details). 
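The feature ensemble reduces to computing the per-layer score from (2) and taking a weighted sum ∑_ℓ α_ℓ M_ℓ(x). A minimal numpy sketch with toy per-layer parameters; the weights `alpha` are fixed illustrative values here, whereas in the paper they are fitted by a logistic regression detector on validation data:

```python
import numpy as np

# Sketch of the feature ensemble in Algorithm 1: combine per-layer
# Mahalanobis scores by weighted averaging, sum_l alpha_l * M_l(x).
def layer_score(f_lx, means, prec):
    diffs = means - f_lx                   # (C, D)
    return -np.einsum('cd,de,ce->c', diffs, prec, diffs).min()

rng = np.random.default_rng(1)
L, C, D = 3, 2, 4                          # layers, classes, feature dim
layer_means = rng.normal(size=(L, C, D))   # toy per-layer class means
layer_prec = np.stack([np.eye(D)] * L)     # toy per-layer precision matrices
alpha = np.array([0.2, 0.3, 0.5])          # assumed detector weights

def ensemble_score(layer_feats, layer_means, layer_prec, alpha):
    scores = np.array([layer_score(f, m, p)
                       for f, m, p in zip(layer_feats, layer_means, layer_prec)])
    return float(alpha @ scores)

feats = rng.normal(size=(L, D))            # toy features f_l(x) per layer
total = ensemble_score(feats, layer_means, layer_prec, alpha)
```

A one-hot weight vector recovers a single layer's score, which illustrates the robustness remark in the text: layers whose scores are uninformative simply receive near-zero weight.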
As shown in Figure 2, the confidence scores computed at low-level features often provide better calibrated ones compared to the final features (e.g., for LSUN, TinyImageNet and DeepFool). To further improve the performance, we design a feature ensemble method as described in Algorithm 1. We first extract the confidence scores from all layers, and then integrate them by weighted averaging: ∑_ℓ α_ℓ M_ℓ(x), where M_ℓ(·) and α_ℓ are the confidence score at the ℓ-th layer and its weight, respectively. In our experiments, following similar strategies in [22], we choose the weight of each layer α_ℓ by training a logistic regression detector using validation samples. We remark that such weighted averaging of confidence scores can prevent degradation of the overall performance even when the confidence scores from some layers are not effective: the trained weights (using validation) would be nearly zero for those ineffective layers.\n\n2.3 Class-incremental learning using the Mahalanobis distance-based score\n\nAs a natural extension, we also show that the Mahalanobis distance-based confidence score can be utilized in class-incremental learning tasks [29]: a classifier pre-trained on base classes is progressively updated whenever a new class with corresponding samples occurs. This task is known to be challenging since one has to deal with catastrophic forgetting [24] under a limited memory. To this end, recent works have moved toward developing new training methods which involve a generative model or data sampling, but adopting such training methods might incur expensive back-and-forth costs. Based on the proposed confidence score, we develop a simple classification method without the usage of complicated training methods. 
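The incremental update of Algorithm 2 is just a new class mean plus a running average of covariances. A minimal numpy sketch under our own toy data; `add_class` and `new_feats` are illustrative names, not the authors' code:

```python
import numpy as np

# Sketch of Algorithm 2: accommodate a new class by computing its mean and
# covariance and folding the latter into the shared covariance with weight
# 1/(C+1). `new_feats` stands in for features f(x_i) of the new-class samples.
def add_class(means, sigma, new_feats):
    """means: (C, D) class means; sigma: (D, D) shared; new_feats: (N, D)."""
    C = means.shape[0]
    mu_new = new_feats.mean(axis=0)
    centered = new_feats - mu_new
    sigma_new = centered.T @ centered / new_feats.shape[0]
    sigma = C / (C + 1) * sigma + sigma_new / (C + 1)   # running average
    means = np.vstack([means, mu_new])
    return means, sigma

rng = np.random.default_rng(2)
means = np.zeros((2, 3))                   # two base classes (toy)
sigma = np.eye(3)
new_feats = rng.normal(loc=5.0, size=(50, 3))
means2, sigma2 = add_class(means, sigma, new_feats)
```

No gradient step touches the network: only the Gaussian parameters change, which is why new classes can be added at any time without re-training the deep model.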
To do this, we first assume that the classifier is well pre-trained with a certain amount of base classes, where the assumption is quite reasonable in many practical scenarios.1 In this case, one can expect that the classifier not only detects OOD samples well, but also may be good at discriminating new classes, as the representation learned with the base classes can characterize new ones. Motivated by this, we present a Mahalanobis distance-based classifier based on (3), which tries to accommodate a new class by simply computing and updating the class mean and covariance, as described in Algorithm 2. The class-incremental adaptation of our confidence score shows its potential to be applied to a wide range of new applications in the future.\n\n1For example, state-of-the-art CNNs trained on large-scale image datasets are off-the-shelf [12, 14], so they are a starting point in many computer vision tasks [9, 18, 25].\n\nMethod | Feature ensemble | Input pre-processing | TNR at TPR 95% | AUROC | Detection accuracy | AUPR in | AUPR out\nBaseline [13] | - | - | 32.47 | 89.88 | 85.06 | 85.40 | 93.96\nODIN [21] | - | - | 86.55 | 96.65 | 91.08 | 92.54 | 98.52\nMahalanobis (ours) | - | - | 54.51 | 93.92 | 89.13 | 91.56 | 95.95\nMahalanobis (ours) | - | X | 92.26 | 98.30 | 93.72 | 96.01 | 99.28\nMahalanobis (ours) | X | - | 91.45 | 98.37 | 93.55 | 96.43 | 99.35\nMahalanobis (ours) | X | X | 96.42 | 99.14 | 95.75 | 98.26 | 99.60\n\nTable 1: Contribution of each proposed method on distinguishing in- and out-of-distribution test set data. We measure the detection performance using ResNet trained on CIFAR-10, when the SVHN dataset is used as OOD. All values are percentages and the best results are indicated in bold.\n\n3 Experimental results\n\nIn this section, we demonstrate the effectiveness of the proposed method using deep convolutional neural networks such as DenseNet [14] and ResNet [12] on various vision datasets: CIFAR [15], SVHN [28], ImageNet [5] and LSUN [32]. 
Due to space limitations, we provide more detailed experimental setups and results in the supplementary material. Our code is available at https://github.com/pokaxpoka/deep_Mahalanobis_detector.\n\n3.1 Detecting out-of-distribution samples\n\nSetup. For the problem of detecting out-of-distribution (OOD) samples, we train DenseNet with 100 layers and ResNet with 34 layers for classifying the CIFAR-10, CIFAR-100 and SVHN datasets. The dataset used in training is the in-distribution (positive) dataset and the others are considered as OOD (negative). We only use test datasets for evaluation. In addition, the TinyImageNet (i.e., a subset of the ImageNet dataset) and LSUN datasets are also tested as OOD. For evaluation, we use a threshold-based detector which measures some confidence score of the test sample, and then classifies the test sample as in-distribution if the confidence score is above some threshold. We measure the following metrics: the true negative rate (TNR) at 95% true positive rate (TPR), the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPR), and the detection accuracy. For comparison, we consider the baseline method [13], which defines a confidence score as the maximum value of the posterior distribution, and the state-of-the-art ODIN [21], which defines the confidence score as the maximum value of the processed posterior distribution.\n\nFor our method, we extract the confidence scores from the end of every dense (or residual) block of DenseNet (or ResNet). The size of the feature maps on each convolutional layer is reduced by average pooling for computational efficiency: F × H × W → F × 1 × 1, where F is the number of channels and H × W is the spatial dimension. As shown in Algorithm 1, the output of the logistic regression detector is used as the final confidence score in our case. 
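The two headline metrics are simple to compute from raw confidence scores. Below is a minimal numpy sketch (our own toy Gaussian scores, not experimental data): TNR at 95% TPR thresholds the positives at their 5th percentile, and AUROC is computed in its rank (Mann-Whitney) form:

```python
import numpy as np

# Sketch of the evaluation metrics for a threshold-based detector
# (in-distribution = positive class, higher score = more in-distribution).
rng = np.random.default_rng(3)
pos = rng.normal(2.0, 1.0, size=1000)      # toy in-distribution scores
neg = rng.normal(-2.0, 1.0, size=1000)     # toy OOD scores

def tnr_at_tpr(pos, neg, tpr=0.95):
    # choose the threshold so that `tpr` of positives lie above it
    thr = np.quantile(pos, 1.0 - tpr)
    return float((neg < thr).mean())       # fraction of OOD rejected

def auroc(pos, neg):
    # P(random positive outranks random negative), rank form (no tie handling)
    scores = np.concatenate([pos, neg])
    ranks = scores.argsort().argsort() + 1  # 1-based ranks
    n_p, n_n = len(pos), len(neg)
    r_pos = ranks[:n_p].sum()
    return float((r_pos - n_p * (n_p + 1) / 2) / (n_p * n_n))

tnr95 = tnr_at_tpr(pos, neg)
auc = auroc(pos, neg)
```

With well-separated score distributions both values approach 1; swapping the positive and negative sets mirrors the AUROC around 0.5.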
All hyperparameters are tuned on a separate validation set, which consists of 1,000 images from each in- and out-of-distribution pair. Similar to Ma et al. [22], the weights of the logistic regression detector are trained using nested cross-validation within the validation set, where the class label is assigned positive for in-distribution samples and negative for OOD samples. Since one might not have OOD validation datasets in practice, we also consider tuning the hyperparameters using in-distribution (positive) samples and corresponding adversarial (negative) samples generated by FGSM [10].\n\nContribution by each technique and comparison with ODIN. Table 1 validates the contributions of our suggested techniques under comparison with the baseline method and ODIN. We measure the detection performance using ResNet trained on CIFAR-10, when the SVHN dataset is used as OOD. We incrementally apply our techniques to see the stepwise improvement by each component. One can note that our method significantly outperforms the baseline method even without feature ensembles and input pre-processing. This implies that our method can characterize the OOD samples very effectively compared to the posterior distribution. By utilizing the feature ensemble and input pre-processing, the detection performance is further improved compared to that of ODIN. 
The left-hand column of Table 2 reports the detection performance for all in- and out-of-distribution dataset pairs.\n\nIn-dist (model) | OOD | TNR at TPR 95% | AUROC | Detection acc. | TNR at TPR 95% | AUROC | Detection acc.\n(Columns 3-5: validation on OOD samples; columns 6-8: validation on adversarial samples. Each entry: Baseline [13] / ODIN [21] / Mahalanobis (ours).)\nCIFAR-10 (DenseNet) | SVHN | 40.2 / 86.2 / 90.8 | 89.9 / 95.5 / 98.1 | 83.2 / 91.4 / 93.9 | 40.2 / 70.5 / 89.6 | 89.9 / 92.8 / 97.6 | 83.2 / 86.5 / 92.6\nCIFAR-10 (DenseNet) | TinyImageNet | 58.9 / 92.4 / 95.0 | 94.1 / 98.5 / 98.8 | 88.5 / 93.9 / 95.0 | 58.9 / 87.1 / 94.9 | 94.1 / 97.2 / 98.8 | 88.5 / 92.1 / 95.0\nCIFAR-10 (DenseNet) | LSUN | 66.6 / 96.2 / 97.2 | 95.4 / 99.2 / 99.3 | 90.3 / 95.7 / 96.3 | 66.6 / 92.9 / 97.2 | 95.4 / 98.5 / 99.2 | 90.3 / 94.3 / 96.2\nCIFAR-100 (DenseNet) | SVHN | 26.7 / 70.6 / 82.5 | 82.7 / 93.8 / 97.2 | 75.6 / 86.6 / 91.5 | 26.7 / 39.8 / 62.2 | 82.7 / 88.2 / 91.8 | 75.6 / 80.7 / 84.6\nCIFAR-100 (DenseNet) | TinyImageNet | 17.6 / 42.6 / 86.6 | 71.7 / 85.2 / 97.4 | 65.7 / 77.0 / 92.2 | 17.6 / 43.2 / 87.2 | 71.7 / 85.3 / 97.0 | 65.7 / 77.2 / 91.8\nCIFAR-100 (DenseNet) | LSUN | 16.7 / 41.2 / 91.4 | 70.8 / 85.5 / 98.0 | 64.9 / 77.1 / 93.9 | 16.7 / 42.1 / 91.4 | 70.8 / 85.7 / 97.9 | 64.9 / 77.3 / 93.8\nSVHN (DenseNet) | CIFAR-10 | 69.3 / 71.7 / 96.8 | 91.9 / 91.4 / 98.9 | 86.6 / 85.8 / 95.9 | 69.3 / 69.3 / 97.5 | 91.9 / 91.9 / 98.8 | 86.6 / 86.6 / 96.3\nSVHN (DenseNet) | TinyImageNet | 79.8 / 84.1 / 99.9 | 94.8 / 95.1 / 99.9 | 90.2 / 90.4 / 98.9 | 79.8 / 79.8 / 99.9 | 94.8 / 94.8 / 99.8 | 90.2 / 90.2 / 98.9\nSVHN (DenseNet) | LSUN | 77.1 / 81.1 / 100 | 94.1 / 94.5 / 99.9 | 89.1 / 89.2 / 99.3 | 77.1 / 77.1 / 100 | 94.1 / 94.1 / 99.9 | 89.1 / 89.1 / 99.2\nCIFAR-10 (ResNet) | SVHN | 32.5 / 86.6 / 96.4 | 89.9 / 96.7 / 99.1 | 85.1 / 91.1 / 95.8 | 32.5 / 40.3 / 75.8 | 89.9 / 86.5 / 95.5 | 85.1 / 77.8 / 89.1\nCIFAR-10 (ResNet) | TinyImageNet | 44.7 / 72.5 / 97.1 | 91.0 / 94.0 / 99.5 | 85.1 / 86.5 / 96.3 | 44.7 / 69.6 / 95.5 | 91.0 / 93.9 / 99.0 | 85.1 / 86.0 / 95.4\nCIFAR-10 (ResNet) | LSUN | 45.4 / 73.8 / 98.9 | 91.0 / 94.1 / 99.7 | 85.3 / 86.7 / 97.7 | 45.4 / 70.0 / 98.1 | 91.0 / 93.7 / 99.5 | 85.3 / 85.8 / 97.2\nCIFAR-100 (ResNet) | SVHN | 20.3 / 62.7 / 91.9 | 79.5 / 93.9 / 98.4 | 73.2 / 88.0 / 93.7 | 20.3 / 12.2 / 41.9 | 79.5 / 72.0 / 84.4 | 73.2 / 67.7 / 76.5\nCIFAR-100 (ResNet) | TinyImageNet | 20.4 / 49.2 / 90.9 | 77.2 / 87.6 / 98.2 | 70.8 / 80.1 / 93.3 | 20.4 / 33.5 / 70.3 | 77.2 / 83.6 / 87.9 | 70.8 / 75.9 / 84.6\nCIFAR-100 (ResNet) | LSUN | 18.8 / 45.6 / 90.9 | 75.8 / 85.6 / 98.2 | 69.9 / 78.3 / 93.5 | 18.8 / 31.6 / 56.6 | 75.8 / 81.9 / 82.3 | 69.9 / 74.6 / 79.7\nSVHN (ResNet) | CIFAR-10 | 78.3 / 79.8 / 98.4 | 92.9 / 92.1 / 99.3 | 90.0 / 89.4 / 96.9 | 78.3 / 79.8 / 94.1 | 92.9 / 92.1 / 97.6 | 90.0 / 89.4 / 94.6\nSVHN (ResNet) | TinyImageNet | 79.0 / 82.1 / 99.9 | 93.5 / 92.0 / 99.9 | 90.4 / 89.4 / 99.1 | 79.0 / 80.5 / 99.2 | 93.5 / 92.9 / 99.3 | 90.4 / 90.1 / 98.8\nSVHN (ResNet) | LSUN | 74.3 / 77.3 / 99.9 | 91.6 / 89.4 / 99.9 | 89.0 / 87.2 / 99.5 | 74.3 / 76.3 / 99.9 | 91.6 / 90.7 / 99.9 | 89.0 / 88.2 / 99.5\n\nTable 2: Distinguishing in- and out-of-distribution test set data for image classification under various validation setups. All values are percentages and the best results are indicated in bold.\n\n[Figure 3: AUROC (%) of Baseline / ODIN / Mahalanobis under (a) a small number of training data (5K-50K) and (b) training data with random labels (0%-40%), for SVHN and TinyImageNet as OOD]\n\nFigure 3: Comparison of AUROC (%) under extreme scenarios: (a) small number of training data, where the x-axis represents the number of training data; (b) random labels assigned to training data, where the x-axis represents the percentage of training data with random labels. 
Our method outperforms the baseline and ODIN for all tested cases. In particular, our method improves the TNR, i.e., the fraction of detected LSUN samples, compared to ODIN: 41.2% → 91.4% using DenseNet, when 95% of CIFAR-100 samples are correctly detected.\n\nComparison of robustness. In order to evaluate the robustness of our method, we measure the detection performance when all hyperparameters are tuned using only in-distribution and adversarial samples generated by FGSM [10]. As shown in the right-hand column of Table 2, ODIN performs poorly compared to the baseline method in some cases (e.g., DenseNet trained on SVHN), while our method still outperforms the baseline and ODIN consistently. We remark that our method validated without OOD samples, using adversarial ones instead, even outperforms ODIN validated with OOD. We also verify the robustness of our method under various training setups. Since our method utilizes the empirical class means and covariance of training samples, there is a caveat that it can be affected by the properties of the training data. In order to verify the robustness, we measure the detection performance when we train ResNet on the CIFAR-10 dataset while varying the number of training data and assigning random labels to training data. As shown in Figure 3, our method (blue bar) maintains high detection performance even for a small number of training data or noisy labels, while the baseline (red bar) and ODIN (yellow bar) do not. Finally, we remark that our method using a softmax neural classifier trained by the standard cross-entropy loss typically outperforms ODIN using a softmax neural classifier trained by the confidence loss [20], which involves jointly training a generator and a classifier to calibrate the posterior distribution, even though training such a model is computationally more expensive (see the supplementary material for more details).\n\n3.2 Detecting adversarial samples\n\nSetup. 
For the problem of detecting adversarial samples, we train DenseNet and ResNet for classifying the CIFAR-10, CIFAR-100 and SVHN datasets, and the corresponding test dataset is used as the positive samples to measure the performance. We use adversarial images generated by the following attack methods as the negative samples: FGSM [10], BIM [16], DeepFool [26] and CW [3], where detailed explanations can be found in the supplementary material. For comparison, we use a logistic regression detector based on combinations of kernel density (KD) [7] and predictive uncertainty (PU), i.e., the maximum value of the posterior distribution. We also compare with the state-of-the-art local intrinsic dimensionality (LID) score [22]. Following strategies similar to [7, 22], we randomly choose 10% of the original test samples for training the logistic regression detectors and use the remaining test samples for evaluation. All hyperparameters are tuned using nested cross-validation within the training set.

| Model | Dataset | Score | FGSM | BIM | DeepFool | CW | FGSM (seen) | BIM (unseen) | DeepFool (unseen) | CW (unseen) |
|---|---|---|---|---|---|---|---|---|---|---|
| DenseNet | CIFAR-10 | KD+PU [7] | 85.96 | 96.80 | 68.05 | 58.72 | 85.96 | 3.10 | 68.34 | 53.21 |
| DenseNet | CIFAR-10 | LID [22] | 98.20 | 99.74 | **85.14** | 80.05 | 98.20 | 94.55 | 70.86 | 71.50 |
| DenseNet | CIFAR-10 | Mahalanobis (ours) | **99.94** | **99.78** | 83.41 | **87.31** | **99.94** | **99.51** | **83.42** | **87.95** |
| DenseNet | CIFAR-100 | KD+PU [7] | 90.13 | 89.69 | 68.29 | 57.51 | 90.13 | 66.86 | 65.30 | 58.08 |
| DenseNet | CIFAR-100 | LID [22] | 99.35 | 98.17 | 70.17 | 73.37 | 99.35 | 68.62 | 69.68 | 72.36 |
| DenseNet | CIFAR-100 | Mahalanobis (ours) | **99.86** | **99.17** | **77.57** | **87.05** | **99.86** | **98.27** | **75.63** | **86.20** |
| DenseNet | SVHN | KD+PU [7] | 86.95 | 82.06 | 89.51 | 85.68 | 86.95 | 83.28 | 84.38 | 82.94 |
| DenseNet | SVHN | LID [22] | 99.35 | 94.87 | 91.79 | 94.70 | 99.35 | 92.21 | 80.14 | 85.09 |
| DenseNet | SVHN | Mahalanobis (ours) | **99.85** | **99.28** | **95.10** | **97.03** | **99.85** | **99.12** | **93.47** | **96.95** |
| ResNet | CIFAR-10 | KD+PU [7] | 81.21 | 82.28 | 81.07 | 55.93 | 83.51 | 16.16 | 76.80 | 56.30 |
| ResNet | CIFAR-10 | LID [22] | 99.69 | 96.28 | 88.51 | 82.23 | 99.69 | 95.38 | 71.86 | 77.53 |
| ResNet | CIFAR-10 | Mahalanobis (ours) | **99.94** | **99.57** | **91.57** | **95.84** | **99.94** | **98.91** | **78.06** | **93.90** |
| ResNet | CIFAR-100 | KD+PU [7] | 89.90 | 83.67 | 80.22 | 77.37 | 89.90 | 68.85 | 57.78 | 73.72 |
| ResNet | CIFAR-100 | LID [22] | 98.73 | 96.89 | 71.95 | 78.67 | 98.73 | 55.82 | 63.15 | 75.03 |
| ResNet | CIFAR-100 | Mahalanobis (ours) | **99.77** | **96.90** | **85.26** | **91.77** | **99.77** | **96.38** | **81.95** | **90.96** |
| ResNet | SVHN | KD+PU [7] | 82.67 | 66.19 | 89.71 | 76.57 | 82.67 | 43.21 | **84.30** | 67.85 |
| ResNet | SVHN | LID [22] | 97.86 | 90.74 | 92.40 | 88.24 | 97.86 | 84.88 | 67.28 | 76.58 |
| ResNet | SVHN | Mahalanobis (ours) | **99.62** | **97.15** | **95.73** | **92.15** | **99.62** | **95.39** | 72.20 | **86.73** |

Table 3: Comparison of AUROC (%) under various validation setups. The first four attack columns correspond to detection of known attacks; the last four to detection of unknown attacks, where FGSM samples denoted by "seen" are used for validation. For our method, we use both feature ensemble and input pre-processing. The best results are indicated in bold.

Comparison with LID and generalization analysis. The left-hand column of Table 3 reports the AUROC scores of the logistic regression detectors for all normal and adversarial pairs. One can note that the proposed method outperforms all tested methods in most cases. In particular, ours improves the AUROC of LID from 82.2% to 95.8% when we detect CW samples using ResNet trained on the CIFAR-10 dataset. Similar to [22], we also evaluate whether the proposed method, tuned on a simple attack, can generalize to detect other, more complex attacks. To this end, we measure the detection performance when we train the logistic regression detector using samples generated by FGSM. As shown in the right-hand column of Table 3, our method trained on FGSM can accurately detect much more complex attacks such as BIM, DeepFool and CW. Even though LID can also generalize well, our method still outperforms it in most cases. A natural question that arises is whether LID can be useful in detecting OOD samples.
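The detector-training protocol described above (fit a logistic regression on a small held-in fraction of scores, then report AUROC on the rest) can be sketched as follows. This is an illustrative, NumPy-only stand-in with hypothetical function names: it assumes per-sample vectors of layer-wise confidence scores have already been computed, and uses a plain gradient-descent logistic fit rather than any particular library's solver.

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, steps=3000):
    """Plain gradient-descent logistic regression, standing in for the
    regression detector trained on layer-wise confidence scores."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(normal)
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * (p - y).mean()
    return w, b

def detector_auroc(normal_scores, adv_scores, train_frac=0.1, seed=0):
    """Train the detector on a random train_frac of samples (10%, as in the
    protocol above) and report AUROC on the held-out remainder."""
    X = np.vstack([normal_scores, adv_scores])
    y = np.concatenate([np.ones(len(normal_scores)), np.zeros(len(adv_scores))])
    idx = np.random.default_rng(seed).permutation(len(X))
    n = int(train_frac * len(X))
    w, b = fit_logistic(X[idx[:n]], y[idx[:n]])
    s, yt = X[idx[n:]] @ w + b, y[idx[n:]]
    pos, neg = s[yt == 1][:, None], s[yt == 0][None, :]
    # AUROC = probability that a normal sample outranks an adversarial one
    return (pos > neg).mean() + 0.5 * (pos == neg).mean()
```

The rank-based AUROC at the end avoids committing to any single detection threshold, matching how the scores in Table 3 are reported.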
We indeed compare the performance of our method with that of LID in the supplementary material, where our method still outperforms LID in all tested cases.

3.3 Class-incremental learning

Setup. For the task of class-incremental learning, we train ResNet with 34 layers for classifying CIFAR-100 and downsampled ImageNet [4]. As described in Section 2.3, we assume that a classifier is pre-trained on a certain number of base classes, and datasets for new classes are incrementally provided one by one. Specifically, we test two different scenarios: in the first scenario, half of the CIFAR-100 classes are base classes and the rest are new classes; in the second scenario, all CIFAR-100 classes are base classes and 100 ImageNet classes are new classes. Each scenario is tested five times with randomly generated class splits, and the results are averaged. For comparison, we consider a softmax classifier, which is fine-tuned whenever new class data come in, and a Euclidean classifier [25], which accommodates a new class by only computing the class mean.
For the softmax classifier, we only update the softmax layer to achieve near-zero-cost training [25], and follow the memory management of Rebuffi & Kolesnikov [29]: a small number of samples from old classes are kept in a limited memory, whose size is matched with that for keeping the parameters of the Mahalanobis distance-based classifier. Namely, the number of old exemplars kept for training the softmax classifier is chosen as the sum of the number of learned classes and the dimension (512 in our experiments) of the hidden features.

[Figure 4: Experimental results of class-incremental learning on CIFAR-100 and ImageNet datasets. (a) Base: half of CIFAR-100 / New: the other half. (b) Base: CIFAR-100 / New: ImageNet. In each experiment, we report (left) AUC with respect to the number of learned classes and (right) the base-new class accuracy curve after the last new class is added, comparing Softmax, Euclidean, and Mahalanobis (ours).]
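To make concrete why the memory footprint reduces to the class means plus one shared precision matrix, the classification rule can be sketched as follows. This is an illustrative minimal version with hypothetical names; in the paper, the features come from the pre-trained ResNet and the shared precision is estimated on the base classes.

```python
import numpy as np

class MahalanobisIncrementalClassifier:
    """Distance-based rule for class-incremental learning: each class is
    summarized by its feature mean, with one precision matrix shared across
    classes; a new class is accommodated by storing its mean only, without
    retraining the deep model."""

    def __init__(self, precision):
        self.precision = precision   # shared inverse covariance, e.g. from base classes
        self.means = []              # one feature mean per learned class

    def add_class(self, class_features):
        # incremental step: only the new class's empirical mean is computed
        self.means.append(class_features.mean(axis=0))

    def predict(self, x):
        diffs = np.stack(self.means) - x
        dists = np.einsum('cd,de,ce->c', diffs, self.precision, diffs)
        return int(np.argmin(dists))   # closest class mean in Mahalanobis distance
```

Adding a class costs one pass over its features and one stored vector, which is why the exemplar memory for the softmax baseline is sized to match these parameters.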
For evaluation, similar to [18], we first draw the base-new class accuracy curve by adjusting an additional bias added to the new class scores, and measure the area under the curve (AUC), since simply averaging base and new class accuracies may measure the performance between base and new classes in an imbalanced way.

Comparison with other classifiers. Figure 4 compares the incremental learning performance of the methods in terms of AUC in the two scenarios mentioned above. In each sub-figure, we draw the AUC with respect to the number of learned classes (left) and the base-new class accuracy curve after the last new class is added (right). Our proposed Mahalanobis distance-based classifier outperforms the other methods by a significant margin as the number of new classes increases, although there is a crossing in the small regime of the right plot of Figure 4(b) (due to the catastrophic forgetting issue). In particular, after all new classes are added, the AUC of our proposed method is 40.0% (22.1%), which is better than the 32.7% (15.6%) of the softmax classifier and the 32.9% (17.1%) of the Euclidean distance classifier in the first (second) experiment. We also report experimental results in the supplementary material for the case when the CIFAR-100 classes are base classes and the CIFAR-10 classes are new classes, where the overall trend is similar. These results additionally demonstrate the superiority of our confidence score compared to other plausible ones.

4 Conclusion

In this paper, we propose a simple yet effective method for detecting abnormal test samples, including both out-of-distribution and adversarial ones. In essence, our main idea is to induce a generative classifier under the LDA assumption and to define a new confidence score based on it.
With calibration techniques such as input pre-processing and feature ensemble, our method performs very strongly across multiple tasks: detecting out-of-distribution samples, detecting adversarial attacks and class-incremental learning. We also found that our proposed method is more robust to the choice of its hyperparameters as well as in extreme scenarios, e.g., when the training dataset has noisy, random labels or a small number of data samples. We believe that our approach has the potential to be applied to many other related machine learning tasks, e.g., active learning [8], ensemble learning [19] and few-shot learning [31].

Acknowledgements

This work was supported in part by the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. R0132-15-1005, Content visual browsing technology in the online and offline environments), the National Research Council of Science & Technology (NST) grant funded by the Korea government (MSIP) (No. CRC-15-05-ETRI), the DARPA Explainable AI (XAI) program #313498, a Sloan Research Fellowship, and a Kwanjeong Educational Foundation Scholarship.

References

[1] Amodei, Dario, Ananthanarayanan, Sundaram, Anubhai, Rishita, Bai, Jingliang, Battenberg, Eric, Case, Carl, Casper, Jared, Catanzaro, Bryan, Cheng, Qiang, Chen, Guoliang, et al. Deep speech 2: End-to-end speech recognition in English and Mandarin. In ICML, 2016.

[2] Amodei, Dario, Olah, Chris, Steinhardt, Jacob, Christiano, Paul, Schulman, John, and Mané, Dan. Concrete problems in AI safety. arXiv preprint arXiv:1606.06565, 2016.

[3] Carlini, Nicholas and Wagner, David. Adversarial examples are not easily detected: Bypassing ten detection methods. In ACM Workshop on AISec, 2017.

[4] Chrabaszcz, Patryk, Loshchilov, Ilya, and Hutter, Frank. A downsampled variant of ImageNet as an alternative to the CIFAR datasets.
arXiv preprint arXiv:1707.08819, 2017.

[5] Deng, Jia, Dong, Wei, Socher, Richard, Li, Li-Jia, Li, Kai, and Fei-Fei, Li. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.

[6] Evtimov, Ivan, Eykholt, Kevin, Fernandes, Earlence, Kohno, Tadayoshi, Li, Bo, Prakash, Atul, Rahmati, Amir, and Song, Dawn. Robust physical-world attacks on machine learning models. In CVPR, 2018.

[7] Feinman, Reuben, Curtin, Ryan R, Shintre, Saurabh, and Gardner, Andrew B. Detecting adversarial samples from artifacts. arXiv preprint arXiv:1703.00410, 2017.

[8] Gal, Yarin, Islam, Riashat, and Ghahramani, Zoubin. Deep Bayesian active learning with image data. In ICML, 2017.

[9] Girshick, Ross. Fast R-CNN. In ICCV, 2015.

[10] Goodfellow, Ian J, Shlens, Jonathon, and Szegedy, Christian. Explaining and harnessing adversarial examples. In ICLR, 2015.

[11] Guo, Chuan, Rana, Mayank, Cissé, Moustapha, and van der Maaten, Laurens. Countering adversarial images using input transformations. arXiv preprint arXiv:1711.00117, 2017.

[12] He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Deep residual learning for image recognition. In CVPR, 2016.

[13] Hendrycks, Dan and Gimpel, Kevin. A baseline for detecting misclassified and out-of-distribution examples in neural networks. In ICLR, 2017.

[14] Huang, Gao and Liu, Zhuang. Densely connected convolutional networks. In CVPR, 2017.

[15] Krizhevsky, Alex and Hinton, Geoffrey. Learning multiple layers of features from tiny images. 2009.

[16] Kurakin, Alexey, Goodfellow, Ian, and Bengio, Samy. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533, 2016.

[17] Lasserre, Julia A, Bishop, Christopher M, and Minka, Thomas P. Principled hybrids of generative and discriminative models. In CVPR, 2006.

[18] Lee, Kibok, Lee, Kimin, Min, Kyle, Zhang, Yuting, Shin, Jinwoo, and Lee, Honglak. Hierarchical novelty detection for visual object recognition. In CVPR, 2018.

[19] Lee, Kimin, Hwang, Changho, Park, KyoungSoo, and Shin, Jinwoo. Confident multiple choice learning. In ICML, 2017.

[20] Lee, Kimin, Lee, Honglak, Lee, Kibok, and Shin, Jinwoo. Training confidence-calibrated classifiers for detecting out-of-distribution samples. In ICLR, 2018.

[21] Liang, Shiyu, Li, Yixuan, and Srikant, R. Principled detection of out-of-distribution examples in neural networks. In ICLR, 2018.

[22] Ma, Xingjun, Li, Bo, Wang, Yisen, Erfani, Sarah M, Wijewickrema, Sudanthi, Houle, Michael E, Schoenebeck, Grant, Song, Dawn, and Bailey, James. Characterizing adversarial subspaces using local intrinsic dimensionality. In ICLR, 2018.

[23] Maaten, Laurens van der and Hinton, Geoffrey. Visualizing data using t-SNE. Journal of Machine Learning Research, 2008.

[24] McCloskey, Michael and Cohen, Neal J. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation. Elsevier, 1989.

[25] Mensink, Thomas, Verbeek, Jakob, Perronnin, Florent, and Csurka, Gabriela. Distance-based image classification: Generalizing to new classes at near-zero cost. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013.

[26] Moosavi-Dezfooli, Seyed-Mohsen, Fawzi, Alhussein, and Frossard, Pascal. DeepFool: a simple and accurate method to fool deep neural networks. In CVPR, 2016.

[27] Murphy, Kevin P. Machine learning: a probabilistic perspective. 2012.

[28] Netzer, Yuval, Wang, Tao, Coates, Adam, Bissacco, Alessandro, Wu, Bo, and Ng, Andrew Y. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop, 2011.

[29] Rebuffi, Sylvestre-Alvise and Kolesnikov, Alexander. iCaRL: Incremental classifier and representation learning.
In CVPR, 2017.

[30] Sharif, Mahmood, Bhagavatula, Sruti, Bauer, Lujo, and Reiter, Michael K. Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition. In ACM SIGSAC, 2016.

[31] Vinyals, Oriol, Blundell, Charles, Lillicrap, Tim, Wierstra, Daan, et al. Matching networks for one shot learning. In NIPS, 2016.

[32] Yu, Fisher, Seff, Ari, Zhang, Yinda, Song, Shuran, Funkhouser, Thomas, and Xiao, Jianxiong. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365, 2015.