{"title": "Feature Cross-Substitution in Adversarial Classification", "book": "Advances in Neural Information Processing Systems", "page_first": 2087, "page_last": 2095, "abstract": "The success of machine learning, particularly in supervised settings, has led to numerous attempts to apply it in adversarial settings such as spam and malware detection. The core challenge in this class of applications is that adversaries are not static data generators, but make a deliberate effort to evade the classifiers deployed to detect them. We investigate both the problem of modeling the objectives of such adversaries, as well as the algorithmic problem of accounting for rational, objective-driven adversaries. In particular, we demonstrate severe shortcomings of feature reduction in adversarial settings using several natural adversarial objective functions, an observation that is particularly pronounced when the adversary is able to substitute across similar features (for example, replace words with synonyms or replace letters in words). We offer a simple heuristic method for making learning more robust to feature cross-substitution attacks. We then present a more general approach based on mixed-integer linear programming with constraint generation, which implicitly trades off overfitting and feature selection in an adversarial setting using a sparse regularizer along with an evasion model. Our approach is the first method for combining an adversarial classification algorithm with a very general class of models of adversarial classifier evasion. 
We show that our algorithmic approach significantly outperforms state-of-the-art alternatives.", "full_text": "Feature Cross-Substitution in Adversarial\n\nClassi\ufb01cation\n\nBo Li and Yevgeniy Vorobeychik\n\nElectrical Engineering and Computer Science\n{bo.li.2,yevgeniy.vorobeychik}@vanderbilt.edu\n\nVanderbilt University\n\nAbstract\n\nThe success of machine learning, particularly in supervised settings, has led to\nnumerous attempts to apply it in adversarial settings such as spam and malware\ndetection. The core challenge in this class of applications is that adversaries are\nnot static data generators, but make a deliberate effort to evade the classi\ufb01ers de-\nployed to detect them. We investigate both the problem of modeling the objectives\nof such adversaries, as well as the algorithmic problem of accounting for rational,\nobjective-driven adversaries. In particular, we demonstrate severe shortcomings\nof feature reduction in adversarial settings using several natural adversarial objec-\ntive functions, an observation that is particularly pronounced when the adversary\nis able to substitute across similar features (for example, replace words with syn-\nonyms or replace letters in words). We offer a simple heuristic method for mak-\ning learning more robust to feature cross-substitution attacks. We then present\na more general approach based on mixed-integer linear programming with con-\nstraint generation, which implicitly trades off over\ufb01tting and feature selection in\nan adversarial setting using a sparse regularizer along with an evasion model. Our\napproach is the \ufb01rst method for combining an adversarial classi\ufb01cation algorithm\nwith a very general class of models of adversarial classi\ufb01er evasion. 
We show that\nour algorithmic approach signi\ufb01cantly outperforms state-of-the-art alternatives.\n\n1\n\nIntroduction\n\nThe success of machine learning has led to its widespread use as a workhorse in a wide variety of\ndomains, from text and language recognition to trading agent design. It has also made signi\ufb01cant\ninroads into security applications, such as fraud detection, computer intrusion detection, and web\nsearch [1, 2]. The use of machine (classi\ufb01cation) learning in security settings has especially piqued\nthe interest of the research community in recent years because traditional learning algorithms are\nhighly susceptible to a number of attacks [3, 4, 5, 6, 7]. The class of attacks that is of interest to us\nare evasion attacks, in which an intelligent adversary attempts to adjust their behavior so as to evade\na classi\ufb01er that is expressly designed to detect it [3, 8, 9].\nMachine learning has been an especially important tool for \ufb01ltering spam and phishing email, which\nwe treat henceforth as our canonical motivating domain. To date, there has been extensive research\ninvestigating spam and phish detection strategies using machine learning, most without considering\nadversarial modi\ufb01cation [10, 11, 12]. Failing to consider an adversary, however, exposes spam and\nphishing detection systems to evasion attacks. Typically, the predicament of adversarial evasion is\ndealt with by repeatedly re-learning the classi\ufb01er. This is a weak solution, however, since evasion\ntends to be rather quick, and re-learning is a costly task, since it requires one to label a large number\nof instances (in crowdsourced labeling, one also exposes the system to deliberate corruption of the\ntraining data). 
Therefore, several efforts have focused on proactive approaches that model the learner and adversary as players in a game in which the learner chooses a classifier or a learning algorithm, and the attacker modifies either the training or test data [13, 14, 15, 16, 8, 17, 18].\nSpam and phish detection, like many classification domains, tends to suffer from the curse of dimensionality [11]. Feature reduction is therefore standard practice, either explicitly, by pruning features which lack sufficient discriminating power, or implicitly, by using regularization, or both [19]. One of our key novel insights is that in adversarial tasks, feature selection can open the door for the adversary to evade the classification system. This metaphorical door is open particularly widely in cases where feature cross-substitution is viable. By feature cross-substitution, we mean that the adversary can accomplish essentially the same end by using one feature in place of another. Consider, for example, a typical spam detection system using a "bag-of-words" feature vector. Words which in training data are highly indicative of spam can easily be substituted for by an adversary using synonyms or through substituting characters within a word (such as replacing an "o" with a "0"). We support our insight through extensive experiments, exhibiting the potential perils of traditional means of feature selection. While our illustration of feature cross-substitution focuses on spam, we note that the phenomenon is quite general. As another example, many Unix system commands have substitutes. For example, one can scan text using "less", "more", or "cat", and copy file1 to file2 with "cp file1 file2" or "cat file1 > file2".
Thus, if one learns to detect malicious scripts without\naccounting for such equivalences, the resulting classi\ufb01er will be easy to evade.\nOur \ufb01rst proposed solution to the problem of feature reduction in adversarial classi\ufb01cation is\nequivalence-based learning, or constructing features based on feature equivalence classes, rather\nthan the underlying feature space. We show that this heuristic approach does, indeed, signi\ufb01cantly\nimprove resilience of classi\ufb01ers to adversarial evasion. Our second proposed solution is more prin-\ncipled, and takes the form of a general bi-level mixed integer linear program to solve a Stackelberg\ngame model of interactions between a learner and a collection of adversaries whose objectives are\ninferred from training data. The baseline formulation is quite intractable, and we offer two tech-\nniques for making it tractable: \ufb01rst, we cluster adversarial objectives, and second, we use constraint\ngeneration to iteratively converge upon a locally optimal solution. The principal merits of our pro-\nposed bi-level optimization approach over the state of the art are: a) it is able to capture a very\ngeneral class of adversary models, including the model proposed by Lowd and Meek [8], as well as\nour own which enables feature cross-substitution; in contrast, state-of-the-art approaches are specif-\nically tailored to their highly restrictive threat models; and b) it makes an implicit tradeoff between\nfeature selection through the use of sparse (l1) regularization and adversarial evasion (through the\nadversary model), thereby solving the problem of adversarial feature selection.\nIn summary, our contributions are:\n\n1. A new adversarial evasion model that explicitly accounts for the ability to cross-substitute\n\nfeatures (Section 3),\n\n2. an experimental demonstration of the perils of traditional feature selection (Section 4),\n\n3. a heuristic class-based learning approach (Section 5), and\n\n4. 
a bi-level optimization framework and solution methods that make a principled tradeoff\n\nbetween feature selection and adversarial evasion (Section 6).\n\n2 Problem definition\n\nThe Learner\nLet X ⊆ R^n be the feature space, with n the number of features. For a feature vector x ∈ X, we let x_i denote the ith feature. Suppose that the training set (x, y) is comprised of feature vectors x ∈ X generated according to some unknown distribution x ∼ D, with y ∈ {−1, +1} the corresponding binary labels, where the meaning of −1 is that the instance x is benign, while +1 indicates a malicious instance. The learner's task is to learn a classifier g : X → {−1, +1} to label instances as malicious or benign, using a training data set of labeled instances {(x_1, y_1), . . . , (x_m, y_m)}.\n\nThe Adversary\nWe suppose that every instance x ∼ D corresponds to a fixed label y ∈ {−1, +1}, where a label of +1 indicates that this instance x was generated by an adversary. In the context of a threat model, therefore, we take this malicious x to be an expression of revealed preferences of the adversary: that is, x is an "ideal" instance that the adversary would generate if it were not marked as malicious (e.g., filtered) by the classifier. The core question is then what alternative instance, x' ∈ X, will be generated by the adversary. Clearly, x' would need to evade the classifier g, i.e., g(x') = −1. However, this cannot be a sufficient condition: after all, the adversary is trying to accomplish some goal. This is where the ideal instance, which we denote x^A, comes in: we suppose that the ideal instance achieves the goal and consequently the adversary strives to limit deviations from it according to a cost function c(x', x^A).
Therefore, the adversary aims to solve the following optimization problem:\n\nmin_{x' ∈ X : g(x') = −1} c(x', x^A). (1)\n\nThere is, however, an additional caveat: the adversary typically only has query access to g(x), and queries are costly (they correspond to actual batches of emails being sent out, for example). Thus, we assume that the adversary has a fixed query budget, B_q. Additionally, we assume that the adversary also has a cost budget, B_c, so that if the solution to the optimization problem (1) found after making B_q queries falls above the cost budget, the adversary will use the ideal instance x^A as x', since deviations fail to satisfy the adversary's main goals.\n\nThe Game\n\nThe game between the learner and the adversary proceeds as follows:\n\n1. The learner uses training data to choose a classifier g(x).\n2. Each adversary corresponding to a malicious feature vector x uses a query-based algorithm to (approximately) solve the optimization problem (1) subject to the query and cost budget constraints.\n3. The learner's "test" error is measured using a new data set in which every malicious x ∈ X is replaced with the corresponding x' computed by the adversary in step 2.\n\n3 Modeling Feature Cross-Substitution\n\nDistance-Based Cost Functions\n\nIn one of the first adversarial classification models, Lowd and Meek [8] proposed a natural l1 distance-based cost function which penalizes deviations from the ideal feature vector x^A:\n\nc(x', x^A) = Σ_i a_i |x'_i − x^A_i|, (2)\n\nwhere a_i is the relative importance of feature i to the adversary.
All follow-up work in the adversarial classification domain has used either this cost function or variations [3, 4, 7, 20].\n\nFeature Cross-Substitution Attacks\n\nWhile distance-based cost functions seem natural models of adversarial objectives, they miss an important phenomenon: feature cross-substitution. In spam or phishing, this phenomenon is most obvious when an adversary substitutes words for their synonyms or substitutes similar-looking letters in words. As an example, consider Figure 1 (left), where some features can naturally be substituted for others without significantly changing the original content. Such substitutable features can have a similar meaning or effect (e.g., money and cash) or differ in only a few letters (e.g., clearance and claerance). The impact is that the adversary can achieve a much lower cost of transforming an ideal instance x^A using similarity-based feature substitutions than simple distance would admit.\n\nFigure 1: Left: illustration of feature substitution attacks. Right: comparison between distance-based and equivalence-based cost functions.\n\nTo model feature cross-substitution attacks, we introduce for each feature i an equivalence class of features, F_i, which includes all admissible substitutions (e.g., k-letter word modifications or synonyms), and generalize (2) to account for such cross-feature equivalence:\n\nc(x', x^A) = Σ_i min_{j ∈ F_i : x^A_j ⊕ x'_j = 1} a_i |x'_j − x^A_i|, (3)\n\nwhere ⊕ is the exclusive-or, so that x^A_j ⊕ x'_j = 1 ensures that we only substitute between different features rather than simply adding features.
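The two cost functions can be sketched as follows. This is an illustrative sketch, not the authors' code: feature vectors are binary bag-of-words indicators, the weights `a` and the equivalence classes are toy assumptions, and features whose equivalence class is unchanged are taken to contribute zero cost.

```python
def distance_cost(x_prime, x_ideal, a):
    """Distance-based cost (Eq. 2): sum of a_i * |x'_i - xA_i| over features i."""
    return sum(a[i] * abs(x_prime[i] - x_ideal[i]) for i in range(len(a)))


def equivalence_cost(x_prime, x_ideal, a, eq_class):
    """Equivalence-based cost (Eq. 3): for each feature i, charge only the
    cheapest substitution within its equivalence class F_i, restricted to
    positions j where x'_j differs from xA_j (the exclusive-or condition)."""
    total = 0.0
    for i in range(len(a)):
        candidates = [a[i] * abs(x_prime[j] - x_ideal[i])
                      for j in eq_class[i] if x_prime[j] != x_ideal[j]]
        if candidates:  # if nothing in the class changed, feature i costs 0
            total += min(candidates)
    return total
```

For instance, if the ideal instance contains "money" and the attack replaces it with the equivalent "cash", the distance-based cost counts two changed features, while the equivalence-based cost is zero.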
Figure 1 (right) shows the cost comparison between\nthe Lowd and Meek and equivalence-based cost functions under letter substitution attacks based on\nEnron email data [21], with the attacker simulated by running a variation of the Lowd and Meek\nalgorithm (see the Supplement for details), given a speci\ufb01ed number of features (see Section 4 for\nthe details about how we choose the features). The key observation is that the equivalence-based\ncost function signi\ufb01cantly reduces attack costs compared to the distance-based cost function, with\nthe difference increasing in the size of the equivalence class. The practical import of this observation\nis that the adversary will far more frequently come under cost budget when he is able to use such\nsubstitution attacks. Failure to capture this phenomenon therefore results in a threat model that\nsigni\ufb01cantly underestimates the adversary\u2019s ability to evade a classi\ufb01er.\n\n4 The Perils of Feature Reduction in Adversarial Classi\ufb01cation\n\nFeature reduction is one of the fundamental tasks in machine learning aimed at controlling over-\n\ufb01tting. The insight behind feature reduction in traditional machine learning is that there are two\nsources of classi\ufb01cation error: bias, or the inherent limitation in expressiveness of the hypothesis\nclass, and variance, or inability of a classi\ufb01er to make accurate generalizations because of over-\n\ufb01tting the training data. We now observe that in adversarial classi\ufb01cation, there is a crucial third\nsource of generalization error, introduced by adversarial evasion. Our main contribution in this sec-\ntion is to document the tradeoff between feature reduction and the ability of the adversary to evade\nthe classi\ufb01er and thereby introduce this third kind of generalization error. 
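A query-budgeted evasion loop of the kind described in Section 2 can be sketched generically as follows. This is a simplified greedy stand-in, not the Lowd and Meek variant used in the experiments (that algorithm is in the Supplement); `classify` and `cost` are caller-supplied functions, and features are assumed binary.

```python
def evade(x_ideal, classify, cost, query_budget, cost_budget):
    """Try single-feature flips of the ideal instance, keeping the cheapest
    modification the current classifier labels benign (-1). If the best
    evasion found within the query budget exceeds the cost budget, fall
    back to the ideal instance itself."""
    best, best_cost = None, float("inf")
    queries = 0
    for i in range(len(x_ideal)):
        if queries >= query_budget:
            break
        candidate = list(x_ideal)
        candidate[i] = 1 - candidate[i]   # flip one binary feature
        queries += 1                      # each classification is one query
        if classify(candidate) == -1:     # candidate evades the classifier
            c = cost(candidate, x_ideal)
            if c < best_cost:
                best, best_cost = candidate, c
    if best is not None and best_cost <= cost_budget:
        return best
    return list(x_ideal)                  # evasion too costly: use ideal
```

Note how the cost budget B_c enters: a tighter budget forces the adversary back to the detectable ideal instance, which is exactly why the cheaper equivalence-based substitutions matter.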
In addition, we show the\nimportant role that feature cross-substitution can play in this phenomenon.\nTo quantify the perils of feature reduction in adversarial classi\ufb01cation, we \ufb01rst train each classi\ufb01er\nusing a different number of features n.\nIn order to draw a uniform comparison across learning\nalgorithms and cost functions, we used an algorithm-independent means to select a subset of features\ngiven a \ufb01xed feature budget n. Speci\ufb01cally, we select the set of features in each case based on a\nscore function score(i) = |F R\u22121(i) \u2212 F R+1(i)|, where F RC(i) represents the frequency that a\nfeature i appears in instances x in class C \u2208 {\u22121, +1}. We then sort all the features i according to\nscore and select a subset of n highest ranked features. Finally, we simulate an adversary as running\nan algorithm which is a generalization of the one proposed by Lowd and Meek [8] to support our\nproposed equivalence-based cost function (see the Supplement, Section 2, for details).\nOur evaluation uses three data sets: Enron email data [21], Ling-spam data [22], and internet\nadvertisement dataset from the UCI repository [23]. The Enron data set was divided into training set\nof 3172 and a test set of 2000 emails in each of 5 folds of cross-validation, with an equal number of\nspam and non-spam instances [21]. A total of 3000 features were chosen for the complete feature\npool, and we sub-selected between 5 and 1000 of these features for our experiments. The Ling-spam\ndata set was divided into 1158 instances for training and 289 for test in cross-validation with \ufb01ve\n\n4\n\n\ftimes as much non-spam as spam, and contains 1000 features from which between 5 and 500 were\nsub-selected for the experiments. Finally, the UCI data set was divided into 476 training and 119 test\ninstances in \ufb01ve-fold cross validation, with four times as many advertisement as non-advertisement\ninstances. 
This data set contains 200 features, of which between 5 and 200 were chosen. For each data set, we compared the effect of adversarial evasion on the performance of four classification algorithms: Naive Bayes, SVM with linear and rbf kernels, and neural network classifiers.\n\nFigure 2: Effect of adversarial evasion on feature reduction strategies. (a)-(d): deterministic Naive Bayes classifier, SVM with linear kernel, SVM with rbf kernel, and neural network, respectively. The top row corresponds to distance-based and the bottom row to equivalence-based cost functions, where equivalence classes are formed using max-2-letter substitutions.\n\nThe results for the Enron data are documented in Figure 2; the others are shown in the Supplement. Consider the lowest (purple) lines in all plots, which show cross-validation error as a function of the number of features used, as the baseline comparison. Typically, there is an "optimal" number of features (the small circle), i.e., the point at which the cross-validation error rate first reaches a minimum, and traditional machine learning methods will strive to select the number of features near this point. The first key observation is that whether the adversary uses the distance- or equivalence-based cost function, there tends to be a shift of this "optimal" point to the right (the large circle): the learner should be using more features when facing a threat of adversarial evasion, despite the potential risk of overfitting. The second observation is that when a significant amount of malicious traffic is present, evasion can account for a dominant portion of the test error, shifting the error up significantly.
Third, feature cross-substitution attacks can make this error shift more dramatic, particularly as we increase the size of the equivalence class (as documented in the Supplement).\n\n5 Equivalence-Based Classification\n\nHaving documented the problems associated with feature reduction in adversarial classification, we now offer a simple heuristic solution: equivalence-based classification (EBC). The idea behind EBC is that instead of using the underlying features for learning and classification, we use equivalence classes in their place. Specifically, we partition features into equivalence classes. Then, for each equivalence class, we create a corresponding meta-feature to be used in learning. For example, if the underlying features are binary, indicating the presence of a particular word in an email, the equivalence-class meta-feature would be an indicator that some member of the class is present in the email. As another example, when features represent frequencies of word occurrences, meta-features could represent aggregate frequencies of features in the corresponding equivalence class.\n\n6 Stackelberg Game Multi-Adversary Model\n\nThe proposed equivalence-based classification method is a highly heuristic solution to the issue of adversarial feature reduction. We now offer a more principled and general approach to adversarial classification based on the game model described in Section 2. Formally, we aim to compute a Stackelberg equilibrium of the game in which the learner moves first by choosing a linear classifier g(x) = w^T x and all the attackers simultaneously and independently respond to g by choosing x' according to a query-based algorithm optimizing the cost function c(x', x^A) subject to query and cost budget constraints. Consequently, we term this approach the Stackelberg game multi-adversary model (SMA).
The optimization problem for the learner then takes the following form:\n\nmin_w α Σ_{j|yj=−1} l(−w^T x_j) + (1 − α) Σ_{j|yj=+1} l(w^T F(x_j; w)) + λ ‖w‖_1, (4)\n\nwhere l(·) is the hinge loss function and α ∈ [0, 1] trades off between the importance of false positives and false negatives. Note the addition of the l1 regularizer to make an explicit tradeoff between overfitting and resilience to adversarial evasion. Here, F(x_j; w) generically captures the adversarial decision model. In our setting, the adversary uses a query-based algorithm (which is an extension of the algorithm proposed by Lowd and Meek [8]) to approximately minimize the cost c(x', x_j) over x' : w^T x' ≤ 0, subject to budget constraints on cost and the number of queries. In order to solve the optimization problem (4) we now describe how to formulate it as a (very large) mixed-integer linear program (MILP), and then propose several heuristic methods for making it tractable. Since adversaries here correspond to feature vectors x_j which are malicious (and which we interpret as the "ideal" instances x^A of these adversaries), we henceforth refer to a given adversary by the index j.\nThe first step is to observe that the hinge loss function and ‖w‖_1 can both be easily linearized using standard methods.
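For concreteness, the learner's objective (4) can be sketched as follows under stated assumptions: hinge loss l(z) = max(0, 1 − z), a linear classifier w, and a placeholder attacker response `F` (the identity, i.e., no evasion) standing in for the query-based best response. This is illustrative code, not the MILP actually solved in the paper.

```python
def hinge(z):
    """Hinge loss l(z) = max(0, 1 - z)."""
    return max(0.0, 1.0 - z)


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def learner_objective(w, X, y, alpha, lam, F=lambda x, w: x):
    """Objective (4): alpha-weighted hinge loss on benign instances (y = -1),
    (1 - alpha)-weighted hinge loss on adversarially transformed malicious
    instances F(x; w) (y = +1), plus an l1 regularizer lam * ||w||_1."""
    benign = sum(hinge(-dot(w, x)) for x, label in zip(X, y) if label == -1)
    malicious = sum(hinge(dot(w, F(x, w))) for x, label in zip(X, y) if label == +1)
    return alpha * benign + (1 - alpha) * malicious + lam * sum(abs(wi) for wi in w)
```

In the full SMA formulation, `F` is replaced by the adversary's budgeted best response, which is exactly what the MILP below encodes with linear constraints.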
We therefore focus on the more challenging task of expressing the adversarial decision in response to a classification choice w as a collection of linear constraints.\nTo begin, let X̄ be the set of all feature vectors that an adversary can compute using a fixed query budget (this is just a conceptual tool; we will not need to know this set in practice, as shown below). The adversary's optimization problem can then be described as computing\n\nz_j = arg min_{x' ∈ X̄ : w^T x' ≤ 0} c(x', x_j)\n\nwhen the minimum is below the cost budget, and setting z_j = x_j otherwise. Now define an auxiliary matrix T in which each column corresponds to a particular attack feature vector x', which we index using variables a; thus T_ia corresponds to the value of feature i in the attack feature vector with index a. Define another auxiliary binary matrix L where L_aj = 1 iff the strategy a satisfies the budget constraint for attacker j. Next, define a matrix c where c_aj is the cost of strategy a to adversary j (computed using an arbitrary cost function; we can use either the distance- or equivalence-based cost functions, for example). Finally, let z_aj be a binary variable that selects exactly one feature vector a for adversary j. First, we must have a constraint that z_aj = 1 for exactly one strategy a: Σ_a z_aj = 1 ∀j. Now, suppose that the strategy a that is selected is the best available option for attacker j; it may be below the cost budget, in which case this is the strategy used by the adversary, or above budget, in which case x_j is used. We can calculate the resulting value of w^T F(x_j; w) using e_j = Σ_a z_aj w^T (L_aj T_a + (1 − L_aj) x_j). This expression introduces bilinear terms z_aj w^T, but since z_aj are binary these terms can be linearized using McCormick inequalities [24].
To ensure that z_aj selects the strategy which minimizes cost among all feasible options, we introduce constraints Σ_a z_aj c_aj ≤ c_a'j + M(1 − r_a'), where M is a large constant and r_a' is an indicator variable which is 1 iff w^T T_a' ≤ 0 (that is, if a' is classified as benign); the corresponding term ensures that the constraint is non-trivial only for a' which are classified benign. Finally, we calculate r_a for all a using constraints (1 − 2r_a) w^T T_a ≤ 0. While this constraint again introduces bilinear terms, these can be linearized as well since r_a are binary. The full MILP formulation is shown in Figure 3 (left).\nAs is, the resulting MILP is intractable for two reasons: first, the best response must be computed (using a set of constraints above) for each adversary j, of which there could be many, and second, we need a set of constraints for each feasible attack action (feature vector) x ∈ X̄ (which we index by a). We tackle the first problem by clustering the "ideal" attack vectors x_j into a set of 100 clusters and using the mean of each cluster as x^A for the representative attacker. This dramatically reduces the number of adversaries and, therefore, the size of the problem. To tackle the second problem we use constraint generation to iteratively add strategies a into the above program by executing the Lowd and Meek algorithm in each iteration in response to the classifier w computed in the previous iteration. In combination, these techniques allow us to scale the proposed optimization method to realistic problem instances. The full SMA algorithm is shown in Figure 3 (right).\n\nmin_{w,z,r} α Σ_{i|yi=−1} D_i + (1 − α) Σ_{i|yi=+1} S_i + λ Σ_j K_j\ns.t. ∀i: Σ_a z_i(a) = 1\n∀i: e_i = Σ_a m_i(a) (L_ai T_a + (1 − L_ai) x_i)\n∀a, i, j: −M z_i(a) ≤ m_ij(a) ≤ M z_i(a)\n∀a, i, j: w_j − M(1 − z_i(a)) ≤ m_ij(a) ≤ w_j + M(1 − z_i(a))\n∀a: Σ_j w_j T_aj ≤ 2 Σ_j T_aj y_aj\n∀a, j: −M r_a ≤ y_aj ≤ M r_a\n∀a, j: w_j − M(1 − r_a) ≤ y_aj ≤ w_j + M(1 − r_a)\n∀i: D_i = max(0, 1 + w^T x_i)\n∀i: S_i = max(0, 1 − e_i)\n∀j: K_j = max(w_j, −w_j)\n∀a, i: z_i(a), r_a ∈ {0, 1}\n\nAlgorithm 1 SMA(X)\nT = randStrats() // initial set of attacks\nX' ← cluster(X)\nw_0 ← MILP(X', T)\nw ← w_0\nwhile T changes do\n  for x^A ∈ X'_spam do\n    t = computeAttack(x^A, w)\n    T ← T ∪ t\n  end for\n  w ← MILP(X', T)\nend while\nreturn w\n\nFigure 3: Left: MILP to compute a solution to (4). Right: SMA iterative algorithm using clustering and constraint generation. The matrices L and C in the MILP can be pre-computed using the matrix of strategies and corresponding indices T in each iteration, as well as the cost budget Bc. computeAttack() is the attacker's best response (see the Supplement for details).\n\n7 Experiments\n\nIn this section we investigate the effectiveness of the two proposed methods: the equivalence-based classification heuristic (EBC) and the Stackelberg game multi-adversary model (SMA) solved using mixed-integer linear programming. As in Section 4, we consider three data sets: the Enron data, Ling-spam data, and UCI data.
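The structure of Algorithm 1 (SMA) can be sketched as follows. Here `cluster`, `solve_milp`, and `compute_attack` are caller-supplied placeholders for the clustering step, the MILP of Figure 3 (left), and the attacker's best response, respectively; this shows only the constraint-generation skeleton, not the components themselves.

```python
def sma(malicious_X, cluster, solve_milp, compute_attack, max_iters=50):
    """Alternate between solving the MILP restricted to the current attack
    set T and generating new best-response attacks, stopping when no new
    attack is produced (T unchanged), as in Algorithm 1."""
    centers = cluster(malicious_X)      # representative "ideal" adversaries
    T = []                              # growing set of attack strategies
    w = solve_milp(centers, T)          # initial classifier
    for _ in range(max_iters):
        new_attacks = [compute_attack(xA, w) for xA in centers]
        added = [t for t in new_attacks if t not in T]
        if not added:                   # no new constraints: converged
            return w
        T.extend(added)
        w = solve_milp(centers, T)      # re-solve with the enlarged T
    return w
```

Clustering bounds the number of adversary blocks in the MILP, while constraint generation keeps the attack set T small, which is what makes the otherwise intractable formulation solvable in practice.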
We draw a comparison to three baselines: 1) "traditional" machine learning algorithms (we report the results for SVM; comparisons to Naive Bayes and neural network classifiers are provided in the Supplement, Section 3), 2) the Stackelberg prediction game (SPG) algorithm with linear loss [17], and 3) SPG with logistic loss [17]. Both (2) and (3) are state-of-the-art alternative methods developed specifically for adversarial classification problems.\nOur first set of results (Figure 4) is a performance comparison of our proposed methods to the three baselines, evaluated using an adversary striving to evade the classifier, subject to query and cost budget constraints. For the Enron data, we can see, remarkably, that the equivalence-based classifier often significantly outperforms both SPG with linear and with logistic loss. On the other hand, the performance of EBC is relatively poor on Ling-spam data, although observe that even the traditional SVM classifier has a reasonably low error rate in this case. While the performance of EBC is clearly data-dependent, SMA (purple lines in Figure 4) exhibits dramatic performance improvement compared to alternatives in all instances (see the Supplement, Section 3, for extensive additional experiments, including comparisons to other classifiers and varying adversary budget constraints).\n\nFigure 4: Comparison of EBC and SMA approaches to baseline alternatives on Enron data (a), Ling-spam data (b), and UCI data (c). Top: Bc = 5, Bq = 5. Bottom: Bc = 20, Bq = 10.\n\nFigure 5 (left) looks deeper at the nature of SMA solution vectors w. Specifically, we consider how the adversary's strength, as measured by the query budget, affects the sparsity of solutions as measured by ‖w‖_0.
We can see a clear trend: as the adversary's budget increases, solutions become less sparse (only the result for Ling data is shown, but the same trend is observed for the other data sets; see the Supplement, Section 3, for details). This is to be expected in the context of our investigation of the impact that adversarial evasion has on feature reduction (Section 4): SMA automatically accounts for the tradeoff between resilience to adversarial evasion and regularization.\nFinally, Figure 5 (middle, right) considers the impact of the number of clusters used in solving the SMA problem on running time and error.\n\nFigure 5: Left: ‖w‖_0 of the SMA solution for Ling data. Middle: SMA error rates, and Right: SMA running time, as a function of the number of clusters used.\n\nThe key observation is that with relatively few (80-100) clusters we can achieve near-optimal performance, with significant savings in running time.\n\n8 Conclusions\n\nWe investigated two phenomena in the context of adversarial classification settings: classifier evasion and feature reduction, exhibiting a strong tension between them. The tension is surprising: feature/dimensionality reduction is a hallmark of practical machine learning and, indeed, is generally viewed as increasing classifier robustness. Our insight, however, is that feature selection will typically provide more room for the intelligent adversary to choose features not used in classification, but providing a near-equivalent alternative to their "ideal" attacks which would otherwise be detected. Terming this idea feature cross-substitution (i.e., the ability of the adversary to effectively use different features to achieve the same goal), we offer extensive experimental evidence that aggressive feature reduction does, indeed, weaken classification efficacy in adversarial settings. We offer two solutions to this problem.
The \ufb01rst is highly heuristic, using meta-features constructed\nusing feature equivalence classes for classi\ufb01cation. The second is a principled and general Stackel-\nberg game multi-adversary model (SMA), solved using mixed-integer linear programming. We use\nexperiments to demonstrate that the \ufb01rst solution often outperforms state-of-the-art adversarial clas-\nsi\ufb01cation methods, while SMA is signi\ufb01cantly better than all alternatives in all evaluated cases. We\nalso show that SMA in fact implicitly makes a tradeoff between feature reduction and adversarial\nevasion, with more features used in the context of stronger adversaries.\n\nAcknowledgments\n\nThis research was partially supported by Sandia National Laboratories. Sandia National Laborato-\nries is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned\nsubsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy\u2019s National Nuclear\nSecurity Administration under contract DE-AC04-94AL85000.\n\n8\n\n\fReferences\n[1] Tom Fawcett and Foster Provost. Adaptive fraud detection. Data mining and knowledge discovery,\n\n1(3):291\u2013316, 1997.\n\n[2] Matthew V Mahoney and Philip K Chan. Learning nonstationary models of normal network traf\ufb01c for de-\ntecting novel attacks. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge\ndiscovery and data mining, pages 376\u2013385. ACM, 2002.\n\n[3] Marco Barreno, Blaine Nelson, Anthony D Joseph, and JD Tygar. The security of machine learning.\n\nMachine Learning, 81(2):121\u2013148, 2010.\n\n[4] Marco Barreno, Peter L Bartlett, Fuching Jack Chi, Anthony D Joseph, Blaine Nelson, Benjamin IP\nRubinstein, Udam Saini, and J Doug Tygar. Open problems in the security of learning. In Proceedings of\nthe 1st ACM workshop on Workshop on AISec, pages 19\u201326. ACM, 2008.\n\n[5] Battista Biggio, Giorgio Fumera, and Fabio Roli. 
Security evaluation of pattern classifiers under attack. IEEE Transactions on Knowledge and Data Engineering, 26(4):984–996, 2013.

[6] Pavel Laskov and Richard Lippmann. Machine learning in adversarial environments. Machine Learning, 81(2):115–119, 2010.

[7] Blaine Nelson, Benjamin I. P. Rubinstein, Ling Huang, Anthony D. Joseph, and J. D. Tygar. Classifier evasion: Models and open problems. In Privacy and Security Issues in Data Mining and Machine Learning, pages 92–98. Springer, 2011.

[8] Daniel Lowd and Christopher Meek. Adversarial learning. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pages 641–647. ACM, 2005.

[9] Christoph Karlberger, Günther Bayler, Christopher Kruegel, and Engin Kirda. Exploiting redundancy in natural language to penetrate bayesian spam filters. WOOT, 7:1–7, 2007.

[10] Mehran Sahami, Susan Dumais, David Heckerman, and Eric Horvitz. A bayesian approach to filtering junk e-mail. In Learning for Text Categorization: Papers from the 1998 workshop, volume 62, pages 98–105, 1998.

[11] Ying Kong and Jie Zhao. Learning to filter unsolicited commercial e-mail. International Proceedings of Computer Science & Information Technology, 49, 2012.

[12] Vangelis Metsis, Ion Androutsopoulos, and Georgios Paliouras. Spam filtering with naive bayes: which naive bayes? In CEAS, pages 27–28, 2006.

[13] Nilesh Dalvi, Pedro Domingos, Sumit Sanghai, Deepak Verma, et al. Adversarial classification. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 99–108. ACM, 2004.

[14] Laurent El Ghaoui, Gert René Georges Lanckriet, Georges Natsoulis, et al. Robust classification with interval data. Computer Science Division, University of California, 2003.

[15] Wei Liu and Sanjay Chawla.
A game theoretical model for adversarial learning. In IEEE International Conference on Data Mining Workshops (ICDMW '09), pages 25–30. IEEE, 2009.

[16] Tom Fawcett. In vivo spam filtering: a challenge problem for KDD. ACM SIGKDD Explorations Newsletter, 5(2):140–148, 2003.

[17] Michael Brückner and Tobias Scheffer. Stackelberg games for adversarial prediction problems. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 547–555. ACM, 2011.

[18] Ion Androutsopoulos, Evangelos F. Magirou, and Dimitrios K. Vassilakis. A game theoretic model of spam e-mailing. In CEAS, 2005.

[19] Tiago A. Almeida, Akebo Yamakami, and Jurandy Almeida. Evaluation of approaches for dimensionality reduction applied with naive bayes anti-spam filters. In International Conference on Machine Learning and Applications (ICMLA '09), pages 517–522. IEEE, 2009.

[20] B. Nelson, B. Rubinstein, L. Huang, A. Joseph, S. Lee, S. Rao, and J. D. Tygar. Query strategies for evading convex-inducing classifiers. Journal of Machine Learning Research, 13:1293–1332, 2012.

[21] Bryan Klimt and Yiming Yang. The enron corpus: A new dataset for email classification research. In Machine Learning: ECML 2004, pages 217–226. Springer, 2004.

[22] Ion Androutsopoulos, John Koutsias, Konstantinos V. Chandrinos, George Paliouras, and Constantine D. Spyropoulos. An evaluation of naive bayesian anti-spam filtering. arXiv preprint cs/0006013, 2000.

[23] K. Bache and M. Lichman. UCI machine learning repository, 2013.

[24] Garth P. McCormick. Computability of global solutions to factorable nonconvex programs: Part I. Convex underestimating problems.
Mathematical Programming, 10(1):147–175, 1976.