{"title": "A Unified View of Piecewise Linear Neural Network Verification", "book": "Advances in Neural Information Processing Systems", "page_first": 4790, "page_last": 4799, "abstract": "The success of Deep Learning and its potential use in many safety-critical\n applications has motivated research on formal verification of Neural Network\n (NN) models. Despite the reputation of learned NN models to behave as black\n boxes and the theoretical hardness of proving their properties, researchers\n have been successful in verifying some classes of models by exploiting their\n piecewise linear structure and taking insights from formal methods such as\n Satisifiability Modulo Theory. These methods are however still far from\n scaling to realistic neural networks. To facilitate progress on this crucial\n area, we make two key contributions. First, we present a unified framework\n that encompasses previous methods. This analysis results in the identification\n of new methods that combine the strengths of multiple existing approaches,\n accomplishing a speedup of two orders of magnitude compared to the previous\n state of the art. Second, we propose a new data set of benchmarks which\n includes a collection of previously released testcases. We use the benchmark\n to provide the first experimental comparison of existing algorithms and\n identify the factors impacting the hardness of verification problems.", "full_text": "A Uni\ufb01ed View of Piecewise Linear Neural Network Veri\ufb01cation\n\nRudy Bunel\n\nUniversity of Oxford\nrudy@robots.ox.ac.uk\n\nIlker Turkaslan\n\nUniversity of Oxford\n\nilker.turkaslan@lmh.ox.ac.uk\n\nPhilip H.S. Torr\n\nUniversity of Oxford\n\nphilip.torr@eng.ox.ac.uk\n\nPushmeet Kohli\n\nDeepmind\n\npushmeet@google.com\n\nM. 
Pawan Kumar\nUniversity of Oxford\nAlan Turing Institute\n\npawan@robots.ox.ac.uk\n\nAbstract\n\nThe success of Deep Learning and its potential use in many safety-critical applica-\ntions has motivated research on formal veri\ufb01cation of Neural Network (NN) models.\nDespite the reputation of learned NN models to behave as black boxes and the\ntheoretical hardness of proving their properties, researchers have been successful in\nverifying some classes of models by exploiting their piecewise linear structure and\ntaking insights from formal methods such as Satis\ufb01ability Modulo Theory. These\nmethods are however still far from scaling to realistic neural networks. To facilitate\nprogress on this crucial area, we make two key contributions. First, we present a\nuni\ufb01ed framework that encompasses previous methods. This analysis results in\nthe identi\ufb01cation of new methods that combine the strengths of multiple existing\napproaches, accomplishing a speedup of two orders of magnitude compared to the\nprevious state of the art. Second, we propose a new data set of benchmarks which\nincludes a collection of previously released testcases. We use the benchmark to\nprovide the \ufb01rst experimental comparison of existing algorithms and identify the\nfactors impacting the hardness of veri\ufb01cation problems.\n\n1 Introduction\nDespite their success in a wide variety of applications, Deep Neural Networks have seen limited\nadoption in safety-critical settings. The main explanation for this lies in their reputation for being\nblack-boxes whose behaviour cannot be predicted. Current approaches to evaluate trained models\nmostly rely on testing using held-out data sets. However, as Edsger W. Dijkstra said [3], \u201ctesting shows\nthe presence, not the absence of bugs\u201d. 
If deep learning models are to be deployed to applications\nsuch as autonomous cars, we need to be able to verify safety-critical behaviours.\nTo this end, some researchers have tried to use formal methods. To the best of our knowledge,\nZakrzewski [20] was the \ufb01rst to propose a method to verify simple, one hidden layer neural networks.\nHowever, only recently were researchers able to work with non-trivial models by taking advantage of\nthe structure of ReLU-based networks [4, 11]. Even then, these works are not scalable to the large\nnetworks encountered in most real world problems.\nThis paper advances the \ufb01eld of NN veri\ufb01cation by making the following key contributions:\n\n1. We reframe state of the art veri\ufb01cation methods as special cases of Branch-and-Bound\noptimization, which provides us with a uni\ufb01ed framework to compare them.\n\n2. We gather a data set of test cases based on the existing literature and extend it with new\nbenchmarks. We provide the \ufb01rst experimental comparison of veri\ufb01cation methods.\n\n3. Based on this framework, we identify algorithmic improvements in the veri\ufb01cation process,\nspeci\ufb01cally in the way bounds are computed, the types of branching that are considered, as\nwell as the strategies guiding the branching. Compared to the previous state of the art, these\nimprovements lead to a speed-up of almost two orders of magnitude.\n\n32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montr\u00e9al, Canada.\n\n\fSections 2 and 3 give the speci\ufb01cation of the problem and formalise the veri\ufb01cation process. Section 4\npresents our uni\ufb01ed framework, showing that previous methods are special cases and highlighting\npotential improvements. Section 5 presents our experimental setup and Section 6 analyses the results.\n2 Problem speci\ufb01cation\nWe now specify the problem of formal veri\ufb01cation of neural networks. 
Given a network that\nimplements a function \u02c6xn = f (x0), a bounded input domain C and a property P , we want to prove\n\nx0 \u2208 C, \u02c6xn = f (x0) =\u21d2 P (\u02c6xn). (1)\n\nFor example, the property of robustness to adversarial examples in L\u221e norm around a\ntraining sample a with label ya would be encoded by using C \u225c {x0 | \u2016x0 \u2212 a\u2016\u221e \u2264 \u03b5} and\nP (\u02c6xn) = {\u2200y, \u02c6xn[ya] > \u02c6xn[y]}.\n\nIn this paper, we are going to focus on Piecewise-Linear Neural Networks (PL-NN), that is, networks\nfor which we can decompose C into a set of polyhedra Ci such that C = \u222ai Ci, and the restriction\nof f to Ci is a linear function for each i. While this prevents us from including networks that\nuse activation functions such as sigmoid or tanh, PL-NNs allow the use of linear transformations\nsuch as fully-connected or convolutional layers, pooling units such as MaxPooling and activation\nfunctions such as ReLUs. In other words, PL-NNs represent the majority of networks used in practice.\nOperations such as Batch-Normalization or Dropout also preserve piecewise linearity at test-time.\nThe properties that we are going to consider are Boolean formulas over linear inequalities. In our\nexample of robustness to adversarial examples above, the property is a conjunction of linear inequalities,\neach of which constrains the output of the original label to be greater than the output of another label.\nThe scope of this paper does not include approaches relying on additional assumptions such as\ntwice differentiability of the network [8, 20], limitation of the activation to binary values [4, 15] or\nrestriction to a single linear domain [2]. 
Since they do not provide formal guarantees, we also do not\ninclude approximate approaches relying on a limited set of perturbations [10] or on over-approximation\nmethods that potentially lead to undecidable properties [16, 19].\n3 Veri\ufb01cation Formalism\n3.1 Veri\ufb01cation as a Satis\ufb01ability problem\nThe methods we include in our comparison all leverage the piecewise-linear structure of PL-NNs\nto make the problem more tractable. They all follow the same general principle: given a property\nto prove, they attempt to discover a counterexample that would make the property false. This is\naccomplished by de\ufb01ning a set of variables corresponding to the inputs, hidden units and output of\nthe network, and the set of constraints that a counterexample would satisfy.\nTo help design a uni\ufb01ed framework, we reduce all instances of veri\ufb01cation problems to a canonical\nrepresentation. Speci\ufb01cally, the whole satis\ufb01ability problem will be transformed into a global\noptimization problem where the decision will be obtained by checking the sign of the minimum.\nIf the property to verify is a simple inequality P (\u02c6xn) \u225c cT \u02c6xn > b, it is suf\ufb01cient to add to the\nnetwork a \ufb01nal fully connected layer with one output, with a weight of c and a bias of \u2212b. If the\nglobal minimum of this network is positive, it indicates that for all \u02c6xn the original network can\noutput, we have cT \u02c6xn \u2212 b > 0 =\u21d2 cT \u02c6xn > b, and as a consequence the property is True. On the\nother hand, if the global minimum is negative, then the minimizer provides a counter-example. The\nsupplementary material shows that OR and AND clauses in the property can similarly be expressed as\nadditional layers, using MaxPooling units.\nWe can formulate any Boolean formula over linear inequalities on the output of the network\nas a sequence of additional linear and max-pooling layers. 
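As an illustration, the reduction of a simple inequality to a sign check can be sketched in a few lines of Python. This is a minimal sketch under our own conventions (a network stored as a list of (W, b) pairs with ReLU activations between layers); the helper names are illustrative and do not come from any of the solvers discussed:

```python
def append_property_layer(layers, c, b):
    """Encode the property c^T x_n > b as one extra linear layer.

    The extended network outputs c^T x_n - b, so the property holds on a
    domain exactly when the network's minimum over that domain is positive."""
    return layers + [([list(c)], [-b])]

def forward(layers, x):
    """Evaluate a stack of (W, b) fully-connected layers, with ReLU
    activations between layers (no activation after the last one)."""
    for idx, (W, bias) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + bj
             for row, bj in zip(W, bias)]
        if idx < len(layers) - 1:
            x = [max(v, 0.0) for v in x]
    return x
```

For instance, with an identity network and the property x[0] > x[1] (c = [1, -1], b = 0), the extended network is positive exactly on the inputs that satisfy the property.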
The veri\ufb01cation problem will be\nreduced to the problem of \ufb01nding whether the scalar output of the potentially modi\ufb01ed network\ncan reach a negative value. Assuming the network only contains ReLU activations between each\nlayer, the satis\ufb01ability problem to \ufb01nd a counterexample can be expressed as:\n\nl0 \u2264 x0 \u2264 u0 (2a)\n\u02c6xn \u2264 0 (2b)\n\u02c6xi+1 = Wi+1xi + bi+1 \u2200i \u2208 {0, n \u2212 1} (2c)\nxi = max (\u02c6xi, 0) \u2200i \u2208 {1, n \u2212 1}. (2d)\n\nEq. (2a) represents the constraints on the input and Eq. (2b) on the neural network output. Eq. (2c)\nencodes the linear layers of the network and Eq. (2d) the ReLU activation functions. If an assignment\nto all the values can be found, this represents a counterexample. If this problem is unsatis\ufb01able, no\ncounterexample can exist, implying that the property is True. We emphasise that we are required to\nprove that no counter-examples can exist, and not simply that none could be found.\nWhile for clarity of explanation, we have limited ourselves to the speci\ufb01c case where only ReLU\nactivation functions are used, this is not restrictive. The supplementary material contains a section\ndetailing how each method speci\ufb01cally handles MaxPooling units, as well as how to convert any\nMaxPooling operation into a combination of linear layers and ReLU activation functions.\nThe problem described in (2) is still a hard problem. The addition of the ReLU non-linearities (2d)\ntransforms a problem that would have been solvable by simple Linear Programming into an NP-hard\nproblem [11]. Converting a veri\ufb01cation problem into this canonical representation does not make\nits resolution simpler but it does provide a formalism advantage. Speci\ufb01cally, it allows us to prove\ncomplex properties, containing several OR clauses, with a single procedure rather than having to\ndecompose the desired property into separate queries as was done in previous work [11].\nOperationally, a valid strategy is to impose the constraints (2a) to (2d) and minimise the value of \u02c6xn.\nFinding the exact global minimum is not necessary for veri\ufb01cation. However, it provides a measure\nof satis\ufb01ability or unsatis\ufb01ability. If the value of the global minimum is positive, it will correspond to\nthe margin by which the property is satis\ufb01ed.\n3.2 Mixed Integer Programming formulation\nA possible way to eliminate the non-linearities is to encode them with the help of binary variables,\ntransforming the PL-NN veri\ufb01cation problem (2) into a Mixed Integer Linear Program (MIP). This\ncan be done with the use of \u201cbig-M\u201d encoding. The following encoding is from Tjeng & Tedrake\n[18]. Assuming we have access to lower and upper bounds on the values that can be taken by the\ncoordinates of \u02c6xi, which we denote li and ui, we can replace the non-linearities:\n\nxi = max (\u02c6xi, 0) =\u21d2 \u03b4i \u2208 {0, 1}hi,\nxi \u2265 0, xi \u2264 ui \u00b7 \u03b4i (3a)\nxi \u2265 \u02c6xi, xi \u2264 \u02c6xi \u2212 li \u00b7 (1 \u2212 \u03b4i) (3b)\n\nIt is easy to verify that \u03b4i[j] = 0 \u21d4 xi[j] = 0 (replacing \u03b4i[j] in Eq. (3a)) and \u03b4i[j] = 1 \u21d4 xi[j] = \u02c6xi[j]\n(replacing \u03b4i[j] in Eq. (3b)).\nBy taking advantage of the feed-forward structure of the neural network, lower and upper bounds li\nand ui can be obtained by applying interval arithmetic [9] to propagate the bounds on the inputs, one\nlayer at a time.\nThanks to this speci\ufb01c feed-forward structure of the problem, the generic, non-linear, non-convex\nproblem has been rewritten into an MIP. Optimization of MIPs is well studied and highly ef\ufb01cient\noff-the-shelf solvers exist. 
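The interval arithmetic pass that produces the bounds li and ui can be sketched as follows. This is a minimal illustration, assuming a network given as a list of (W, b) weight-matrix/bias pairs with ReLUs in between (our convention, not the paper's code):

```python
def interval_propagate(layers, l0, u0):
    """Pre-activation bounds (l_i, u_i) for each layer via interval arithmetic.

    For every output unit, the smallest (largest) value is obtained by pairing
    each positive weight with its input's lower (upper) bound and each negative
    weight with the opposite bound; the ReLU then clips both bounds at 0
    before the next layer."""
    bounds = []
    l, u = list(l0), list(u0)
    for W, b in layers:
        lo = [bj + sum(w * (lv if w >= 0 else uv)
                       for w, lv, uv in zip(row, l, u))
              for row, bj in zip(W, b)]
        hi = [bj + sum(w * (uv if w >= 0 else lv)
                       for w, lv, uv in zip(row, l, u))
              for row, bj in zip(W, b)]
        bounds.append((lo, hi))
        l = [max(v, 0.0) for v in lo]   # bounds after the ReLU
        u = [max(v, 0.0) for v in hi]
    return bounds
```

Each pair in the returned list can serve directly as the (li, ui) constants of the big-M encoding (3).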
As solving them is NP-hard, performance is going to be dependent on the\nquality of both the solver used and the encoding. We now ask the following question: how much\nef\ufb01ciency can be gained by using a bespoke solver rather than a generic one? In order to answer this,\nwe present specialised solvers for the PL-NN veri\ufb01cation task.\n4 Branch-and-Bound for Veri\ufb01cation\nAs described in Section 3.1, the veri\ufb01cation problem can be rephrased as a global optimization\nproblem. Algorithms such as Stochastic Gradient Descent are not appropriate as they have no way of\nguaranteeing whether or not a minimum is global. In this section, we present an approach to estimate\nthe global minimum, based on the Branch and Bound paradigm and show that several published\nmethods, introduced as examples of Satis\ufb01ability Modulo Theories, \ufb01t this framework.\nAlgorithm 1 describes its generic form. The input domain is repeatedly split into sub-domains (line 7),\nover which lower and upper bounds of the minimum are computed (lines 9-10). The best upper-bound\nfound so far serves as a candidate for the global minimum. Any domain whose lower bound is greater\nthan the current global upper bound can be pruned away as it cannot contain the global minimum\n(line 13, lines 15-17). By iteratively splitting the domains, it is possible to compute tighter lower\nbounds. We keep track of the global lower bound on the minimum by taking the minimum over the\nlower bounds of all sub-domains (line 19). When the global upper bound and the global lower bound\ndiffer by less than a small scalar \u03b5 (line 5), we consider that we have converged.\nAlgorithm 1 shows how to optimise and obtain the global minimum. If all that we are interested in is\nthe satis\ufb01ability problem, the procedure can be simpli\ufb01ed by initialising the global upper bound with\n0 (in line 2). 
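The loop just described can be written compactly. The sketch below is an illustrative Python rendering of the generic branch and bound procedure, with split, compute_ub and compute_lb left as problem-specific callables and a sorted list standing in for a priority queue; it is not the implementation evaluated later:

```python
import math

def branch_and_bound(domain, split, compute_ub, compute_lb, eps=1e-3):
    """Generic branch and bound: keep (lower_bound, subdomain) pairs, always
    branch on the most promising subdomain, and prune every subdomain whose
    lower bound is not below the best upper bound found so far."""
    global_ub, global_lb = math.inf, -math.inf
    doms = [(global_lb, domain)]
    while global_ub - global_lb > eps:
        doms.sort(key=lambda t: t[0])
        _, dom = doms.pop(0)                      # pick_out
        for sub in split(dom):
            dom_ub = compute_ub(sub)
            dom_lb = compute_lb(sub)
            if dom_ub < global_ub:
                global_ub = dom_ub
                # prune: drop subdomains that cannot contain the minimum
                doms = [(lb, d) for (lb, d) in doms if lb < global_ub]
            if dom_lb < global_ub:
                doms.append((dom_lb, sub))
        global_lb = min(lb for lb, _ in doms) if doms else global_ub
    return global_ub
```

For example, minimising (x - 0.3)\u00b2 over [0, 1], with interval halving as the branching rule and the value at the interval midpoint as the upper bound, converges to a value close to 0.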
Any subdomain with a lower bound greater than 0 (and therefore not eligible to contain\na counterexample) will be pruned out (by line 15). The computation of the lower bound can therefore\nbe replaced by the feasibility problem (or its relaxation) imposing the constraint that the output is\nbelow zero without changing the algorithm. If it is feasible, there might still be a counterexample and\nfurther branching is necessary. If it is infeasible, the subdomain can be pruned out. In addition, if any\nupper bound improving on 0 is found on a subdomain (line 11), it is possible to stop the algorithm as\nthis already indicates the presence of a counterexample.\n\nAlgorithm 1 Branch and Bound\n1: function BaB(net, domain, \u03b5)\n2:   global_ub \u2190 inf\n3:   global_lb \u2190 \u2212 inf\n4:   doms \u2190 [(global_lb, domain)]\n5:   while global_ub \u2212 global_lb > \u03b5 do\n6:     (_, dom) \u2190 pick_out(doms)\n7:     [subdom_1, . . . , subdom_s] \u2190 split(dom)\n8:     for i = 1 . . . s do\n9:       dom_ub \u2190 compute_UB(net, subdom_i)\n10:      dom_lb \u2190 compute_LB(net, subdom_i)\n11:      if dom_ub < global_ub then\n12:        global_ub \u2190 dom_ub\n13:        prune_domains(doms, global_ub)\n14:      end if\n15:      if dom_lb < global_ub then\n16:        doms.append((dom_lb, subdom_i))\n17:      end if\n18:    end for\n19:    global_lb \u2190 min{lb | (lb, dom) \u2208 doms}\n20:  end while\n21:  return global_ub\n22: end function\n\nThe description of the veri\ufb01cation problem as optimization and the pseudo-code of Algorithm 1\nare generic and would apply to veri\ufb01cation problems beyond the speci\ufb01c case of PL-NN. To obtain\na practical algorithm, it is necessary to specify several elements.\nA search strategy, de\ufb01ned by the pick_out function, which chooses the next domain to branch on.\nSeveral heuristics are possible, for example those based on the results of previous bound computations.\nFor satis\ufb01able problems or optimization problems, this allows the discovery of good upper bounds,\nenabling early pruning.\nA branching rule, de\ufb01ned by the split function, which takes in a domain dom and partitions it into\nsubdomains such that \u222ai subdom_i = dom and (subdom_i \u2229 subdom_j) = \u2205, \u2200i \u2260 j. This will\nde\ufb01ne the \u201cshape\u201d of the domains, which impacts the hardness of computing bounds. In addition,\nchoosing the right partition can greatly impact the quality of the resulting bounds.\nBounding methods, de\ufb01ned by the compute_{UB, LB} functions. These procedures estimate respec-\ntively upper bounds and lower bounds over the minimum output that the network net can reach over\na given input domain. We want the lower bound to be as high as possible, so that this whole domain\ncan be pruned easily. This is usually done by introducing convex relaxations of the problem and\nminimising them. On the other hand, the computed upper bound should be as small as possible, so as\nto allow pruning out other regions of the space or discovering counterexamples. As any feasible point\ncorresponds to an upper bound on the minimum, heuristic methods are suf\ufb01cient.\nWe now demonstrate how some published work in the literature can be understood as special cases of\nthe branch-and-bound framework for veri\ufb01cation.\n4.1 Reluplex\nKatz et al. [11] present a procedure named Reluplex to verify properties of neural networks containing\nlinear functions and ReLU activation units, functioning as an SMT solver using the splitting-on-demand\nframework [1]. 
The principle of Reluplex is to always maintain an assignment to all of the variables,\neven if some of the constraints are violated.\nStarting from an initial assignment, it attempts to \ufb01x some violated constraints at each step. It\nprioritises \ufb01xing linear constraints ((2a), (2c), (2b) and some relaxation of (2d)) using a simplex\nalgorithm, even if it leads to violated ReLU constraints. If no solution to this relaxed problem\ncontaining only linear constraints exists, the counterexample search is unsatis\ufb01able. Otherwise, either\nall ReLUs are respected, which generates a counterexample, or Reluplex attempts to \ufb01x one of the\nviolated ReLUs, potentially leading to newly violated linear constraints. This process is not guaranteed\nto converge, so to make progress, non-linearities that get \ufb01xed too often are split into two cases. Two\nnew problems are generated, each corresponding to one of the phases of the ReLU. In the worst\nsetting, the problem will be split completely over all possible combinations of activation patterns, at\nwhich point the sub-problems will all be simple LPs.\nThis algorithm can be mapped to the special case of branch-and-bound for satis\ufb01ability. The search\nstrategy is handled by the SMT core and to the best of our knowledge does not prioritise any domain.\nThe branching rule is implemented by the ReLU-splitting procedure: when neither the upper bound\nsearch, nor the detection of infeasibility are successful, one non-linear constraint over the j-th neuron\nof the i-th layer xi[j] = max(\u02c6xi[j], 0) is split out into two subdomains: {xi[j] = 0, \u02c6xi[j] \u2264 0} and\n{xi[j] = \u02c6xi[j], \u02c6xi[j] \u2265 0}. This de\ufb01nes the type of subdomains produced. 
The prioritisation of ReLUs\nthat have been frequently \ufb01xed is a heuristic to decide between possible partitions.\nAs Reluplex only deals with satis\ufb01ability, the analogue of the lower bound computation is an over-\napproximation of the satis\ufb01ability problem. The bounding method used is a convex relaxation,\nobtained by dropping some of the constraints. The following relaxation is applied to ReLU units for\nwhich the sign of the input is unknown (li[j] \u2264 0 and ui[j] \u2265 0):\n\nxi = max (\u02c6xi, 0) =\u21d2 xi \u2265 \u02c6xi (4a), xi \u2265 0 (4b), xi \u2264 ui. (4c)\n\nIf this relaxation is unsatis\ufb01able, this indicates that the subdomain cannot contain any counterexample\nand can be pruned out. The search for an assignment satisfying all the ReLU constraints by iteratively\nattempting to correct the violated ReLUs is a heuristic that is equivalent to the search for an upper\nbound lower than 0: success implies the end of the procedure but no guarantees can be given.\n4.2 Planet\nEhlers [6] also proposed an approach based on SMT. Unlike Reluplex, the proposed tool, named\nPlanet, operates by explicitly attempting to \ufb01nd an assignment to the phase of the non-linearities.\nReusing the notation of Section 3.2, it assigns a value of 0 or 1 to each \u03b4i[j] variable, verifying at each\nstep the feasibility of the partial assignment so as to prune infeasible partial assignments early.\nAs in Reluplex, the search strategy is not explicitly encoded and simply enumerates all the domains\nthat have not yet been pruned. The branching rule is the same as for Reluplex, as \ufb01xing the decision\nvariable \u03b4i[j] = 0 is equivalent to choosing {xi[j] = 0, \u02c6xi[j] \u2264 0} and \ufb01xing \u03b4i[j] = 1 is equivalent to\n{xi[j] = \u02c6xi[j], \u02c6xi[j] \u2265 0}. 
Note however that Planet does not include any heuristic to prioritise which\ndecision variables should be split over.\nPlanet does not include a mechanism for early termination based on a heuristic search of a feasible\npoint. For satis\ufb01able problems, only when a complete assignment is identi\ufb01ed is a solution returned.\nIn order to detect incoherent assignments, Ehlers [6] introduces a global linear approximation\nto a neural network, which is used as a bounding method to over-approximate the set of values that\neach hidden unit can take. In addition to the existing linear constraints ((2a), (2c) and (2b)), the\nnon-linear constraints are approximated by sets of linear constraints representing the non-linearities\u2019\nconvex hull. Speci\ufb01cally, ReLUs with input of unknown sign are replaced by the set of equations:\n\nxi = max (\u02c6xi, 0) =\u21d2 xi \u2265 \u02c6xi (5a), xi \u2265 0 (5b), xi[j] \u2264 ui[j] \u00b7 (\u02c6xi[j] \u2212 li[j]) / (ui[j] \u2212 li[j]) (5c)\n\nwhere xi[j] corresponds to the value of the j-th coordinate of xi. An illustration of the feasible\ndomain is provided in the supplementary material.\nCompared with the relaxation of Reluplex (4), the Planet relaxation is tighter. Speci\ufb01cally, Eq. (4a)\nand (4b) are identical to Eq. (5a) and (5b) but Eq. (5c) implies Eq. (4c). Indeed, given that \u02c6xi[j]\nis smaller than ui[j], the fraction multiplying ui[j] is necessarily smaller than 1, implying that this\nprovides a tighter bound on xi[j].\nTo use this approximation to compute better bounds than the ones given by simple interval arithmetic,\nit is possible to leverage the feed-forward structure of the neural networks and obtain bounds one\nlayer at a time. Having included all the constraints up until the i-th layer, it is possible to optimize\nover the resulting linear program and obtain bounds for all the units of the i-th layer, which in turn\nwill allow us to create the constraints (5) for the next layer.\nIn addition to the pruning obtained by the convex relaxation, both Planet and Reluplex make use of\ncon\ufb02ict analysis [14] to discover combinations of splits that cannot lead to satis\ufb01able assignments,\nallowing them to perform further pruning of the domains.\n4.3 Potential improvements\nAs can be seen, previous approaches to neural network veri\ufb01cation have relied on methodologies\ndeveloped in three communities: optimization, for the creation of upper and lower bounds; veri\ufb01cation,\nespecially SMT; and machine learning, especially the feed-forward nature of neural networks for\nthe creation of relaxations. A natural question that arises is \u201cCan other existing literature from\nthese domains be exploited to further improve neural network veri\ufb01cation?\u201d Our uni\ufb01ed branch-and-\nbound formulation makes it easy to answer this question. To illustrate its power, we now provide a\nnon-exhaustive list of suggestions to speed up veri\ufb01cation algorithms.\nBetter bounding \u2014 While the relaxation proposed by Ehlers [6] is tighter than the one used by\nReluplex, it can be improved further still. Speci\ufb01cally, after a splitting operation, on a smaller domain,\nwe can re\ufb01ne all the li, ui bounds, to obtain a tighter relaxation. We show the importance of this in\nthe experiments section with the BaB-relusplit method that performs splitting on the activations like\nPlanet but updates its approximation completely at each step.\nOne other possible area of improvement lies in the tightness of the bounds used. Equation (5) is very\nclosely related to the Mixed Integer Formulation of Equation (3). Indeed, it corresponds to level 0 of\nthe Sherali-Adams hierarchy of relaxations [17]. The proof for this statement can be found in the\nsupplementary material. Stronger relaxations could be obtained by exploring higher levels of the\nhierarchy. This would jointly constrain groups of ReLUs, rather than linearising them independently.\nBetter branching \u2014 The decision to split on the activation of the ReLU non-linearities made by\nPlanet and Reluplex is intuitive as it provides a clear set of decision variables to \ufb01x. However, it\nignores another natural branching strategy, namely, splitting the input domain. Indeed, it could be\nargued that since the functions encoded by the neural networks are piecewise linear in their input,\nthis could result in the computation of highly useful upper and lower bounds. To demonstrate this,\nwe propose the novel BaB-input algorithm: a branch-and-bound method that branches over the\ninput features of the network. Based on a domain with input constrained by Eq. (2a), the split\nfunction would return two subdomains where bounds would be identical in all dimensions except for\nthe dimension with the largest length, denoted i\u22c6. The bounds for each subdomain for dimension i\u22c6\nare given by l0[i\u22c6] \u2264 x0[i\u22c6] \u2264 (l0[i\u22c6] + u0[i\u22c6])/2 and (l0[i\u22c6] + u0[i\u22c6])/2 \u2264 x0[i\u22c6] \u2264 u0[i\u22c6]. Based on these\ntighter input bounds, tighter bounds at all layers can be re-evaluated.\nOne of the main advantages of branching over the input variables is that all subdomains generated by\nthe BaB algorithm when splitting over the input variables end up only having simple bound constraints\nover the values that each input variable can take. In order to exploit this property to the fullest, we use\nthe highly ef\ufb01cient lower bound computation approach of Kolter & Wong [13]. 
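The BaB-input split just described can be sketched concretely. This is a minimal illustration with a box domain stored as plain lower/upper coordinate lists (our convention, not the released code):

```python
def split_longest_edge(l, u):
    """Cut the box [l, u] in half along its widest input dimension i*,
    keeping the bounds of every other coordinate unchanged."""
    widths = [ub - lb for lb, ub in zip(l, u)]
    i = widths.index(max(widths))        # i*, the dimension of largest length
    mid = (l[i] + u[i]) / 2.0
    left_u = list(u)
    left_u[i] = mid                      # l[i*] <= x[i*] <= (l[i*] + u[i*]) / 2
    right_l = list(l)
    right_l[i] = mid                     # (l[i*] + u[i*]) / 2 <= x[i*] <= u[i*]
    return (list(l), left_u), (right_l, list(u))
```

Each returned subdomain again only has simple bound constraints, which is what makes fast dual bounds applicable.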
This approach was\ninitially proposed in the context of robust optimization. However, our uni\ufb01ed framework opens\nthe door for its use in veri\ufb01cation. Speci\ufb01cally, Kolter & Wong [13] identi\ufb01ed an ef\ufb01cient way of\ncomputing bounds for the type of problems we encounter, by generating a feasible solution to the\ndual of the LP generated by the Planet relaxation. While these bounds are quite loose compared to\nthe ones obtained through actual optimization, they are very fast to evaluate. We propose a smart\nbranching method BaBSB to replace the longest edge heuristic of BaB-input. For all possible splits,\nwe compute fast bounds for each of the resulting subdomains, and execute the split resulting in the\nhighest lower bound. The intuition is that despite their looseness, the fast bounds will still be useful\nin identifying the promising splits.\nAnother advantage of the branch-and-bound approach is that it is not dependent on the networks being\npiecewise linear. While methods such as Reluplex, Planet or the MIP encoding depend on the\npiecewise linearity, any type of network for which an appropriate bounding function can be found\nwill be veri\ufb01able using branch-and-bound. Recent advances on incomplete veri\ufb01cation such as the\nwork of Dvijotham et al. [5] can offer such bounds for activations such as sigmoid or hyperbolic\ntangent.\n5 Experimental setup\nThe problem of PL-NN veri\ufb01cation has been shown to be NP-complete [11]. Meaningful comparison\nbetween approaches therefore needs to be experimental.\n5.1 Methods\nThe simplest baseline we refer to is BlackBox, a direct encoding of Eq. (2) into the Gurobi solver,\nwhich will perform its own relaxation, without taking advantage of the problem\u2019s structure.\nFor the SMT based methods, Reluplex and Planet, we use the publicly available versions [7, 12].\nBoth tools are implemented in C++ and rely on the GLPK library to solve their relaxations. 
We\nwrote software to convert in both directions between the input formats of the two solvers.\nWe also evaluate the potential of using MIP solvers, based on the formulation of Eq. (3). Due to the\nlack of availability of open-sourced methods at the time of our experiments, we reimplemented the\napproach in Python, using the Gurobi MIP solver. We report results for a variant called MIPplanet,\nwhich uses bounds derived from Planet\u2019s convex relaxation rather than simple interval arithmetic. Both\nthe MIP and BlackBox methods are not treated as simple feasibility problems but are encoded to minimize the\noutput \u02c6xn of Equation (2b), with a callback interrupting the optimization as soon as a negative value\nis found. Additional discussion on encodings of the MIP problem can be found in the supplementary\nmaterial.\nIn our benchmark, we include the methods derived from our Branch and Bound analysis. Our\nimplementation faithfully follows Algorithm 1, is implemented in Python and uses Gurobi to solve\nLPs. The pick_out strategy consists of prioritising the domain that currently has the smallest lower\nbound. Upper bounds are generated by randomly sampling points on the considered domain, and\nwe use the convex approximation of Ehlers [6] to obtain lower bounds. As opposed to the approach\ntaken by Ehlers [6] of building a single approximation of the network, we rebuild the approximation\nand recompute all bounds for each sub-domain. This is motivated by the observation shown in\nFigure 1, which demonstrates the signi\ufb01cant improvements this brings, especially for deeper networks.\nFor split, BaB-input performs branching by splitting the input domain in half along its longest\nedge and BaBSB does it by splitting the input domain along the dimension improving the global\nlower bound the most according to the fast bounds of Kolter & Wong [13]. 
We also include results\nfor the BaB-relusplit variant, where the split method is based on the phase of the non-linearities,\nsimilarly to Planet.\n\n(a) Approximation on a CollisionDetection net\n\n(b) Approximation on a deep net from ACAS\n\nFigure 1: Quality of the linear approximation, depending on the size of the input domain. We plot the value of\nthe lower bound as a function of the area on which it is computed (higher is better). The domains are centered\naround the global minimum and repeatedly shrunk. Rebuilding the linear approximation completely at each step\nallows the creation of tighter lower bounds thanks to better li and ui, as opposed to using the same constraints and\nonly changing the bounds on input variables. This effect is even more signi\ufb01cant on deeper networks.\n5.2 Evaluation Criteria\nFor each of the data sets, we compare the different methods using the same protocol. We attempt to\nverify each property with a timeout of two hours, and a maximum allowed memory usage of 20GB,\non a single core of a machine with an i7-5930K CPU. We measure the time taken by the solvers to\neither prove or disprove the property. If the property is false and the search problem is therefore\nsatis\ufb01able, we expect the solver to exhibit a counterexample. If the returned input is not a valid\ncounterexample, we do not count the property as successfully proven, even if the property is indeed\nsatis\ufb01able. All code and data necessary to replicate our analysis are released.\n6 Analysis\nWe attempt to perform veri\ufb01cation on three data sets of properties and report the comparison results.\nThe dimensions of all the problems can be found in the supplementary material.\nThe CollisionDetection data set [6] attempts to predict whether two vehicles with parameterized\ntrajectories are going to collide. 
500 properties are extracted from problems arising from a binary
search to identify the size of the region around training examples in which the prediction of the
network does not change. The network used is relatively shallow, but due to the process used to
generate the properties, some lie extremely close to the decision boundary between SAT and
UNSAT. The results presented in Figure 2 therefore highlight the accuracy of the methods.
The Airborne Collision Avoidance System (ACAS) data set, as released by Katz et al. [11], is a
neural-network-based advisory system recommending horizontal manoeuvres for an aircraft in order
to avoid collisions, based on sensor measurements. Each of the five possible manoeuvres is assigned
a score by the neural network and the action with the minimum score is chosen. The 188 properties
to verify are based on specifications describing various scenarios. Due to the deeper network
involved, this data set is useful in highlighting the scalability of the various algorithms.
Existing data sets do not allow us to explore how various problem and model parameters, such
as depth, number of hidden units, and input dimensionality, affect the performance of the different
methods. Our new data set, PCAMNIST, removes this deficiency and can prove helpful in analysing future
verification approaches as well. It is generated so as to give control over the different architecture
parameters. Details of the data set construction are given in the supplementary material.
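The binary search used to generate the CollisionDetection properties can be sketched as follows. Both names are hypothetical, not from the released code: `verify` stands for any sound and complete verifier (such as one of the solvers compared here), and `prediction_radius` is an illustrative wrapper around it.

```python
def prediction_radius(verify, net, x, hi=1.0, tol=1e-4):
    """Largest radius eps (up to tolerance `tol`) such that `verify`
    proves net's prediction constant on the eps-ball around x.
    Properties generated near the returned radius lie very close to
    the SAT/UNSAT decision boundary, which makes them hard to decide."""
    lo = 0.0                       # the prediction is trivially stable at radius 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if verify(net, x, mid):    # UNSAT: no misclassification within mid
            lo = mid               # the stable region extends at least to mid
        else:                      # SAT: a counterexample exists within mid
            hi = mid
    return lo
```

With a stub oracle that reports stability exactly up to radius 0.37, `prediction_radius` converges to that value within the tolerance, halving the interval [lo, hi] at every step.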
We present
plots in Figure 4 showing the evolution of the runtimes depending on each of the architectural parameters
of the networks.

(a) CollisionDetection data set    (b) ACAS data set

Figure 2: Proportion of properties verifiable for varying time budgets, depending on the method employed.
A higher curve means that for the same time budget, more properties are solvable. All methods solve
CollisionDetection quite quickly, except reluBaB, which is much slower, and BlackBox, which produces several
incorrect counterexamples.

Table 1: Average time to explore a node for each method.

Method      Average time per node
BaBSB       1.81 s
BaB         2.11 s
reluBaB     1.69 s
reluplex    0.30 s
MIPplanet   0.017 s
planet      1.5e-3 s

(a) Properties solved for a given number of nodes to explore (log scale)

Figure 3: The trade-offs made
by the various methods differ.
Figure 3a shows how many
subdomains need to be explored
before verifying properties, while
Table 1 shows the average time
cost of exploring each subdomain.
Our methods have a higher cost
per node, but they require signif-
icantly less branching, thanks to
better bounding. Note also that
between BaBSB and BaB, the
smart branching reduces the number
of nodes to visit by an order of
magnitude.

Comparative evaluation of verification approaches — In Figure 2, on the shallow networks of
CollisionDetection, most solvers succeed on all properties in about 10s. In particular, the SMT-
inspired solvers Planet and Reluplex and the MIP solver are extremely fast.
On the deeper networks of ACAS, in Figure 2b, no errors are observed, but most methods time out on
the most challenging testcases. The best baseline is Reluplex, which reaches a 79.26% success rate at
the two-hour timeout, while our best method, BaBSB, already achieves 98.40% with a budget of one
hour.
To reach Reluplex's success rate, BaBSB requires a runtime two orders of magnitude smaller.
Impact of each improvement — To identify which changes give our method its good
performance, we perform an ablation study, measuring the impact of removing individual components of
our method. The only difference between BaBSB and BaB is the smart branching, which accounts for
a significant part of the performance gap.
Branching over the inputs rather than over the activations does not contribute much, as shown by the
small difference between BaB and reluBaB. Note, however, that branching over the inputs makes the bounding
problem similar to the one solved in robust optimization, which is what allows us to use the fast methods
of Kolter & Wong [13] for the smart branching. Even though it does not improve performance by
itself, the new type of split enables the use of smart branching.
The rest of the performance gap can be attributed to using better bounds: reluBaB significantly
outperforms planet while using the same branching strategy and the same convex relaxations. The
improvement comes from the benefits of rebuilding the approximation at each step, shown in Figure 1.
Figure 3 presents some additional analysis on a 20-property subset of the ACAS data set, showing
how the methods used impact the need for branching. Smart branching and the use of better lower
bounds heavily reduce the number of subdomains to explore.

(a) Number of inputs    (b) Layer width    (c) Margin    (d) Network depth

Figure 4: Impact of the various parameters on the runtimes of the different solvers.
The base network has 10
inputs and 4 layers of 25 hidden units, and the property to prove is True with a margin of 1000. Each plot
corresponds to a variation of one of these parameters.
In the graphs of Figure 4, the trends for all the methods are similar, which seems to indicate that
hard properties are intrinsically hard, and not just hard for a specific solver. Figure 4a shows an
expected trend: the larger the number of inputs, the harder the problem. Similarly, Figure 4b
shows, unsurprisingly, that wider networks require more time to solve, which can be explained by the
fact that they have more non-linearities. The impact of the margin, shown in Figure 4c, is also clear.
Properties that are True or False with a large satisfiability margin are easy to prove, while properties
with small satisfiability margins are significantly harder.
7 Conclusion
The improvement of formal verification of Neural Networks represents an important challenge to be
tackled before learned models can be used in safety-critical applications. By providing both a unified
framework to reason about methods and a set of empirical benchmarks to measure performance with,
we hope to contribute to progress in this direction. Our analysis of published algorithms through the
lens of Branch and Bound optimization has already resulted in significant improvements in runtime
on our benchmarks.
Its continued analysis should reveal even more efficient algorithms in the future.

8 Acknowledgments
This work was supported by ERC grant ERC-2012-AdG 321162-HELIOS, EPSRC grant Seebibyte
EP/M013774/1 and EPSRC/MURI grant EP/N019474/1. We would also like to acknowledge the
Royal Academy of Engineering and FiveAI.
References
[1] Barrett, Clark, Nieuwenhuis, Robert, Oliveras, Albert, and Tinelli, Cesare. Splitting on demand
in SAT modulo theories. International Conference on Logic for Programming Artificial
Intelligence and Reasoning, 2006.
[2] Bastani, Osbert, Ioannou, Yani, Lampropoulos, Leonidas, Vytiniotis, Dimitrios, Nori, Aditya,
and Criminisi, Antonio. Measuring neural net robustness with constraints. NIPS, 2016.
[3] Buxton, John N and Randell, Brian. Software Engineering Techniques: Report on a Conference
Sponsored by the NATO Science Committee. NATO Science Committee, 1970.
[4] Cheng, Chih-Hong, Nührenberg, Georg, and Ruess, Harald. Verification of binarized neural
networks. arXiv:1710.03107, 2017.
[5] Dvijotham, Krishnamurthy, Stanforth, Robert, Gowal, Sven, Mann, Timothy, and Kohli, Pushmeet.
A dual approach to scalable verification of deep networks. UAI, 2018.
[6] Ehlers, Ruediger. Formal verification of piece-wise linear feed-forward neural networks.
Automated Technology for Verification and Analysis, 2017.
[7] Ehlers, Ruediger. Planet. https://github.com/progirep/planet, 2017.
[8] Hein, Matthias and Andriushchenko, Maksym.
Formal guarantees on the robustness of a
classifier against adversarial manipulation. NIPS, 2017.
[9] Hickey, Timothy, Ju, Qun, and Van Emden, Maarten H. Interval arithmetic: From principles to
implementation. Journal of the ACM (JACM), 2001.
[10] Huang, Xiaowei, Kwiatkowska, Marta, Wang, Sen, and Wu, Min. Safety verification of deep
neural networks. International Conference on Computer Aided Verification, 2017.
[11] Katz, Guy, Barrett, Clark, Dill, David, Julian, Kyle, and Kochenderfer, Mykel. Reluplex: An
efficient SMT solver for verifying deep neural networks. CAV, 2017.
[12] Katz, Guy, Barrett, Clark, Dill, David, Julian, Kyle, and Kochenderfer, Mykel. Reluplex.
https://github.com/guykatzz/ReluplexCav2017, 2017.
[13] Kolter, Zico and Wong, Eric. Provable defenses against adversarial examples via the convex
outer adversarial polytope. arXiv:1711.00851, 2017.
[14] Marques-Silva, João P and Sakallah, Karem A. GRASP: A search algorithm for propositional
satisfiability. IEEE Transactions on Computers, 1999.
[15] Narodytska, Nina, Kasiviswanathan, Shiva Prasad, Ryzhyk, Leonid, Sagiv, Mooly, and Walsh,
Toby. Verifying properties of binarized deep neural networks. arXiv:1709.06662, 2017.
[16] Pulina, Luca and Tacchella, Armando. An abstraction-refinement approach to verification of
artificial neural networks. CAV, 2010.
[17] Sherali, Hanif D and Adams, Warren P. A hierarchy of relaxations and convex hull characterizations
for mixed-integer zero-one programming problems. Discrete Applied Mathematics, 1994.
[18] Tjeng, Vincent and Tedrake, Russ. Verifying neural networks with mixed integer programming.
arXiv:1711.07356, 2017.
[19] Xiang, Weiming, Tran, Hoang-Dung, and Johnson, Taylor T. Output reachable set estimation
and verification for multi-layer neural networks. arXiv:1708.03322, 2017.
[20] Zakrzewski, Radosław R.
Verification of a trained neural network accuracy. IJCNN, 2001.