{"title": "On the Power of Neural Networks for Solving Hard Problems", "book": "Neural Information Processing Systems", "page_first": 137, "page_last": 143, "abstract": null, "full_text": "On the Power of Neural Networks for Solving Hard Problems \n\nJehoshua Bruck \n\nJoseph W. Goodman \n\nInformation Systems Laboratory \n\nDepartment of Electrical Engineering \n\nStanford University \nStanford, CA 94305 \n\nAbstract \n\nThis paper deals with a neural network model in which each neuron performs a threshold logic function. An important property of the model is that it always converges to a stable state when operating in a serial mode [2,5]. This property is the basis of the potential applications of the model, such as associative memory devices and combinatorial optimization [3,6]. \nOne of the motivations for using the model to solve hard combinatorial problems is the fact that it can be implemented by optical devices and can thus operate at a higher speed than conventional electronics. \nThe main theme of this work is to investigate the power of the model for solving NP-hard problems [4,8], and to understand the relation between the speed of operation and the size of a neural network. In particular, it will be shown that for any NP-hard problem, the existence of a polynomial size network that solves it implies that NP=co-NP. Also, for the Traveling Salesman Problem (TSP), even a polynomial size network that gets an \u03b5-approximate solution does not exist unless P=NP. \n\nThe above results are of great practical interest, because it is currently possible to build neural networks which operate fast but are limited in the number of neurons. \n\n1 Background \n\nThe neural network model is a discrete time system that can be represented by a weighted and undirected graph. There is a weight attached to each edge of the graph and a threshold value attached to each node (neuron) of the graph.
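The weighted-graph representation just described can be made concrete. A minimal sketch in Python; the helper name and the tiny triangle example are illustrative, not from the paper:

```python
import numpy as np

def build_network(n, edges, thresholds):
    # Represent a network of order n by the pair (W, T):
    # W is an n x n symmetric weight matrix (one entry per undirected edge),
    # T is the vector of node thresholds.
    W = np.zeros((n, n))
    for i, j, w in edges:
        W[i, j] = w
        W[j, i] = w  # the graph is undirected, so W is symmetric
    T = np.asarray(thresholds, dtype=float)
    return W, T

# A toy network of order 3: a triangle with unit weights, zero thresholds.
W, T = build_network(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)], [0.0, 0.0, 0.0])
assert (W == W.T).all() and T.shape == (3,)
```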
\n\n\u00a9 American Institute of Physics 1988 \n\nThe order of the network is the number of nodes in the corresponding graph. Let N be a neural network of order n; then N is uniquely defined by (W, T) where: \n\n\u2022 W is an n x n symmetric matrix; Wij is equal to the weight attached to edge (i, j). \n\n\u2022 T is a vector of dimension n; Ti denotes the threshold attached to node i. \n\nEvery node (neuron) can be in one of two possible states, either 1 or -1. The state of node i at time t is denoted by Vi(t). The state of the neural network at time t is the vector V(t). \n\nThe next state of a node is computed by: \n\nVi(t + 1) = sgn(Hi(t)) = { 1 if Hi(t) >= 0; -1 otherwise }    (1) \n\nwhere \n\nHi(t) = \u2211_{j=1}^{n} Wij Vj(t) - Ti \n\nThe next state of the network, i.e. V(t + 1), is computed from the current state by performing the evaluation (1) at a subset of the nodes of the network, to be denoted by S. The modes of operation are determined by the method by which the set S is selected in each time interval. If the computation is performed at a single node in any time interval, i.e. |S| = 1, then we will say that the network is operating in a serial mode; if |S| = n, then we will say that the network is operating in a fully parallel mode. All the other cases, i.e. 1 < |S| < n, are the parallel modes of operation. \n\nGiven a problem L and an instance X \u2208 L, we would like to have a network N_X whose global maxima correspond to solutions of X, with |N_X| bounded by a polynomial in |X|, and an algorithm A_L which, given X \u2208 L, generates the description of N_X in polynomial (in |X|) time. For approximation, let \u03b5 > 0 be some fixed number. We would like to have a network N_X\u03b5 in which every local maximum is an \u03b5-approximate of the global, and the global corresponds to an optimum of X. The network N_X\u03b5 should be small, namely, |N_X\u03b5| should be bounded by a polynomial in |X|. Also, we would like to have an algorithm A_L\u03b5 such that, given an instance X \u2208 L, it generates the description for N_X\u03b5 in polynomial (in |X|) time. \nNote that in both the exact case and the approximate case we do not put any restriction on the time it takes the network to converge to a solution (it can be exponential).
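The serial mode of operation and the convergence property it enjoys can be sketched directly from rule (1). The following Python sketch is illustrative and not from the paper: the function names and the random test instance are assumptions, and the quadratic energy it tracks is the standard one for this model [2,5], which serial updates never decrease when the diagonal of W is zero.

```python
import numpy as np

def serial_run(W, T, V, rng, max_sweeps=100):
    # Serial mode (|S| = 1): evaluate rule (1) at one node per time
    # interval until no node wants to change; return the stable state.
    n = len(T)
    V = V.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(n):
            H_i = W[i] @ V - T[i]
            new = 1 if H_i >= 0 else -1  # sgn with sgn(0) = 1, as in (1)
            if new != V[i]:
                V[i] = new
                changed = True
        if not changed:
            return V  # stable: every node satisfies Vi = sgn(Hi)
    raise RuntimeError('did not stabilize')

def energy(W, T, V):
    # E(V) = (1/2) V^T W V - T^T V; with a zero diagonal, each serial
    # update changes E by (Vi' - Vi) * Hi >= 0, so E never decreases.
    return 0.5 * V @ W @ V - T @ V

rng = np.random.default_rng(0)
n = 8
A = rng.normal(size=(n, n))
W = (A + A.T) / 2               # symmetric weights
np.fill_diagonal(W, 0.0)        # zero diagonal
T = rng.normal(size=n)
V0 = rng.choice([-1.0, 1.0], size=n)
V = serial_run(W, T, V0, rng)
assert energy(W, T, V) >= energy(W, T, V0)
```

Note that the stable state reached is only a local maximum of E; nothing here bounds how many sweeps are needed, which is exactly the "time can be exponential" caveat above.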
\n\nAt this point the reader should convince himself that the above description is what he imagined as the setup for using the neural network model for solving hard problems, because that is what the following definition is about. \n\nDefinition: We will say that a neural network for solving (or finding an \u03b5-approximation of) a problem L exists if the algorithm A_L (or A_L\u03b5) which generates the description of N_X (or N_X\u03b5) exists. \n\nThe main results in the paper are summarized by the following two propositions. The first one deals with exact solutions of NP-hard problems, while the second deals with approximate solutions to TSP. \n\nProposition 1 Let L be an NP-hard problem. Then the existence of a neural network for solving L implies that NP = co-NP. \n\nProposition 2 Let \u03b5 > 0 be some fixed number. The existence of a neural network for finding an \u03b5-approximate solution to TSP implies that P = NP. \n\nBoth (P = NP) and (NP = co-NP) are believed to be false statements; hence, we cannot use the model in the way we imagine. \n\nThe key observation for proving the above propositions is the fact that a single iteration in a neural network takes time which is bounded by a polynomial in the size of the instance of the corresponding problem. The proofs of the above two propositions follow directly from known results in complexity theory and should not be considered as new results in complexity theory. \n\n3 The Proofs \n\nProof of Proposition 1: The proof follows from the definition of the classes NP and co-NP, and Lemma 1. The definitions and the lemma appear in Chapters 15 and 16 in [8] and also in Chapters 2 and 7 in [4]. \n\nLemma 1 If the complement of an NP-complete problem is in NP, then NP = co-NP. \n\nLet L be an NP-hard problem. Suppose there exists a neural network that solves L. Let L1 be an NP-complete problem. By definition, L1 can be polynomially reduced to L.
Thus, for every instance X \u2208 L1, we have a neural network such that from any of its global maxima we can efficiently recognize whether X is a 'yes' or a 'no' instance of L1. \nWe claim that we have a nondeterministic polynomial time algorithm to decide that a given instance X \u2208 L1 is a 'no' instance. Here is how we do it: for X \u2208 L1 we construct the neural network that solves it by using the reduction to L. We then nondeterministically guess a state of the network and check whether it is a local maximum (that is done in polynomial time). In case it is a local maximum, we check if the instance is a 'yes' or a 'no' instance (this is also done in polynomial time). \nThus, we have a nondeterministic polynomial time algorithm to recognize any 'no' instance of L1; that is, the complement of the problem L1 is in NP. But L1 is an NP-complete problem; hence, from Lemma 1 it follows that NP = co-NP. \u25a1 \n\nProof of Proposition 2: The result is a corollary of the results in [7]; the reader can refer to it for a more complete presentation. \nThe proof uses the fact that the Restricted Hamiltonian Circuit (RHC) problem is NP-complete. \nDefinition of RHC: Given a graph G = (V, E) and a Hamiltonian path in G, is there a Hamiltonian circuit in G? \nIt is proven in [7] that RHC is NP-complete. \nSuppose there exists a polynomial size neural network for finding an \u03b5-approximate solution to TSP. Then it can be shown that an instance X \u2208 RHC can be reduced to an instance X' \u2208 TSP, such that in the network N_X'\u03b5 the following holds: if the Hamiltonian path that is given in X corresponds to a local maximum in N_X'\u03b5, then X is a 'no' instance; else, if it does not correspond to a local maximum in N_X'\u03b5, then X is a 'yes' instance. Note that we can check for locality in polynomial time. \nHence, the existence of N_X'\u03b5 for all instances of TSP implies that we have a polynomial time algorithm for RHC. \u25a1 \n\n4 Concluding Remarks \n\n1.
In Proposition 1 we let |W| and |T| be arbitrary but bounded by a polynomial in the size of a given instance of a problem. If we assume that |W| and |T| are fixed for all instances, then a result similar to Proposition 1 can be proved without using complexity theory; this result appears in [1]. \n\n2. The network which corresponds to TSP, as suggested in [6], cannot solve the TSP with guaranteed quality. However, one should note that all the analysis in this paper is a worst case type of analysis. So, it might be that there exist networks that have good behavior on the average. \n\n3. Proposition 1 is general to all NP-hard problems, while Proposition 2 is specific to TSP. Both propositions hold for any type of network in which an iteration takes polynomial time. \n\n4. Clearly, every network has an algorithm which is equivalent to it, but an algorithm does not necessarily have a corresponding network. Thus, if we do not know of an algorithmic solution to a problem, we also will not be able to find a network which solves the problem. If one believes that the neural network model is a good model (e.g. it is amenable to implementation with optics), one should develop techniques to program the network to perform an algorithm that is known to have some guaranteed good behavior. \n\nAcknowledgement: Support of the U.S. Air Force Office of Scientific Research is gratefully acknowledged. \n\nReferences \n\n[1] Y. Abu-Mostafa, Neural Networks for Computing? in Neural Networks for Computing, edited by J. Denker (AIP Conference Proceedings no. 151, 1986). \n\n[2] J. Bruck and J. Sanz, A Study on Neural Networks, IBM Tech. Rep. RJ 5403, 1986. To appear in International Journal of Intelligent Systems, 1988. \n\n[3] J. Bruck and J. W. Goodman, A Generalized Convergence Theorem for Neural Networks and its Applications in Combinatorial Optimization, IEEE First ICNN, San Diego, June 1987. \n\n[4] M. R. Garey and D. S.
Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Company, 1979. \n\n[5] J. J. Hopfield, Neural Networks and Physical Systems with Emergent Collective Computational Abilities, Proc. Natl. Acad. Sci. USA, Vol. 79, pp. 2554-2558, 1982. \n\n[6] J. J. Hopfield and D. W. Tank, Neural Computations of Decisions in Optimization Problems, Biol. Cybern., Vol. 52, pp. 141-152, 1985. \n\n[7] C. H. Papadimitriou and K. Steiglitz, On the Complexity of Local Search for the Traveling Salesman Problem, SIAM J. on Computing, Vol. 6, No. 1, pp. 76-83, 1977. \n\n[8] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall, Inc., 1982. \n\n[9] J. C. Picard and H. D. Ratliff, Minimum Cuts and Related Problems, Networks, Vol. 5, pp. 357-370, 1974. \n", "award": [], "sourceid": 70, "authors": [{"given_name": "Jehoshua", "family_name": "Bruck", "institution": null}, {"given_name": "Joseph", "family_name": "Goodman", "institution": null}]}