{"title": "Complexity Issues in Neural Computation and Learning", "book": "Advances in Neural Information Processing Systems", "page_first": 1161, "page_last": 1162, "abstract": null, "full_text": "Complexity Issues in Neural Computation and Learning \n\nV. P. Roychowdhury \nSchool of Electrical Engineering \nPurdue University \nWest Lafayette, IN 47907 \nEmail: vwani@ecn.purdue.edu \n\nK.-Y. Siu \nDept. of Electrical & Comp. Engr. \nUniversity of California at Irvine \nIrvine, CA 92717 \nEmail: siu@balboa.eng.uci.edu \n\nThe general goal of this workshop was to bring together researchers working toward developing a theoretical framework for the analysis and design of neural networks. The technical focus of the workshop was to address recent developments in understanding the capabilities and limitations of various models for neural computation and learning. The primary topics addressed the following three areas: 1) Computational complexity issues in neural networks, 2) Complexity issues in learning, and 3) Convergence and numerical properties of learning algorithms. Other topics included experimental/simulation results on neural networks, which seemed to pose some open problems in the areas of learning and generalization properties of feedforward networks. \n\nThe presentations and discussions at the workshop highlighted the interdisciplinary nature of research in neural networks. For example, several of the presentations discussed recent contributions which have applied complexity-theoretic techniques to characterize the computing power of neural networks, to design efficient neural networks, and to compare the computational capabilities of neural networks with those of conventional models for computation. Such studies, in turn, have generated considerable research interest 
among computer scientists, as evidenced by a significant number of research publications on related topics. A similar development can be observed in the area of learning as well: techniques primarily developed in the classical theory of learning are being applied to understand the generalization and learning characteristics of neural networks. In [1, 2] attempts have been made to integrate concepts from different areas and present a unified treatment of the various results on the complexity of neural computation and learning. In fact, contributions from several participants in the workshop are included in [2], and interested readers can find detailed discussions of many of the results presented at the workshop in [2]. \n\nFollowing is a brief description of the presentations, along with the names and e-mail addresses of the speakers. W. Maass (maass@igi.tu-graz.ac.at) and A. Sakurai (sakurai@hadgw92.lwd.hitachi.co.jp) made presentations on the VC-dimension and the computational power of feedforward neural networks. Many neural nets of depth 3 (or larger) with linear threshold gates have a VC-dimension that is superlinear in the number of weights of the net. The talks presented new results which establish effective upper bounds and almost tight lower bounds on the VC-dimension of feedforward networks with various activation functions, including linear threshold and sigmoidal functions. Such nonlinear lower bounds on the VC-dimension were also discussed for networks with both integer and real weights. A presentation by G. Turan (@VM.CC.PURDUE.EDU:U11557@UICVM) discussed new results on proving lower bounds on the size of circuits for computing specific Boolean functions where each gate computes a real-valued function. 
In particular, the results provide a lower bound on the size of formulas (i.e., circuits with fan-out 1) of polynomial gates computing Boolean functions in the sense of sign-representation. \n\nThe presentations on learning addressed both sample and algorithmic complexity. The talk by V. Castelli (vittorio@isl.stanford.edu) and T. Cover studied the role of labeled and unlabeled samples in pattern recognition. Let samples be chosen from two populations whose distributions are known, and let the proportion (mixing parameter) of the two classes be unknown. Assume that a training set composed of independent observations from the two classes is given, where part of the samples are classified and part are not. The talk presented new results which investigate the relative value of the labeled and unlabeled samples in reducing the probability of error of the classifier. In particular, it was shown that under the above hypotheses the relative value of labeled and unlabeled samples is proportional to the (Fisher) information they carry about the unknown mixing parameter. B. Dasgupta (dasgupta@cs.umn.edu), on the other hand, addressed the issue of the tractability of the training problem of neural networks. New results were presented showing that the training problem remains NP-complete when the activation functions are piecewise linear. \n\nThe talk by B. Hassibi (hassibi@msCClls.stanford.edu) provided a minimax interpretation of instantaneous-gradient-based learning algorithms such as LMS and backpropagation. When the underlying model is linear, it was shown that the LMS algorithm minimizes the worst-case ratio of predicted error energy to disturbance energy. When the model is nonlinear, which arises in the context of neural networks, it was shown that the backpropagation algorithm performs this minimization in a local sense. 
These results provide theoretical justification for the widely observed excellent robustness properties of the LMS and backpropagation algorithms. \n\nThe last talk by R. Caruana (caruana@GS79.SP.CS.CMU.EDU) presented a set of interesting empirical results on the learning properties of neural networks of different sizes. Some of the issues (based on empirical evidence) raised during the talk are: 1) If cross-validation is used to prevent overtraining, excess capacity rarely reduces the generalization performance of fully connected feed-forward backpropagation networks. 2) Moreover, too little capacity usually hurts generalization performance more than too much capacity. \n\nReferences \n\n[1] K.-Y. Siu, V. P. Roychowdhury, and T. Kailath. Discrete Neural Computation: A Theoretical Foundation. Englewood Cliffs, NJ: Prentice-Hall, 1994. \n\n[2] V. P. Roychowdhury, K.-Y. Siu, and A. Orlitsky, editors. Theoretical Advances in Neural Computation and Learning. Boston: Kluwer Academic Publishers, 1994. \n", "award": [], "sourceid": 844, "authors": [{"given_name": "V. P.", "family_name": "Roychowdhury", "institution": null}, {"given_name": "K.-Y.", "family_name": "Siu", "institution": null}]}