Part of Advances in Neural Information Processing Systems 10 (NIPS 1997)
Van H. Vu H. Vu
We demonstrate that the problem of training neural networks with small (average) squared error is computationally intractable. Con(cid:173) sider a data set of M points (Xi, Yi), i = 1,2, ... , M, where Xi are input vectors from Rd, Yi are real outputs (Yi E R). For a net- work 10 in some class F of neural networks, (11M) L~l (fO(Xi)(cid:173) Yi)2)1/2 - inlfEF(l/ M) "2:f!1 (f(Xi) - YJ2)1/2 is the (avarage) rel(cid:173) ative error occurs when one tries to fit the data set by 10. We will prove for several classes F of neural networks that achieving a rela(cid:173) tive error smaller than some fixed positive threshold (independent from the size of the data set) is NP-hard.