We consider feed-forward neural networks with one non-linear hidden layer and linear output units. The transfer functions in the hidden layer are either bell-shaped or sigmoid. In the bell-shaped case, we show how Bernstein polynomials on the one hand and the theory of the heat equation on the other are relevant for understanding the properties of the corresponding networks. In particular, these techniques yield simple proofs of universal approximation properties, i.e., of the fact that any reasonable function can be approximated to any degree of precision by a linear combination of bell-shaped functions. In addition, in this framework the problem of learning is equivalent to the problem of reversing the time course of a diffusion process. The results obtained in the bell-shaped case can then be applied to the case of sigmoid transfer functions in the hidden layer, yielding similar universality results. A conjecture related to the problem of generalization is briefly examined.
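
As a rough illustration of the Bernstein-polynomial viewpoint, the sketch below approximates a continuous function on [0, 1] by a linear combination of the bell-shaped Bernstein basis functions, with the sampled values f(k/n) playing the role of output weights. It is only a minimal sketch of the classical Bernstein construction, not the construction used in the paper; the function names (`bernstein_basis`, `bernstein_approx`, `max_error`) and the test function `target` are hypothetical choices made for this example.

```python
from math import comb


def bernstein_basis(n, k, x):
    """Bell-shaped basis function C(n, k) * x^k * (1 - x)^(n - k) on [0, 1]."""
    return comb(n, k) * x**k * (1 - x) ** (n - k)


def bernstein_approx(f, n, x):
    """Degree-n Bernstein polynomial of f at x: a weighted sum of basis functions.

    The weights f(k / n) act like the linear output weights of a network
    whose n + 1 hidden units compute the bell-shaped basis functions.
    """
    return sum(f(k / n) * bernstein_basis(n, k, x) for k in range(n + 1))


def max_error(f, n, grid_size=201):
    """Largest absolute error of the approximation on a uniform grid of [0, 1]."""
    xs = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(f(x) - bernstein_approx(f, n, x)) for x in xs)


def target(x):
    """Hypothetical test function: continuous, with a kink at x = 0.5."""
    return abs(x - 0.5)


if __name__ == "__main__":
    # The error should shrink as the number of bell-shaped units grows.
    for n in (10, 100):
        print(n, max_error(target, n))
```

The approximation error decreases as n grows, although slowly for a non-smooth target such as this one; the example is meant to show the form of the approximation, not an efficient way to fit a network.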