Many belief networks have been proposed that are composed of binary units. However, for tasks such as object and speech recognition, which produce real-valued data, binary network models are usually inadequate. Independent component analysis (ICA) learns a model from real data, but the descriptive power of this model is severely limited. We begin by describing the independent factor analysis (IFA) technique, which overcomes some of the limitations of ICA. We then create a multilayer network by cascading single-layer IFA models. At each level, the IFA network extracts real-valued latent variables that are non-linear functions of the input data with a highly adaptive functional form, resulting in a hierarchical distributed representation of these data. Although exact maximum-likelihood learning of the network is intractable, we derive an algorithm that maximizes a lower bound on the likelihood, based on a variational approach.