This paper analyses the Contrastive Divergence algorithm for learning statistical parameters. We relate the algorithm to the stochastic approximation literature. This enables us to specify conditions under which the algorithm is guaranteed to converge to the optimal solution (with probability 1). This includes necessary and sufficient conditions for the solution to be unbiased.