Ian Goodfellow, Honglak Lee, Quoc Le, Andrew Saxe, Andrew Ng
For many computer vision applications, the ideal image feature would be invariant to multiple confounding image properties, such as illumination and viewing angle. Recently, deep architectures trained in an unsupervised manner have been proposed as an automatic method for extracting useful features. However, outside of using these learning algorithms in a classifier, they can sometimes be difficult to evaluate. In this paper, we propose a number of empirical tests that directly measure the degree to which these learned features are invariant to different image transformations. We find that deep autoencoders become invariant to increasingly complex image transformations with depth. This further justifies the use of "deep" over "shallow" representations. Our performance metrics agree with existing measures of invariance. Our evaluation metrics can also be used to evaluate future work in unsupervised deep learning, and thus help the development of future algorithms.
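To make the idea of an empirical invariance test concrete, here is a minimal sketch of one plausible flavor of such a metric. This is an illustration, not the paper's exact definition: it assumes a feature is represented by its scalar activations on a set of images and on transformed versions of those images, and scores the feature by how often it keeps firing after the transformation, given that it fired on the original. The `threshold` parameter and the function name are hypothetical choices for this sketch.

```python
def invariance_score(activations, transformed_activations, threshold=0.5):
    """Illustrative invariance test for one feature unit.

    activations             -- activations on the original images
    transformed_activations -- activations on the transformed images
                               (same order, same length)
    Returns the fraction of images on which the feature still fires
    after the transformation, among those where it fired originally.
    """
    # Indices where the feature fires on the original images.
    firing = [i for i, a in enumerate(activations) if a > threshold]
    if not firing:
        return 0.0  # feature never fires; score is undefined, report 0
    # Count how often the feature still fires after the transformation.
    still_firing = sum(
        1 for i in firing if transformed_activations[i] > threshold
    )
    return still_firing / len(firing)


# Example: the feature fires on images 0 and 2; after the transformation
# it still fires on image 0 but not on image 2, giving a score of 0.5.
score = invariance_score([0.9, 0.1, 0.8], [0.7, 0.9, 0.2])
```

A fully invariant feature would score 1.0 under this sketch; a feature that fires only under specific imaging conditions would score near 0. The paper's actual tests are defined over specific transformation sets (e.g. translations, rotations) and a calibrated firing threshold per unit.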