Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Tarik Dzanic, Karan Shah, Freddie Witherden
Advancements in deep generative models such as generative adversarial networks and variational autoencoders have resulted in the ability to generate realistic images that are visually indistinguishable from real images which raises concerns about their potential malicious usage. In this paper, we present an analysis of the high-frequency Fourier modes of real and deep network generated images and show that deep network generated images share an observable, systematic shortcoming in replicating the attributes of these high-frequency modes. Using this, we propose a novel detection method based on the frequency spectrum of the images which is able to achieve an accuracy of up to 99.2% in classifying real and deep network generated images from various GAN and VAE architectures on a dataset of 5000 images with as few as 8 training examples. Furthermore, we show the impact of image transformations such as compression, cropping, and resolution reduction on the classification accuracy and suggest a method for modifying the high-frequency attributes of deep network generated images to mimic real images.