NeurIPS 2020

Self-Supervised Learning by Cross-Modal Audio-Video Clustering

Meta Review

The reviewers generally agree that this paper has great execution, a great idea, and great results. They noted the impact that self-supervised learning on video can have, an area that has been less explored than its image counterpart. The reviewers also praised the strong empirical results, which will be of high interest to the community. The clear visualizations and strong ablation experiments further support the claims in the paper. Congratulations!