Efficient Clustering Based On A Unified View Of $K$-means And Ratio-cut

Part of Advances in Neural Information Processing Systems 33 pre-proceedings (NeurIPS 2020)

Bibtex »Paper »Supplemental »

Bibtek download is not availble in the pre-proceeding


Authors

Shenfei Pei, Feiping Nie, Rong Wang, Xuelong Li

Abstract

Spectral clustering and $k$-means, both as two major traditional clustering methods, are still attracting a lot of attention, although a variety of novel clustering algorithms have been proposed in recent years. Firstly, a unified framework of $k$-means and ratio-cut is revisited, and a novel and efficient clustering algorithm is then proposed based on this framework. The time and space complexity of our method are both linear with respect to the number of samples, and are independent of the number of clusters to construct, more importantly. These properties mean that it is easily scalable and applicable to large practical problems. Extensive experiments on 12 real-world benchmark and 8 facial datasets validate the advantages of the proposed algorithms compared to the state-of-the-art clustering algorithms. In particular, over 15x and 7x speed-up can be obtained with respect to $k$-means on the synthetic dataset of 1 million samples and the benchmark dataset (CelebA) of 200k samples, respectively [GitHub].