Part of Advances in Neural Information Processing Systems 23 (NIPS 2010)
Kamalika Chaudhuri, Sanjoy Dasgupta
For a density f on R^d, a high-density cluster is any connected component of {x: f(x) >= c}, for some c > 0. The set of all high-density clusters form a hierarchy called the cluster tree of f. We present a procedure for estimating the cluster tree given samples from f. We give finite-sample convergence rates for our algorithm, as well as lower bounds on the sample complexity of this estimation problem.