Part of Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
Quentin Mérigot, Filippo Santambrogio, Clément SARRAZIN
Several issues in machine learning and inverse problems require generating discrete data, as if sampled from a model probability distribution. A common way to do so relies on the construction of a uniform probability distribution over a set of $N$ points which minimizes the Wasserstein distance to the model distribution. This minimization problem, where the unknowns are the positions of the atoms, is non-convex. Yet, in most cases, a suitably adjusted version of Lloyd's algorithm, in which Voronoi cells are replaced by Power cells, leads to configurations with small Wasserstein error. This is surprising because the problem is non-convex and, moreover, admits spurious critical points. We provide explicit upper bounds for the convergence speed of this Lloyd-type algorithm, starting from a cloud of points sufficiently far from each other. These bounds already hold after one step of the iteration procedure, and similar bounds can be deduced for the corresponding gradient descent. They naturally lead to a sort of Polyak-Łojasiewicz inequality for the Wasserstein distance cost, with an error term depending on the distances between Dirac masses in the discrete distribution.
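To make the Lloyd-type iteration concrete, here is a minimal one-dimensional sketch. It is not the authors' implementation. It assumes the model distribution is represented by a finite list of samples whose size is a multiple of $N$, and it relies on the fact that in one dimension the optimal transport plan for the quadratic cost is monotone, so each Power cell reduces to a block of consecutive sorted samples carrying mass $1/N$. The function names `transport_blocks`, `w2_squared`, and `lloyd_step` are hypothetical, chosen for illustration only.

```python
import random


def transport_blocks(samples, atoms):
    """Assign to each atom its 1D 'Power cell': a block of sorted samples.

    Assumes len(samples) is a multiple of len(atoms). The atom of rank k
    (in increasing order) receives the k-th block of sorted samples, which
    is the monotone (optimal) transport plan in one dimension.
    """
    n, m = len(atoms), len(samples)
    if m % n != 0:
        raise ValueError("number of samples must be a multiple of N")
    size = m // n
    xs = sorted(samples)
    order = sorted(range(n), key=lambda i: atoms[i])
    blocks = [None] * n
    for rank, i in enumerate(order):
        blocks[i] = xs[rank * size:(rank + 1) * size]
    return blocks


def w2_squared(samples, atoms):
    """Squared Wasserstein distance between the samples and the uniform
    measure on the atoms, computed with the monotone plan."""
    blocks = transport_blocks(samples, atoms)
    total = sum((x - y) ** 2 for y, block in zip(atoms, blocks) for x in block)
    return total / len(samples)


def lloyd_step(samples, atoms):
    """One Lloyd-type step: move each atom to the barycenter of its cell."""
    blocks = transport_blocks(samples, atoms)
    return [sum(block) / len(block) for block in blocks]


if __name__ == "__main__":
    # Model distribution: uniform on [0, 1], discretized on a regular grid.
    samples = [(j + 0.5) / 1000 for j in range(1000)]
    rng = random.Random(0)
    atoms = [rng.random() for _ in range(10)]

    print("cost before:", w2_squared(samples, atoms))
    atoms = lloyd_step(samples, atoms)
    print("cost after one step:", w2_squared(samples, atoms))
```

In this one-dimensional setting the cells do not depend on the atom positions beyond their ordering, so a single step already reaches a fixed point of the iteration. In higher dimensions the Power cells change with the atoms and must be recomputed by a semi-discrete optimal transport solver at every step, which this sketch does not attempt.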