The paper models neural network activation statistics with a Gaussian mixture model with an unknown number of components. This model is used to investigate the complexity of learned representations via the KL divergence between a maximum-entropy reference and the model posterior. The reviewers generally felt the paper made a variety of interesting observations. Congratulations on the nice work. In the final version, the authors are encouraged to read and address the reviewers' updated comments posted after the rebuttal, and to discuss https://arxiv.org/abs/2002.08791, which provides a complementary Bayesian nonparametric perspective on deep neural networks.
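For readers unfamiliar with the setup, the general idea can be sketched as follows. This is not the authors' implementation; it is a minimal illustration assuming synthetic stand-in "activations", scikit-learn's `BayesianGaussianMixture` with a Dirichlet-process prior as one way to fit a mixture with an effectively unknown number of components, and a uniform distribution over mixture weights as the maximum-entropy reference:

```python
import numpy as np
from scipy.special import rel_entr
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-in for activations from one layer: n samples x d units,
# drawn from two well-separated clusters.
activations = np.concatenate([
    rng.normal(-2.0, 0.5, size=(200, 4)),
    rng.normal(1.5, 0.8, size=(200, 4)),
])

# A Dirichlet-process prior lets the effective number of components
# be inferred from data rather than fixed in advance.
gmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    max_iter=500,
    random_state=0,
).fit(activations)

weights = gmm.weights_
# Maximum-entropy reference over the same support: uniform weights.
uniform = np.full_like(weights, 1.0 / len(weights))
# KL(posterior weights || uniform) as a rough scalar complexity score:
# it is zero when the posterior is maximally spread and grows as
# probability mass concentrates on few components.
kl = rel_entr(weights, uniform).sum()
print(f"effective components: {(weights > 1e-2).sum()}, KL: {kl:.3f}")
```

In this toy run the posterior concentrates on a couple of components, so the KL score is well above zero; the paper's actual model and divergence computation may of course differ in detail.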