*PROS: The paper proposes using universal quantization in both the training and testing phases, which avoids the train-test mismatch issue. The topic is useful in practice, with many applications. The rebuttal clearly states contributions that are relevant.

*CONS: One of the major motivations is that the proposed method avoids the train-test mismatch, yet in the experiments it performs worse. The paper also lacks an exploration of the bias in the gradients introduced by the approximation at eq. 19.

Meta-reviewer recommendations: The paper is borderline, but I recommend acceptance. The authors presented a very good rebuttal that clearly states the usefulness of the paper. I recommend that the authors try to explain why performance in the experiments is worse than that of approaches based on hard quantisation: Balle et al. 2017 (UN+Q). My impression is that other techniques in the literature have not been tuned as carefully as Balle's method, but this is only speculation. I also recommend that the authors take note of the reviewers' comments that the initial statement of the contributions (in the abstract/intro) should be softened, since there is no strong evidence that "universal quantization has the potential to lead to much bigger improvements in the future."