This work proposes two methods for likelihood-based anomaly detection: a likelihood ratio approach that compares the in-distribution likelihood against the likelihood under a general background distribution, and a hierarchical approach that exploits the final scale of a Glow network. The authors did a great job of experimentally verifying their hypothesis and the proposed algorithms. The idea of contrasting against a general background distribution is not new per se. Nevertheless, the authors convincingly demonstrated the effectiveness of this idea in deep anomaly detection based on modern generative models. The observation that low-level features dominate the likelihood is also interesting and potentially useful for future studies. The discovery of the multi-scale difference in Glow appears to be novel and was appreciated by the reviewers. The experiments were thoroughly performed and nicely presented. Overall, the paper is well written. The identification of model bias and domain prior for likelihood-based deep anomaly detection is a timely and valuable contribution to the community.

Please consider incorporating the reviewers' comments (e.g., discussing and possibly comparing to additional related work) into your revision. The following references (and references therein) on contrastive estimation (from the pre-DL era) are also worth citing and discussing:

Ingo Steinwart, Don Hush, and Clint Scovel. A classification framework for anomaly detection. Journal of Machine Learning Research, 6:211–232, 2005.

Wolfgang Polonik. Measuring Mass Concentrations and Estimating Density Contour Clusters: An Excess Mass Approach. Annals of Statistics, 23:855–881, 1995.