Review for NeurIPS paper: Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies

NeurIPS 2020

Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies

Meta Review

The paper is well written and makes solid theoretical contributions. The reviewers are also satisfied with the rebuttal. Their common concern is that the paper lacks experimental evaluation.