Sun, Dec 8 through Sat, Dec 14, 2019, at the Vancouver Convention Center
Reviewers find the addition of DAE-style regularization to the trajectory optimization phase of model-based RL interesting, and they appreciate the paper's writing and execution. However, reviewers expressed concerns about the novelty of the work (a straightforward application of an existing method) and would like to see more experiments demonstrating the effectiveness of the proposed method under different dynamics models. The connection to behavior cloning and to off-policy learning in the model-free setting would also be of interest to discuss. Overall, reviewers lean toward accepting the paper, and we have therefore decided to accept it as is. Please address the reviewers' comments in your final draft.