NeurIPS 2019
Sun Dec 8th through Sat Dec 14th, 2019, at the Vancouver Convention Center
Paper ID: 3918
Title: Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning

The reviewers felt that this paper was well executed, even though the proposed approach is a fairly straightforward application of techniques from the robust MDP literature (specifically, minimax planning with appropriately defined uncertainty sets derived from a Lipschitz assumption on the non-stationary dynamics). For the final version, the authors should improve the discussion of related work on robust MDPs (e.g., "Reinforcement Learning in Robust Markov Decision Processes" by Lim et al., NIPS 2013, and references therein) and on MDPs with non-stationary transitions (e.g., "Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions" by Abbasi-Yadkori et al., NIPS 2013, and references therein).
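For context on the technique the review refers to, here is a minimal sketch of minimax (robust) value iteration: the planner maximizes over actions while an adversary minimizes over transition models in an L1 ball around a nominal model, with the ball radius standing in for a bound derived from a Lipschitz assumption on how fast the transitions can drift. The function names, the L1 geometry of the uncertainty set, and all parameters are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def worst_case_backup(p_hat, v, radius):
    """Worst-case expected value of v over the L1 ball of the given
    radius around the nominal next-state distribution p_hat.
    The adversary shifts probability mass from high-value states to
    the lowest-value state; moving delta mass costs 2*delta in L1
    norm, so the movable budget is radius / 2."""
    order = np.argsort(v)          # states sorted by value, ascending
    p = p_hat.copy()
    budget = radius / 2.0
    worst = order[0]               # adversary piles mass on the worst state
    for s in reversed(order):      # take mass from the best states first
        if s == worst or budget <= 0:
            continue
        delta = min(p[s], budget)
        p[s] -= delta
        p[worst] += delta
        budget -= delta
    return p @ v

def robust_value_iteration(P_hat, R, gamma, radius, n_iters=200):
    """Minimax value iteration: max over actions, min over transition
    models in the Lipschitz-derived uncertainty set.
    P_hat has shape (S, A, S); R has shape (S, A)."""
    n_states, n_actions, _ = P_hat.shape
    v = np.zeros(n_states)
    for _ in range(n_iters):
        q = np.empty((n_states, n_actions))
        for s in range(n_states):
            for a in range(n_actions):
                q[s, a] = R[s, a] + gamma * worst_case_backup(P_hat[s, a], v, radius)
        v = q.max(axis=1)
    return v, q.argmax(axis=1)
```

With radius = 0 this reduces to standard value iteration on the nominal model; as the radius grows, the policy becomes increasingly conservative, which is the worst-case behavior the paper's title refers to.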