NeurIPS 2020

Constrained episodic reinforcement learning in concave-convex and knapsack settings


Meta Review

While it is true that constraints can typically be made part of the normal optimisation process in RL by encapsulating them in the reward function, it is often much easier to specify constraints directly, which is the setting this paper considers. The reviewers were positive about the motivation and execution of this paper, and were all in favour of accepting it. I would suggest motivating this setting, at least briefly, in the abstract, to help interested readers find and appreciate this paper more easily.