After the discussion, all reviewers support acceptance, noting that the paper lays out a novel, clear, and general framework for safe online RL. This topic is very relevant to the NeurIPS community, and the paper should be disseminated. However, all reviewers expressed at least minor concerns. I strongly encourage the authors to consider this feedback so that the paper is better received by future readers. During the discussion, it also became clear that one reviewer thought the paper described an agent interacting with *human* students; perhaps a clarification in a future revision can avoid this point of confusion.