This paper offers a new perspective on episodic RL and should be of interest to anyone working with MDPs in reinforcement learning. Three reviewers (R1, R2, R3) commented that it was well-written and clear, although R4 disagreed. All reviewers noted the interesting contribution (showing that MDPs within episodic RL can be proven to be ergodic). R1, R2, and R3 were concerned that the paper is mostly theoretical, and wondered how these insights could be applied in practice. However, the rebuttal goes some way toward addressing these points, and R4 was convinced to raise their recommendation to weak accept. I think these kinds of more theoretical, analytical papers are pivotal to increasing our understanding of RL models and how they learn, and all reviewers agree it’s a very well-presented and well-motivated paper along these lines. I therefore recommend acceptance.