The reviewers agreed that this is solid work on an important problem for which existing results are scarce. However, they raised several concerns:

- The authors create some confusion by describing their method as "independent", since the agents have to coordinate their learning rates ahead of time.
- The analysis is not very tight.

I believe that these concerns actually open the door to interesting follow-up work, and I therefore recommend acceptance. I ask the authors to tone down the independence claims in the final version, given the first concern above.