Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
The paper presents a new exploration strategy for decentralized MARL based on a joint latent variable that is shared between the agents. This paper is a difficult case. While the theoretical analysis of the difficulty of the exploration problem in decentralized MARL is insightful, the experimental results in the original submission were not good enough to convince the reviewers. The algorithm was considerably better than the competitor QMIX in only one case, and other baseline comparisons were missing. However, in the rebuttal the authors provided much better results as well as an additional comparison to QTRAN. The reviewers increased their scores accordingly, and the paper is now in an acceptable state.