The reviewers had a lengthy debate about the contribution of this work, in particular its relation to mutual information maximisation. The proposed framework and algorithm appear to differ significantly from prior art, which the reviewers judged sufficiently novel for acceptance at NeurIPS 2020. It is important for the success of the conference that the authors address the criticisms raised in the reviews, and in particular:
- Clarify why the proposed objective improves generalisation capabilities.
- Make a clear comparison with the GMI paper, e.g. by showing that the definitions of mutual information differ: in the GMI paper it is a weighted combination of MI terms (weights = sigmoid(dot product of latent representations)), while this paper uses the original definition of MI from Cover & Thomas (1991).
- The fact that the approach can be viewed as an extension of GAT may make it seem less relevant, so a clear synthesis of its differences from GAT would be welcome.