Reviews: Using Embeddings to Correct for Unobserved Confounding in Networks

The reviewers agreed that the paper provides a novel and interesting approach to a difficult problem: estimating causal effects in network data under a homophily assumption. Two of the reviewers commended the clarity with which the authors approached the subject. Beyond praising the writing in general, the reviewers specifically commended the paper for being explicit and clear about the assumptions needed for an embedding method to work in the context of causal inference in network data. One of the reviewers also noted that extending the estimation theory of double machine learning to the network, non-iid setting is a good technical contribution.

The reviewers were concerned about the following points:
1. The focus on homophily.
2. A technical point about linear-Gaussian models, which was corrected in the authors’ response.
3. The difficulty of using a black-box embedding model for adjustment. Since in causal inference one does not have a test set, it is hard to know when a model is sufficient for obtaining the correct estimates. The proposed method relies on black-box embedding models, making it more difficult to develop confidence, or even intuition, as to how well they work.
4. Expanding on the previous point, the reviewers were concerned about Assumption 2: the assumption that the embedding model indeed captures all the confounding information. Specifically, they were concerned that this will be even harder to argue for than in standard causal inference tasks. In standard causal inference, one hopes to adjust for all confounders directly. Identifying the confounders is done mainly through domain knowledge and statistical tests, which can help build intuition, though they can never be conclusive.
The concern here is that using an embedding adds another layer of complexity to this process, and that there are insufficient proposals, even at the heuristic level, for how a practitioner could assess the validity of this assumption and the strength of the embedding model in capturing confounding.

Regarding point 1: I think focusing on homophily is perfectly fine for a paper that deals with a new and difficult subject. Point 2 was fixed in the response. Regarding points 3 and 4: I strongly agree with Reviewer 4 here. I think this is a very difficult problem, and the solution the authors propose seems like one of the most promising directions currently available. As such, the paper is commendable for laying bare the crux of the difficulty, for which I do not see any immediate easy fix. The next step, in my eyes, is to have this discussion openly in the community. The paper is a first step, offering both a method and a set of carefully constructed benchmarks. I expect other work to follow up on this, going deeper into what can be done to evaluate and gain confidence in embedding models as proxies for confounders.

Paper ID:	7684
Title:	Using Embeddings to Correct for Unobserved Confounding in Networks