Claudia Shi, David Blei, Victor Veitch
This paper addresses the use of neural networks for the estimation of treatment effects from observational data. Generally, estimation proceeds in two stages. First, we fit models for the expected outcome and the probability of treatment (propensity score). Second, we plug these fitted models into a downstream estimator. Neural networks are a natural choice for the models in the first step. Our question is: how can we adapt the design and training of the neural networks used in this first step in order to improve the quality of the final estimate of the treatment effect? We propose two adaptations based on insights from the statistical literature on the estimation of treatment effects. The first is a new architecture, the Dragonnet, that exploits the sufficiency of the propensity score for estimation adjustment. The second is a regularization procedure, targeted regularization, that induces a bias towards models that have non-parametrically optimal asymptotic properties ‘out-of-the-box’. Studies on benchmark datasets for causal inference show these adaptations outperform existing methods.
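To make the two-stage procedure concrete, the following is a minimal sketch under stated assumptions, not the method proposed in the paper. It substitutes linear and logistic regressions for the neural networks in the first stage, and it uses the augmented inverse-probability-weighted (AIPW) estimator as one common choice of downstream estimator in the second stage. The simulated data, the true effect of 2.0, the clipping thresholds, and all function names (`fit_linear`, `fit_logistic`, `aipw_ate`, `estimate_ate`, `simulate`) are illustrative assumptions.

```python
import numpy as np


def add_intercept(X):
    # Prepend a column of ones so the models include a bias term.
    return np.column_stack([np.ones(len(X)), X])


def fit_linear(X, y):
    # Stage 1a: outcome model, fitted by ordinary least squares.
    coef, *_ = np.linalg.lstsq(add_intercept(X), y, rcond=None)
    return coef


def fit_logistic(X, t, n_iter=25):
    # Stage 1b: propensity model, logistic regression fitted by Newton's method.
    Xb = add_intercept(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        grad = Xb.T @ (t - p)
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb
        # Small ridge term keeps the linear solve numerically stable.
        w = w + np.linalg.solve(hess + 1e-8 * np.eye(len(w)), grad)
    return w


def aipw_ate(y, t, q0, q1, g):
    # Stage 2: plug fitted outcomes (q0, q1) and propensities (g)
    # into the AIPW estimator of the average treatment effect.
    g = np.clip(g, 0.01, 0.99)  # assumed thresholds, avoids extreme weights
    correction = t * (y - q1) / g - (1.0 - t) * (y - q0) / (1.0 - g)
    return np.mean(q1 - q0 + correction)


def estimate_ate(X, t, y):
    # Fit one outcome model per treatment arm, then the propensity model.
    coef0 = fit_linear(X[t == 0], y[t == 0])
    coef1 = fit_linear(X[t == 1], y[t == 1])
    w = fit_logistic(X, t)
    Xb = add_intercept(X)
    q0, q1 = Xb @ coef0, Xb @ coef1
    g = 1.0 / (1.0 + np.exp(-Xb @ w))
    return aipw_ate(y, t, q0, q1, g)


def simulate(n=5000, true_ate=2.0, seed=0):
    # Hypothetical confounded data: X drives both treatment and outcome.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    g = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))
    t = rng.binomial(1, g).astype(float)
    y = true_ate * t + X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)
    return X, t, y


if __name__ == "__main__":
    X, t, y = simulate()
    print("naive difference in means:", y[t == 1].mean() - y[t == 0].mean())
    print("AIPW estimate:", estimate_ate(X, t, y))
```

In this sketch the first-stage models are fitted without any regard for how they will be used in the second stage. The question the paper poses is how to change that first stage so that the final plug-in estimate improves.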