An improved estimator of Variance Explained in the presence of noise

Part of Advances in Neural Information Processing Systems 21 (NIPS 2008)

Bibtex Metadata Paper


Ralf Haefner, Bruce Cumming


A crucial part of developing mathematical models of how the brain works is the quantification of their success. One of the most widely-used metrics yields the percentage of the variance in the data that is explained by the model. Unfortunately, this metric is biased due to the intrinsic variability in the data. This variability is in principle unexplainable by the model. We derive a simple analytical modification of the traditional formula that significantly improves its accuracy (as measured by bias) with similar or better precision (as measured by mean-square error) in estimating the true underlying Variance Explained by the model class. Our estimator advances on previous work by a) accounting for the uncertainty in the noise estimate, b) accounting for overfitting due to free model parameters mitigating the need for a separate validation data set and c) adding a conditioning term. We apply our new estimator to binocular disparity tuning curves of a set of macaque V1 neurons and find that on a population level almost all of the variance unexplained by Gabor functions is attributable to noise.