Part of Advances in Neural Information Processing Systems 35 (NeurIPS 2022) Main Conference Track
Shashank Srikant, Ben Lipkin, Anna Ivanova, Evelina Fedorenko, Una-May O'Reilly
What aspects of computer programs are represented by the human brain during comprehension? We leverage brain recordings derived from functional magnetic resonance imaging (fMRI) studies of programmers comprehending Python code to evaluate the properties and code-related information encoded in the neural signal. We first evaluate a selection of static and dynamic code properties, such as abstract syntax tree (AST)-related and runtime-related metrics. Then, to learn whether brain representations encode fine-grained information about computer programs, we train a probe to align brain recordings with representations learned by a suite of ML models. We find that both the Multiple Demand and Language systems--brain systems which are responsible for very different cognitive tasks, encode specific code properties and uniquely align with machine learned representations of code. These findings suggest at least two distinct neural mechanisms mediating computer program comprehension and evaluation, prompting the design of code model objectives that go beyond static language modeling.We make all the corresponding code, data, and analysis publicly available at https://github.com/ALFA-group/code-representations-ml-brain