NeurIPS 2020

WoodFisher: Efficient Second-Order Approximation for Neural Network Compression

Meta Review

The submission focuses on training neural networks using second-order information. Specifically, the goal of the work is to approximate the inverse of the empirical Fisher matrix, as defined in the displayed equation under (1). The authors observe that the empirical Fisher is an average of dyads (a a^T, where ^T denotes transposition), so its inverse can be computed recursively via the Woodbury matrix identity. The resulting inverse is applied to pruning convolutional neural networks (CNNs) and compared against other unstructured pruning methods. Training and pruning neural networks are central problems in machine learning. While the mathematical contribution of the submission is somewhat limited (the Woodbury matrix identity is a standard tool in numerical analysis), the reviewers agreed that the empirical evaluation is thorough and that the approach is useful and can have impact in the area of CNNs. For the final version of the paper, the authors are encouraged to 1) carry out a more rigorous comparison to K-FAC approximations, and 2) emphasize more clearly the novelty of the submission (its application to CNN pruning).
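
For readers unfamiliar with the recursion summarized in this meta review, the following is a minimal sketch of inverting an average of dyads one rank-one term at a time. It is an illustration only, not the authors' implementation. The damping term `damping * I` (added so that the starting matrix is invertible), the function names, and the plain-list matrix representation are all assumptions made for this sketch.

```python
def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def damped_empirical_fisher(grads, damping=1e-3):
    """Build F = damping * I + (1/N) * sum_i g_i g_i^T explicitly.

    Used here only as a reference to check the recursive inverse.
    """
    n, d = len(grads), len(grads[0])
    fisher = [[damping if i == j else 0.0 for j in range(d)] for i in range(d)]
    for g in grads:
        for i in range(d):
            for j in range(d):
                fisher[i][j] += g[i] * g[j] / n
    return fisher


def recursive_fisher_inverse(grads, damping=1e-3):
    """Invert the damped empirical Fisher with one rank-one update per gradient.

    Starting from (damping * I)^{-1}, each step applies the rank-one case of
    the Woodbury identity (the Sherman-Morrison formula) to A + (1/N) g g^T:

        A^{-1} - (A^{-1} g)(A^{-1} g)^T / (N + g^T A^{-1} g)

    The running inverse stays symmetric, so A^{-1} g also serves as the
    transposed term g^T A^{-1}.
    """
    n, d = len(grads), len(grads[0])
    inverse = [[1.0 / damping if i == j else 0.0 for j in range(d)]
               for i in range(d)]
    for g in grads:
        u = matvec(inverse, g)                          # A^{-1} g
        denom = n + sum(gi * ui for gi, ui in zip(g, u))  # N + g^T A^{-1} g
        inverse = [[inverse[i][j] - u[i] * u[j] / denom for j in range(d)]
                   for i in range(d)]
    return inverse
```

Each update costs O(d^2) for d parameters, so N gradients cost O(N d^2) in total and no explicit d x d matrix inversion is needed. Multiplying the result by `damped_empirical_fisher(grads, damping)` should give the identity up to floating-point error.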