Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
The present paper proposes a variant of the attention mechanism (called a quasi-attention mechanism) that allows the model, in a soft manner, to add (+1), subtract (-1), or erase (×0) information from the attended input. This new feature gives the mechanism the capability of learning negative correlations in addition to the usual positive and zero correlations. The definitions are mathematically sound, and the authors give a nice explanation of the underlying intuition. Experiments on various tasks illustrate the behaviour of the mechanism and show that it also improves model performance, even beating the state of the art on some datasets. This paper is a clear accept according to NeurIPS standards; however, after discussion with the reviewer, I will not recommend it for a talk, since it is essentially a nice improvement in an already thoroughly explored direction.
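For readers unfamiliar with the idea, the following is a minimal illustrative sketch (not the authors' exact formulation) of how attention weights could be extended from the usual [0, 1] range to [-1, 1], so that a position can contribute positively (add), negatively (subtract), or not at all (erase). The tanh gate and its projection vector here are assumptions made purely for illustration:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def quasi_attention(query, keys, values, gate_weights):
    """Sketch of a quasi-attention step.

    query:        (d,)   query vector
    keys:         (n, d) key vectors
    values:       (n, dv) value vectors
    gate_weights: (d,)   hypothetical learned projection for the gate
    """
    # standard attention weights, in [0, 1] and summing to 1
    scores = softmax(query @ keys.T / np.sqrt(keys.shape[-1]))
    # a per-position gate in (-1, 1); tanh of a learned projection (assumed here)
    gate = np.tanh(keys @ gate_weights)
    # quasi-attention weights: negative (subtract), near zero (erase),
    # or positive (add) contributions from each attended position
    quasi = scores * gate
    return quasi @ values, quasi
```

Because the softmax weights lie in [0, 1] and the gate in (-1, 1), each quasi-attention weight lies in (-1, 1), matching the add/subtract/erase intuition described above.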