Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
While we now understand how to pretrain text encoders and unconditional language models, an important open question is how to pretrain (or use pretrained) decoders in seq2seq models. Several proposals have been made, but none has been particularly successful. This paper does not deliver such a method either, but it answers a very natural question that anyone working on this problem would want to ask: can we even steer a pretrained language model so that it generates a given sequence? I, as well as the reviewers, found the paper very interesting: the study is well executed, well written, and provides new insights into the properties and limitations of pretrained language models. There is consensus that the paper should be accepted.