Bibtek download is not available in the pre-proceeding
Zijian Kang, Peizhen Zhang, Xiangyu Zhang, Jian Sun, Nanning Zheng
Despite the success of Knowledge Distillation (KD) on image classification, it is still challenging to apply KD on object detection. Due to the uneven distribution of instance-related information, useful knowledge for detection is hard to locate. In this paper, we propose a conditional distillation framework to find the desired knowledge. Specifically, to retrieve useful information related to each target instance, we use the instance information to specify a condition. Given the condition, the retrieval process is conducted by a learnable conditional decoding module, guided by a localization-recognition-sensitive auxiliary task. During decoding, the condition information is encoded as query and the teacher's representation is presented as key. We use the attention between query and key to measure a correlation, which specifies the most related knowledge for distillation. Extensive experiments demonstrate the efficacy of our method: we observe impressive improvements under various settings. Notably, we boost RetinaNet with ResNet-50 backbone from $37.4$ to $40.7$ mAP ($+3.3$) under $1\times$ schedule, that even surpasses the teacher ($40.4$ mAP) with ResNet-101 backbone under $3\times$ schedule.