## Paper

VeXKD: The Versatile Integration of Cross-Modal Fusion and Knowledge Distillation for 3D Perception

## Methods

![Overall pipeline](figures/overall_pipeline.png)
![Fusion module](figures/fusion_module.png)
![masKD](figures/masKD.png)

## Requirements

We use MMDetection3D [v1.1.0](https://github.com/open-mmlab/mmdetection3d/releases/tag/v1.1.0) and MMRazor [v1.0.0](https://github.com/open-mmlab/mmrazor/releases/tag/v1.0.0).
Please follow the installation instructions on their websites; their requirements are listed in `mmdetection3d/requirements.txt` and `mmrazor/requirements.txt`.

After installing these dependencies, run the following command in each codebase directory (`mmdetection3d` and `mmrazor`) to install it:
```bash
python setup.py develop
```
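
As a minimal sketch, both codebases can be installed in one pass. It assumes that `mmdetection3d/` and `mmrazor/` sit side by side in the current directory, which is how the paths in this README are written; adjust the names if your checkout differs.

```bash
# Assumption: run from the directory that contains both mmdetection3d/ and mmrazor/.
for repo in mmdetection3d mmrazor; do
  if [ -f "$repo/setup.py" ]; then
    # Subshell so the working directory is restored after each install.
    (cd "$repo" && python setup.py develop)
  else
    echo "skip: $repo/setup.py not found"
  fi
done
```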

## Data preparation

Please follow the instructions [here](https://github.com/open-mmlab/mmdetection3d/blob/master/docs/en/datasets/nuscenes_det.md) to download and preprocess the nuScenes dataset. Remember to download both the detection dataset and the map extension (for BEV map segmentation). After data preparation, you should see the following directory structure (as indicated in mmdetection3d):

```
mmdetection3d
├── mmdet3d
├── tools
├── configs
├── data
│   ├── nuscenes
│   │   ├── maps
│   │   ├── samples
│   │   ├── sweeps
│   │   ├── v1.0-test
│   │   ├── v1.0-trainval
│   │   ├── nuscenes_database
│   │   ├── nuscenes_infos_train.pkl
│   │   ├── nuscenes_infos_val.pkl
│   │   ├── nuscenes_infos_test.pkl
│   │   ├── nuscenes_dbinfos_train.pkl

```

To generate the info (`.pkl`) files, run the following command from the `mmdetection3d` folder:

```
python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes
```

To run the KD code, also create a soft link to the dataset at `./mmrazor/data/nuscenes`.
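
The soft link can be created along the following lines. This is a sketch under two assumptions: `mmdetection3d/` and `mmrazor/` are sibling directories, and the command is run from their common parent. The list of entries checked at the end is taken from the directory tree above and is only a quick sanity check, not a full validation of the dataset.

```bash
# Assumption: run from the directory that contains both mmdetection3d/ and mmrazor/.
src="$(pwd)/mmdetection3d/data/nuscenes"   # prepared dataset (absolute path)
dst="mmrazor/data/nuscenes"                # where the KD code looks for it

mkdir -p "$(dirname "$dst")"
# -s: symbolic link; -f: overwrite an existing link; -n: do not follow an existing link.
ln -sfn "$src" "$dst"

# Report which of the expected entries are not visible through the link.
missing=0
for item in maps samples sweeps v1.0-trainval \
            nuscenes_infos_train.pkl nuscenes_infos_val.pkl; do
  if [ ! -e "$dst/$item" ]; then
    echo "missing: $dst/$item"
    missing=$((missing + 1))
  fi
done
echo "$missing expected entries missing"
```

An absolute target is used so that the link stays valid regardless of the directory from which the training scripts are launched.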

## Evaluation and training

### Evaluation of the fusion teacher model

```
bash mmdetection3d/tools/dist_test.sh [fusion model config file path] [gpu_nums] --cfg-options load_from=[trained_model_file]
```

For example:

```
bash mmdetection3d/tools/dist_test.sh mmdetection3d/configs/vexkd/teacher_bevfusion_mgfm_detection.py 8 --cfg-options load_from=trained_models/pretrained_teacher_detection.pth
```

### Evaluation of the trained KD models

```
bash mmdetection3d/tools/dist_test.sh [single-modal model config file path] [gpu_nums] --cfg-options load_from=[trained_model_file]
```

For example:

```
bash mmdetection3d/tools/dist_test.sh mmdetection3d/configs/vexkd/student_lidar_centerpoint_epoch20_detection.py 8 --cfg-options load_from=trained_models/pretrained_student_centerpoint_epoch20_detection.pth
```

### Training of the fusion teacher model

```
bash mmdetection3d/tools/dist_train.sh [fusion model config file_path] [gpu_nums] --cfg-options load_from=[pretrained_lidaronly_file]
```

For example:

```
bash mmdetection3d/tools/dist_train.sh mmdetection3d/configs/vexkd/teacher_bevfusion_mgfm_detection.py 8 --cfg-options load_from=mmdetection3d/trained_models/bev_lidar_only_epoch20.pth
```

### Training of the KD models

```
bash mmrazor/tools/dist_train.sh [kd config file_path] [gpu_nums]
```

For example:

```
bash mmrazor/tools/dist_train.sh mmrazor/configs/distill/mmdet3d/BEVQueryGuide/bevqueryguide_deformable_teacher_centerpoint_lidar_student_detection_attn_transfer.py 8
```

## Our key implementations

### Modality General Fusion Module
* Support for the BEV map segmentation task:
    - mmdetection3d/mmdet3d/datasets/transforms/formating.py
    - mmdetection3d/mmdet3d/datasets/transforms/loading.py
    - mmdetection3d/mmdet3d/evaluation/metrics/nuscenes_map_metric.py

* Support for BEVFormer:
    - mmdetection3d/mmdet3d/datasets/transforms/transforms_3d.py
    - mmdetection3d/mmdet3d/models/bevformer

* Modifications to BEVFusion and the implementation of the MGFM module:
    - mmdetection3d/mmdet3d/models/bevfusion/ops

* Config files for training and validation:
    - mmdetection3d/configs/vexkd

### Versatile and Effective KD framework
* Implementation of the BEV Query-guided Mask Generation Network:
    - mmrazor/models/algorithms/distill/configurable/bevquery_guided_cascade_mask_teacher_assist.py
    - mmrazor/models/distillers/bevquery_guided_multilayer_distiller.py

* Implementation of the masked attention transfer loss:
    - mmrazor/models/losses/bev_init_query_guided_deformable_teacher_reconstructed_loss.py
    - mmrazor/models/losses/bev_query_guided_deformable_teacher_nochannel_multi_layer_atten_loss.py

* Control hooks that stop the mask learning process:
    - mmrazor/engine/hooks/stop_mask_learning_epoch_hook.py
    - mmrazor/engine/hooks/stop_mask_learning_iter_hook.py
    
* Config files for running KD training:
    - mmrazor/configs/distill/mmdet3d/BEVQueryGuide/

## Acknowledgements

Our model is built on [mmdetection3d](https://github.com/open-mmlab/mmdetection3d) and [mmrazor](https://github.com/open-mmlab/mmrazor/releases/tag/v1.0.0). It is also greatly inspired by the following outstanding contributions to the open-source community: [BEVFusion](https://github.com/mit-han-lab/bevfusion) and [BEVFormer](https://github.com/fundamentalvision/BEVFormer).