# CDN

Code for the paper "Mining the Benefits of Two-stage and One-stage HOI Detection".

## Installation
Install the dependencies:
```
pip install -r requirements.txt
```

## Data preparation

Download the HICO-DET dataset and unpack the tarball (`hico_20160224_det.tar.gz`) into the `data` directory.

Instead of the original annotation files, we use the annotation files provided by PPDM. Place the downloaded annotation files as follows.
```
data
 └─ hico_20160224_det
     |─ annotations
     |   |─ trainval_hico.json
     |   |─ test_hico.json
     |   └─ corre_hico.npy
     :
```
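Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of this repository: the helper name `missing_annotation_files` is hypothetical, and the file names are taken from the tree shown above.

```python
from pathlib import Path

# File names copied from the directory tree above.
EXPECTED_ANNOTATIONS = ("trainval_hico.json", "test_hico.json", "corre_hico.npy")


def missing_annotation_files(hoi_path):
    """Return the expected annotation files that are absent under hoi_path.

    Hypothetical helper for illustration; it only checks that the files exist.
    """
    annotation_dir = Path(hoi_path) / "annotations"
    return [
        name
        for name in EXPECTED_ANNOTATIONS
        if not (annotation_dir / name).is_file()
    ]


if __name__ == "__main__":
    missing = missing_annotation_files("data/hico_20160224_det")
    if missing:
        print("Missing annotation files:", ", ".join(missing))
    else:
        print("Annotation layout looks complete.")
```

Run it from the repository root so that the relative `data/hico_20160224_det` path matches the `--hoi_path` value used in the commands below.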


## Pre-trained model
Download the pretrained DETR detector with a ResNet-50 backbone and put it in the `params` directory. Then convert its parameters with the following command.
```
python tools/convert_parameters_hico.py \
        --load_path params/detr-r50-e632da11.pth \
        --save_path params/detr-r50-pre-2stage-q64.pth
```

## Training
After the preparation, start training with the following command. Training CDN-S on HICO-DET is shown as an example.
```
python -m torch.distributed.launch \
        --nproc_per_node=8 \
        --use_env \
        main.py \
        --pretrained params/detr-r50-pre-2stage-q64.pth \
        --output_dir logs \
        --dataset_file hico \
        --hoi_path data/hico_20160224_det \
        --num_obj_classes 80 \
        --num_verb_classes 117 \
        --backbone resnet50 \
        --set_cost_bbox 2.5 \
        --set_cost_giou 1 \
        --bbox_loss_coef 2.5 \
        --giou_loss_coef 1 \
        --num_queries 64 \
        --dec_layers_hopd 3 \
        --dec_layers_interaction 3 \
        --use_matching \
        --epochs 90 \
        --lr_drop 60
```
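The `--epochs 90` and `--lr_drop 60` flags together describe a step learning-rate schedule. The sketch below illustrates the resulting values under two assumptions that the command itself does not confirm: that `main.py` follows the DETR convention of multiplying the learning rate by 0.1 every `lr_drop` epochs, and that the base learning rate is 1e-4 (the command does not pass `--lr`). The function `step_lr` is hypothetical and not part of this repository.

```python
def step_lr(epoch, base_lr=1e-4, lr_drop=60, gamma=0.1):
    """Learning rate in effect during a zero-indexed epoch under step decay.

    Assumes DETR-style step decay: the rate is multiplied by `gamma`
    once every `lr_drop` epochs. `base_lr` and `gamma` are assumed values.
    """
    return base_lr * gamma ** (epoch // lr_drop)


if __name__ == "__main__":
    # Print the rate at the start, just before the drop, at the drop, and at the end.
    for epoch in (0, 59, 60, 89):
        print(f"epoch {epoch:2d}: lr = {step_lr(epoch):.1e}")
```

Under these assumptions, the first 60 epochs run at the base rate and the remaining 30 at one tenth of it.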
Then run decoupling dynamic re-weighting, starting from the checkpoint saved by the previous command:
```
python -m torch.distributed.launch \
        --nproc_per_node=8 \
        --use_env \
        main.py \
        --pretrained logs/checkpoint.pth \
        --output_dir logs/ \
        --dataset_file hico \
        --hoi_path data/hico_20160224_det \
        --num_obj_classes 80 \
        --num_verb_classes 117 \
        --backbone resnet50 \
        --set_cost_bbox 2.5 \
        --set_cost_giou 1 \
        --bbox_loss_coef 2.5 \
        --giou_loss_coef 1 \
        --num_queries 64 \
        --dec_layers_hopd 3 \
        --dec_layers_interaction 3 \
        --epochs 10 \
        --freeze_mode 1 \
        --obj_reweight \
        --verb_reweight \
        --queue_size 4704 \
        --p_obj 0.7 \
        --p_verb 0.7 \
        --lr 1e-5 \
        --lr_backbone 1e-6
```

## Evaluation

Evaluate the trained parameters as follows.
```
python -m torch.distributed.launch \
        --nproc_per_node=8 \
        --use_env \
        main.py \
        --pretrained logs/checkpoint.pth \
        --dataset_file hico \
        --hoi_path data/hico_20160224_det \
        --num_obj_classes 80 \
        --num_verb_classes 117 \
        --backbone resnet50 \
        --num_queries 64 \
        --dec_layers_hopd 3 \
        --dec_layers_interaction 3 \
        --eval \
        --use_nms_filter \
        --thres_nms 0.7 \
        --nms_alpha 1 \
        --nms_beta 0.5
```
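The `--thres_nms`, `--nms_alpha` and `--nms_beta` flags control a pairwise non-maximum suppression over human-object pairs. The sketch below shows one plausible reading of these flags, assuming the overlap between two pairs is the human-box IoU raised to `nms_alpha` times the object-box IoU raised to `nms_beta`, and that a pair is dropped when this overlap with a higher-scoring kept pair exceeds `thres_nms`. It also assumes all input pairs share the same verb and object class. The functions `iou` and `pairwise_nms` are written for illustration and may differ from the repository's implementation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def pairwise_nms(pairs, thres_nms=0.7, nms_alpha=1.0, nms_beta=0.5):
    """Greedy NMS over (human_box, object_box, score) triplets.

    Returns the indices of the kept pairs, highest score first.
    The combined overlap formula is an assumption (see the text above).
    """
    order = sorted(range(len(pairs)), key=lambda i: pairs[i][2], reverse=True)
    keep = []
    for i in order:
        human_i, object_i, _ = pairs[i]
        suppressed = False
        for j in keep:
            human_j, object_j, _ = pairs[j]
            overlap = (iou(human_i, human_j) ** nms_alpha
                       * iou(object_i, object_j) ** nms_beta)
            if overlap > thres_nms:
                suppressed = True
                break
        if not suppressed:
            keep.append(i)
    return keep


if __name__ == "__main__":
    example = [
        ((0, 0, 10, 10), (20, 20, 30, 30), 0.9),
        ((1, 0, 11, 10), (20, 20, 30, 30), 0.8),    # near-duplicate of the first pair
        ((50, 50, 60, 60), (20, 20, 30, 30), 0.7),  # different human, same object
    ]
    print(pairwise_nms(example))
```

With `nms_alpha` at 1 and `nms_beta` at 0.5, as in the command above, the human-box overlap weighs more heavily than the object-box overlap in this formulation.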

## Results
HICO-DET.
| Model | Full (D) | Rare (D) | Non-rare (D) | Full (KO) | Rare (KO) | Non-rare (KO) |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
|CDN-S (ResNet50)| 31.44 | 27.39 | 32.64 | 34.09 | 29.63 | 35.42 |
|CDN-B (ResNet50)| 31.78 | 27.55 | 33.05 | 34.53 | 29.73 | 35.96 |
|CDN-L (ResNet101)| 32.07 | 27.19 | 33.53 | 34.79 | 29.48 | 36.38 |

D: Default, KO: Known object

