
A first look at the source code:

- benchmark_data: stores the data used by the benchmark.

- benchmark-cv: benchmark for the image modality.

- benchmark-nlp: benchmark for the text modality.

- training: trains models on the synthetic dataset of yellow patches.

- training_nlp: fine-tunes the NLP models on the MovieReview dataset.


To run the benchmark, follow the steps below:

1. Prepare the datasets: ImageNet and MovieReview. Note that by using these datasets, you agree to the terms of ImageNet and MovieReview respectively.

2. Train the models (NLP models only). See `training_nlp/train.sh`.

3. Train models on the synthetic dataset. See `training/train.sh`.

4. Compute explanations. See `benchmark-cv/run_expl*.sh` for the image modality and `benchmark-nlp/run_expl*.sh` for the text modality. The explanation results are computed once and saved locally, so the evaluation steps can reuse them without repeating the computation.

5. Evaluate MoRF and ABPC. See `benchmark-cv/run_eval*.sh` for the image modality and `benchmark-nlp/run_eval*.sh` for the text modality.

6. Evaluate the other metrics. See `benchmark-cv/run_eval2*.sh` for the image modality and `benchmark-nlp/run_eval2*.sh` for the text modality.
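
The compute-once-and-reuse idea behind step 4 can be illustrated with a minimal Python sketch. It is not the code used by the `run_expl*.sh` scripts, and the on-disk format in this repository may differ. The function `cached_explanation`, its parameters, and the example model and method names are all hypothetical and chosen only for illustration.

```python
import hashlib
import os
import pickle
import tempfile


def cached_explanation(cache_dir, model_name, method_name, sample_id, compute_fn):
    """Return a saved explanation if one exists; otherwise compute and save it.

    All names here are hypothetical; this only illustrates the caching pattern.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Hash the key so that sample ids containing slashes still give a valid file name.
    key = f"{model_name}-{method_name}-{sample_id}"
    file_name = hashlib.sha256(key.encode("utf-8")).hexdigest() + ".pkl"
    path = os.path.join(cache_dir, file_name)

    if os.path.exists(path):
        # Cache hit: load the stored result instead of recomputing it.
        with open(path, "rb") as f:
            return pickle.load(f)

    # Cache miss: run the (expensive) explanation algorithm once.
    result = compute_fn()

    # Write to a temporary file, then rename, so an interrupted run
    # does not leave a partially written cache entry behind.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(result, f)
    os.replace(tmp_path, path)
    return result


if __name__ == "__main__":
    calls = []

    def fake_explainer():
        # Stand-in for a real explanation algorithm.
        calls.append(1)
        return [0.1, 0.7, 0.2]

    with tempfile.TemporaryDirectory() as cache_dir:
        first = cached_explanation(cache_dir, "model_a", "method_x", "sample_1", fake_explainer)
        second = cached_explanation(cache_dir, "model_a", "method_x", "sample_1", fake_explainer)
        # The second call is served from disk, so the explainer ran only once.
        print(first == second, len(calls))
```

With a cache like this, the evaluation scripts in steps 5 and 6 would only need to read the saved files, which is why the explanations are computed before any metric is evaluated.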

Note that this code is provided for reproducibility purposes only.

Visualization examples can be found at https://github.com/PaddlePaddle/InterpretDL/tree/master/examples.
