# CLAVE: An Adaptive Framework for Evaluating Values of LLM Generated Responses

### Overview
This repository contains the benchmark and source code for CLAVE. LLM-based evaluators face two challenges in open-ended value evaluation: they must align with changing human value definitions using minimal annotation while resisting their own biases (*adaptability*), and they must robustly detect varying value expressions and scenarios (*generalizability*). To address these challenges, we introduce **CLAVE**, a novel framework that integrates two complementary LLMs: a strong one that extracts high-level value concepts from a few human labels, leveraging its extensive knowledge and generalizability, and a smaller one fine-tuned on these concepts to better align with human value understanding. We also present **ValEval**, a comprehensive dataset for open-ended value assessment comprising 13k+ (text, value, label) tuples across diverse domains and covering three major value systems.

### ValEval Benchmark
Three datasets, one for each value system (social risk categories, Schwartz's Theory of Basic Values, and Moral Foundations Theory), are provided under the `./data` directory. Each dataset consists of four files: train, test, test-perturbed, and test-generalization.
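As a rough illustration of this layout, the sketch below groups the files found under a data directory by split. The naming pattern (each file name containing its split name) and the helper `group_by_split` are assumptions made for illustration only; check the actual file names under `./data` before relying on it.

```python
from collections import defaultdict
from pathlib import Path

# The four splits named above. Longer names come first so that
# "test-perturbed" is not mistaken for plain "test".
SPLITS = ["test-generalization", "test-perturbed", "train", "test"]


def group_by_split(data_dir):
    """Map each split name to the sorted files whose name contains it.

    Assumption (not confirmed by the repository): every data file
    carries its split name somewhere in its file name.
    """
    groups = defaultdict(list)
    for path in sorted(Path(data_dir).rglob("*")):
        if not path.is_file():
            continue
        for split in SPLITS:
            if split in path.name:
                groups[split].append(path)
                break  # the first (most specific) match wins
    return dict(groups)
```

Calling `group_by_split("./data")` from the repository root would then return a dictionary with up to four keys, one per split, which is a quick way to confirm that all three datasets are present.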

### The CLAVE Framework
Before running the framework, set your OpenAI API key in `utils.py` and `models.py`, then install the required pip packages.
```
# In utils.py and models.py:
your_api_key = "xxx..."

# In your shell:
pip install -r requirements.txt
```
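If you would rather not hard-code the key, one common alternative is to read it from an environment variable. The sketch below is an assumption and not part of the repository: the variable name `OPENAI_API_KEY` and the helper `load_api_key` are illustrative, and you would still need to assign the result to `your_api_key` in `utils.py` and `models.py`.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in `env_var`, or raise if it is unset.

    Hypothetical helper: the repository itself expects the key to be
    written directly into utils.py and models.py.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Hypothetical usage inside utils.py / models.py:
# your_api_key = load_api_key()
```

Failing early with a clear message avoids a less obvious authentication error later in the pipeline.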

#### Step 1. Value Concept Extraction
In this step, we extract value concepts for the training samples, construct the value concept pool, and obtain value concepts for the test set.
```
cd src/
bash scripts/concept_extraction.sh
```
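To make the three stages concrete, here is a toy sketch of the data flow. It is not the repository's implementation: a keyword heuristic stands in for the LLM call, all function names are hypothetical, and the way the real pipeline matches test samples against the pool may differ.

```python
from collections import Counter


def extract_concepts(text):
    """Stand-in for the LLM call (assumption): treat every lowercased
    word longer than four characters as a "concept"."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return sorted({w for w in words if len(w) > 4})


def build_concept_pool(train_texts, min_count=1):
    """Collect concepts over all training samples, keeping those that
    appear in at least `min_count` samples."""
    counts = Counter(c for t in train_texts for c in extract_concepts(t))
    return {c for c, n in counts.items() if n >= min_count}


def concepts_for_test(text, pool):
    """Keep only the concepts of a test sample that are in the pool."""
    return [c for c in extract_concepts(text) if c in pool]


train = ["Stealing money harms others.", "Helping others shows kindness."]
pool = build_concept_pool(train)
print(concepts_for_test("Stealing shows greed.", pool))
```

In this toy version, the test sample keeps only the concepts already seen during training, which mirrors the idea of reusing a shared concept pool across the training and test sets.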

#### Step 2. Small Model Training
In this step, we fine-tune the smaller model on the extracted value concepts. Run the script from the `src/` directory, as in Step 1.
```
bash scripts/train_mistral.sh
```

#### Step 3. Inference
In this step, we run inference with the fine-tuned model. Run the script from the `src/` directory, as in Step 1.
```
bash scripts/infer_mistral.sh
```