Generate contrastive pairs from lm-eval benchmark tasks. This command extracts question-answer pairs in which one answer is correct (the positive) and one is incorrect (the negative), producing training data for steering vectors.
```shell
python -m wisent generate-pairs-from-task TASK_NAME --output FILE [OPTIONS]
```
```shell
python -m wisent generate-pairs-from-task truthfulqa_mc1 \
  --output ./pairs/truthfulqa.json \
  --limit 100
```

```shell
python -m wisent generate-pairs-from-task hellaswag \
  --output ./pairs/hellaswag.json \
  --seed 123 \
  --verbose
```
| Argument | Default | Description |
|---|---|---|
| `task_name` | required | Name of the lm-eval task (e.g., `truthfulqa_mc1`, `hellaswag`) |
| `--output` | required | Output file path for the generated pairs (JSON format) |
| `--limit` | all | Maximum number of pairs to generate |
| `--seed` | 42 | Random seed for reproducibility |
| `--verbose` | false | Enable verbose logging |
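The sketch below illustrates what a contrastive-pair JSON file might look like and how to load it for downstream use. The field names (`question`, `positive`, `negative`) are an assumption for illustration only; inspect a file produced by `generate-pairs-from-task` to confirm the actual schema.

```python
import json

# Hypothetical pair schema for illustration; the real output of
# generate-pairs-from-task may use different field names.
pairs = [
    {
        "question": "What happens if you crack your knuckles a lot?",
        "positive": "Nothing in particular happens.",  # correct answer
        "negative": "You will get arthritis.",         # incorrect answer
    }
]

# Write the pairs in the same spirit as the CLI's --output file
with open("pairs.json", "w") as f:
    json.dump(pairs, f, indent=2)

# Reload and verify the round-trip
with open("pairs.json") as f:
    loaded = json.load(f)

print(len(loaded), loaded[0]["positive"])
```

Each entry couples a single prompt with one correct and one incorrect completion, which is the contrast a steering-vector trainer needs.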
Any task from the lm-evaluation-harness that has multiple-choice answers can be used; common examples include `truthfulqa_mc1` and `hellaswag`.