Cluster benchmarks by their activation similarity to identify groups of tasks that share similar representations. This helps in understanding which benchmarks might benefit from shared steering vectors and in designing efficient multi-task optimization strategies.
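To make the idea concrete, here is a minimal sketch of clustering by activation similarity on toy data. It is not the implementation behind `cluster-benchmarks`. It assumes each benchmark is summarized by the mean difference between positive and negative activations over its contrastive pairs, and that benchmarks are grouped when the cosine similarity of those directions exceeds a threshold. The benchmark names, the 3-dimensional vectors, and the 0.8 threshold are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_difference(pairs):
    """Average (positive - negative) activation over a list of contrastive pairs."""
    dim = len(pairs[0][0])
    total = [0.0] * dim
    for pos, neg in pairs:
        for i in range(dim):
            total[i] += pos[i] - neg[i]
    return [t / len(pairs) for t in total]


def cluster_by_similarity(directions, threshold=0.8):
    """Group benchmarks whose directions reach the cosine threshold.

    Merging is transitive (single linkage), implemented with a small union-find.
    The threshold value is an assumption chosen for illustration.
    """
    names = list(directions)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(directions[a], directions[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy activations: (positive, negative) pairs per benchmark.
toy_pairs = {
    "math_a": [([1.0, 0.1, 0.0], [0.0, 0.0, 0.0]),
               ([0.9, 0.2, 0.0], [0.0, 0.1, 0.0])],
    "math_b": [([0.8, 0.0, 0.1], [0.0, 0.0, 0.0])],
    "safety": [([0.0, 0.1, 1.0], [0.0, 0.0, 0.0])],
}
directions = {name: mean_difference(pairs) for name, pairs in toy_pairs.items()}
print(cluster_by_similarity(directions))
```

In this toy setup the two math-like benchmarks point in nearly the same direction and land in one group, which is the kind of grouping that would suggest a shared steering vector.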
python -m wisent cluster-benchmarks --model MODEL [OPTIONS]
python -m wisent cluster-benchmarks \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --output ./cluster_results/

python -m wisent cluster-benchmarks \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --pairs-per-benchmark 100 \
    --output ./detailed_clusters/

python -m wisent cluster-benchmarks \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --device cuda \
    --pairs-per-benchmark 50 \
    --output ./gpu_clusters/
| Argument | Default | Description |
|---|---|---|
| --model | required | Model name or path (e.g., meta-llama/Llama-3.2-1B-Instruct) |
| --output | ./cluster_output | Output directory for results |
| --pairs-per-benchmark | 50 | Number of contrastive pairs per benchmark |
| --device | auto | Device to use (cuda/mps/cpu). Auto-detected if not specified. |
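When the command needs to be driven from a script, for example to compare several values of `--pairs-per-benchmark`, a thin wrapper around `subprocess` can assemble the invocation. The sketch below uses only the flags listed in the table. The helper names `build_command` and `run_clustering` and the `./sweep_*` output directories are hypothetical, and the wrapper assumes that omitting `--device` triggers auto-detection as described above.

```python
import shlex
import subprocess
import sys

# Device values taken from the argument table; anything else is rejected early.
KNOWN_DEVICES = {"cuda", "mps", "cpu"}


def build_command(model, output="./cluster_output", pairs_per_benchmark=50, device=None):
    """Return the argv list for `python -m wisent cluster-benchmarks`.

    Hypothetical helper: it only assembles flags documented in the table.
    """
    if device is not None and device not in KNOWN_DEVICES:
        raise ValueError(f"unknown device: {device!r}")
    cmd = [
        sys.executable, "-m", "wisent", "cluster-benchmarks",
        "--model", model,
        "--output", output,
        "--pairs-per-benchmark", str(pairs_per_benchmark),
    ]
    if device is not None:
        # Leave the flag out entirely to rely on auto-detection.
        cmd += ["--device", device]
    return cmd


def run_clustering(model, **options):
    """Run the command and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(model, **options), check=True)


# Print (without executing) one command per pair count in a small sweep.
for n_pairs in (50, 100):
    cmd = build_command(
        "meta-llama/Llama-3.1-8B-Instruct",
        output=f"./sweep_{n_pairs}/",
        pairs_per_benchmark=n_pairs,
    )
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems, and writing each run to its own output directory keeps results from different pair counts separate.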
The command writes its results to the directory given by --output (./cluster_output by default).