Find the optimal number of training samples for classifier models or steering vectors. The command evaluates performance across a range of sample sizes to identify the point at which further gains begin to level off.
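The idea of stopping where gains level off can be illustrated with a minimal sketch. This is not Wisent's actual selection logic, which the page does not describe. The function name `pick_sample_size`, the `min_gain` cutoff, and the accuracy values are all hypothetical and exist only to show the diminishing-returns rule.

```python
from typing import Sequence


def pick_sample_size(
    sizes: Sequence[int],
    scores: Sequence[float],
    min_gain: float = 0.01,  # hypothetical cutoff, not a Wisent default
) -> int:
    """Return the first size whose next step improves the score by less than min_gain."""
    if not sizes or len(sizes) != len(scores):
        raise ValueError("sizes and scores must be non-empty and equal in length")
    for i in range(len(sizes) - 1):
        # Gain obtained by moving from sizes[i] to the next larger size.
        if scores[i + 1] - scores[i] < min_gain:
            return sizes[i]
    # Every step still helped by at least min_gain: keep the largest size tested.
    return sizes[-1]


# Made-up accuracies for illustration only; they are not measured results.
sizes = [5, 10, 20, 50, 100, 200, 500]
scores = [0.55, 0.62, 0.70, 0.76, 0.78, 0.785, 0.786]
best = pick_sample_size(sizes, scores)
print(best)
```

A smaller `min_gain` favors larger sample sizes, and a larger one stops earlier. The right value depends on how expensive additional samples are for the task at hand.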
python -m wisent optimize-sample-size MODEL --task TASK --layer LAYER --token-aggregation METHOD [OPTIONS]
python -m wisent optimize-sample-size \
  meta-llama/Llama-3.1-8B-Instruct \
  --task truthfulqa_mc1 \
  --layer 15 \
  --token-aggregation average \
  --sample-sizes 5 10 20 50 100 200 500 \
  --test-size 200 \
  --save-plot

python -m wisent optimize-sample-size \
  meta-llama/Llama-3.1-8B-Instruct \
  --task mmlu \
  --layer 15 \
  --token-aggregation average \
  --steering-mode \
  --steering-method CAA \
  --steering-strength 1.0 \
  --sample-sizes 10 25 50 100 200

python -m wisent optimize-sample-size \
  meta-llama/Llama-3.1-8B-Instruct \
  --task hellaswag \
  --layer 15 \
  --token-aggregation final \
  --threshold 0.6 \
  --sample-sizes 20 40 60 80 100 150 200 \
  --seed 123
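When the command needs to be launched from a script, for example to repeat the run across several tasks, the argument list can be assembled in Python. The sketch below uses only the flags documented on this page. The helper name `build_command` is hypothetical and is not part of Wisent.

```python
import shlex
from typing import Iterable, List


def build_command(
    model: str,
    task: str,
    layer: int,
    aggregation: str,
    sample_sizes: Iterable[int],
    extra: Iterable[str] = (),
) -> List[str]:
    """Assemble an argv list for `python -m wisent optimize-sample-size`."""
    cmd = [
        "python", "-m", "wisent", "optimize-sample-size", model,
        "--task", task,
        "--layer", str(layer),
        "--token-aggregation", aggregation,
        # --sample-sizes takes several space-separated values.
        "--sample-sizes", *[str(n) for n in sample_sizes],
    ]
    cmd.extend(extra)  # optional flags such as --save-plot
    return cmd


cmd = build_command(
    "meta-llama/Llama-3.1-8B-Instruct",
    "truthfulqa_mc1",
    15,
    "average",
    [5, 10, 20],
    extra=["--save-plot"],
)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it. Keeping the arguments as a list avoids shell-quoting problems with model paths that contain spaces.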
| Argument | Description |
|---|---|
| model | Model name or path to optimize |
| --task | Task to optimize for (required) |
| --layer | Layer index to use (required) |
| --token-aggregation | Token aggregation method (required) |
| Argument | Default | Description |
|---|---|---|
| --threshold | 0.5 | Detection threshold for classification |
| Argument | Default | Description |
|---|---|---|
| --steering-mode | false | Optimize for steering instead of classification |
| --steering-method | CAA | Steering method to use |
| --steering-strength | 1.0 | Steering strength to use |
| --token-targeting-strategy | LAST_TOKEN | Token targeting strategy for steering |
| Argument | Default | Description |
|---|---|---|
| --sample-sizes | 5 10 20 50 100 200 500 | Sample sizes to test |
| --test-size | 200 | Fixed test set size |
| --seed | 42 | Random seed for reproducibility |
| --limit | None | Maximum samples to load from dataset |
| Argument | Description |
|---|---|
| --save-plot | Save performance plot |
| --no-save-config | Don't save optimal sample size to model config |
| --force | Force optimization without matching classifier parameters |