optimize-sample-size

Find a suitable number of training samples for classifiers or steering vectors. The command evaluates a range of sample sizes and identifies the point at which performance gains begin to level off.
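
The idea of stopping where gains level off can be sketched in a few lines of Python. This is only an illustration, not the selection rule wisent implements: the function name pick_sample_size, the min_gain cutoff, and the accuracy values are all hypothetical.

```python
def pick_sample_size(sizes, scores, min_gain=0.01):
    """Return the smallest size after which the next step gains less than min_gain.

    Hypothetical helper: min_gain is an assumed cutoff, not a wisent option.
    """
    pairs = sorted(zip(sizes, scores))
    for (size, score), (_, next_score) in zip(pairs, pairs[1:]):
        if next_score - score < min_gain:
            return size
    # Performance was still improving at the largest size tested.
    return pairs[-1][0]


sizes = [5, 10, 20, 50, 100, 200, 500]
# Made-up accuracies for illustration only, not measured results.
accuracy = [0.55, 0.62, 0.70, 0.76, 0.78, 0.785, 0.786]
chosen = pick_sample_size(sizes, accuracy)
print(chosen)
```

With these invented numbers, going from 100 to 200 samples adds less than one accuracy point, so the sketch settles on 100.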

Basic Usage
python -m wisent optimize-sample-size MODEL --task TASK --layer LAYER --token-aggregation METHOD [OPTIONS]

Examples

Classification Sample Size
python -m wisent optimize-sample-size \
  meta-llama/Llama-3.1-8B-Instruct \
  --task truthfulqa_mc1 \
  --layer 15 \
  --token-aggregation average \
  --sample-sizes 5 10 20 50 100 200 500 \
  --test-size 200 \
  --save-plot

Steering Mode Sample Size
python -m wisent optimize-sample-size \
  meta-llama/Llama-3.1-8B-Instruct \
  --task mmlu \
  --layer 15 \
  --token-aggregation average \
  --steering-mode \
  --steering-method CAA \
  --steering-strength 1.0 \
  --sample-sizes 10 25 50 100 200

Custom Sample Sizes
python -m wisent optimize-sample-size \
  meta-llama/Llama-3.1-8B-Instruct \
  --task hellaswag \
  --layer 15 \
  --token-aggregation final \
  --threshold 0.6 \
  --sample-sizes 20 40 60 80 100 150 200 \
  --seed 123

Arguments

Required

Argument              Description
model                 Model name or path to optimize
--task                Task to optimize for (required)
--layer               Layer index to use (required)
--token-aggregation   Token aggregation method (required)

Classification Options

Argument      Default   Description
--threshold   0.5       Detection threshold for classification
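
To show what a detection threshold does, here is a minimal sketch that turns classifier scores into yes/no detections. It assumes scores lie in [0, 1] and that a score equal to the threshold counts as a detection; the detect function is hypothetical, and wisent's exact comparison is not documented here.

```python
def detect(scores, threshold=0.5):
    """Flag each score that reaches the threshold (assumed inclusive)."""
    return [score >= threshold for score in scores]


scores = [0.2, 0.55, 0.7]  # illustrative classifier outputs
default_flags = detect(scores)                # threshold 0.5
strict_flags = detect(scores, threshold=0.6)  # as in the hellaswag example
print(default_flags, strict_flags)
```

Raising the threshold from 0.5 to 0.6 drops the borderline 0.55 score from the detections.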

Steering Mode

Argument                     Default      Description
--steering-mode              false        Optimize for steering instead of classification
--steering-method            CAA          Steering method to use
--steering-strength          1.0          Steering strength to use
--token-targeting-strategy   LAST_TOKEN   Token targeting strategy for steering
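
For readers unfamiliar with CAA (Contrastive Activation Addition), the sketch below shows the general technique: the steering vector is the difference between the mean activations of positive and negative examples, and it is added to a hidden state scaled by the steering strength. The function names and the tiny two-dimensional activations are hypothetical, and this is not a description of wisent's internals.

```python
def mean_vector(rows):
    """Element-wise mean of equally sized activation vectors."""
    count = len(rows)
    return [sum(column) / count for column in zip(*rows)]


def caa_vector(positive, negative):
    """Mean positive activation minus mean negative activation."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


def apply_steering(hidden, vector, strength=1.0):
    """Add the steering vector to a hidden state, scaled by strength."""
    return [h + strength * v for h, v in zip(hidden, vector)]


# Toy activations (illustrative only).
positive = [[1.0, 2.0], [3.0, 4.0]]
negative = [[0.0, 1.0], [1.0, 1.0]]
vector = caa_vector(positive, negative)
steered = apply_steering([0.0, 0.0], vector, strength=2.0)
print(vector, steered)
```

More training pairs give a less noisy mean difference, which is why steering quality can depend on sample size.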

Sample Size Settings

Argument         Default                  Description
--sample-sizes   5 10 20 50 100 200 500   Sample sizes to test
--test-size      200                      Fixed test set size
--seed           42                       Random seed for reproducibility
--limit          None                     Maximum samples to load from dataset
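
One way these settings can fit together is shown below: a seeded shuffle, a fixed held-out test set, and one training subset per sample size. This is a sketch under assumptions, not wisent's implementation; make_splits is a hypothetical helper, and both the nesting of subsets and the skipping of sizes that exceed the available data are choices made for the example.

```python
import random


def make_splits(examples, sample_sizes, test_size, seed=42):
    """Hold out a fixed test set, then build one training subset per size."""
    rng = random.Random(seed)  # same seed gives the same shuffle
    shuffled = list(examples)
    rng.shuffle(shuffled)
    test = shuffled[:test_size]
    pool = shuffled[test_size:]
    # Assumption: sizes larger than the remaining pool are skipped.
    train = {n: pool[:n] for n in sample_sizes if n <= len(pool)}
    return test, train


examples = list(range(300))  # stand-in for dataset items
test, train = make_splits(examples, [5, 10, 20, 50, 100, 200, 500], test_size=200)
print(len(test), sorted(train))
```

Because every size is evaluated against the same test set, differences in score reflect the training sample size rather than a change in evaluation data.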

Output

ArgumentDescription
--save-plotSave performance plot
--no-save-configDon't save optimal sample size to model config
--forceForce optimization without matching classifier parameters

Token Aggregation Methods

  • average - Average across all tokens
  • final - Use only the last token
  • first - Use only the first token
  • max - Maximum across all tokens
  • min - Minimum across all tokens
  • max_score - Use highest individual token score
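
The first five methods can be illustrated with a short sketch that reduces per-token values to a single number. It assumes each token already has a scalar score; the aggregate function is hypothetical. max_score is left out because the list above does not say how it differs from max.

```python
AGGREGATORS = {
    "average": lambda scores: sum(scores) / len(scores),
    "final": lambda scores: scores[-1],
    "first": lambda scores: scores[0],
    "max": max,
    "min": min,
}


def aggregate(token_scores, method):
    """Reduce per-token scores to one value using the named method."""
    if not token_scores:
        raise ValueError("token_scores must not be empty")
    if method not in AGGREGATORS:
        raise ValueError(f"unknown aggregation method: {method}")
    return AGGREGATORS[method](token_scores)


token_scores = [0.5, 1.0, 0.0, 0.25]  # illustrative per-token values
results = {name: aggregate(token_scores, name) for name in AGGREGATORS}
print(results)
```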
