optimize-steering

Different steering methods need their parameters tuned before they work well. The optimize-steering command provides subcommands for automated optimization, comparing methods, finding the best layer and strength, comprehensive optimization, and personalization.

Basic Usage
python -m wisent optimize-steering SUBCOMMAND MODEL [OPTIONS]
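
When the command is driven from a script, the invocation pattern above can be assembled as an argument list. The following is a minimal sketch, not part of wisent: the helper name `build_command` is hypothetical, and it only formats arguments, using flags that appear in the examples on this page.

```python
import sys


def build_command(subcommand, model, **options):
    """Build an argv list for `python -m wisent optimize-steering`.

    Hypothetical helper: option names are converted from snake_case to
    --kebab-case, and list values are expanded into separate arguments.
    """
    argv = [sys.executable, "-m", "wisent", "optimize-steering", subcommand, model]
    for name, value in options.items():
        argv.append("--" + name.replace("_", "-"))
        values = value if isinstance(value, (list, tuple)) else [value]
        argv.extend(str(v) for v in values)
    return argv


cmd = build_command(
    "compare-methods",
    "meta-llama/Llama-3.1-8B-Instruct",
    task="truthfulqa_mc1",
    methods="CAA",
    limit=100,
)
print(" ".join(cmd[1:]))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to execute the command.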

Subcommands

  • auto - Automatically optimize steering based on classification config
  • compare-methods - Compare different steering methods for a task
  • optimize-layer - Find optimal steering layer for a method
  • optimize-strength - Find optimal steering strength
  • comprehensive - Run steering optimization across multiple tasks and methods
  • personalization - Optimize steering parameters for a single personality or trait
  • multi-personalization - Joint optimization for multiple traits

Auto Optimization

Automatically optimizes steering parameters based on the existing classification configuration.

python -m wisent optimize-steering auto \
  meta-llama/Llama-3.1-8B-Instruct \
  --methods CAA \
  --limit 100 \
  --max-time 60 \
  --strength-range 0.5 1.0 1.5 2.0

Argument          Default          Description
--task            all              Specific task to optimize
--methods         CAA              Steering methods to test
--limit           100              Maximum samples for testing
--max-time        60               Maximum time in minutes
--strength-range  0.5 1.0 1.5 2.0  Steering strengths to test
--layer-range     0-5              Layer range to search
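
To judge whether a search fits within --max-time, it can help to count the configurations involved. The sketch below assumes, without confirmation from wisent, that auto evaluates every combination of method, strength, and layer, and that a layer range such as 0-5 includes both endpoints.

```python
from itertools import product

methods = ["CAA"]                  # --methods default
strengths = [0.5, 1.0, 1.5, 2.0]   # --strength-range default
layers = list(range(0, 5 + 1))     # --layer-range 0-5, assumed inclusive

# Assumption: an exhaustive grid over all three dimensions.
configs = list(product(methods, strengths, layers))
print(len(configs))  # 24 combinations under these assumptions
```

Under these assumptions, each added method or strength value multiplies the number of runs, so narrowing --layer-range is often the cheapest way to shorten a search.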

Compare Methods

Compare different steering methods for a specific task.

python -m wisent optimize-steering compare-methods \
  meta-llama/Llama-3.1-8B-Instruct \
  --task truthfulqa_mc1 \
  --methods CAA \
  --limit 100

Optimize Layer

Find the optimal steering layer for a specific method.

python -m wisent optimize-steering optimize-layer \
  meta-llama/Llama-3.1-8B-Instruct \
  --task truthfulqa_mc1 \
  --method CAA \
  --layer-range 10-20 \
  --strength 1.0 \
  --limit 100
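
The --layer-range value is written as START-END. A minimal sketch of how such a string could be expanded into layer indices is shown below; the function name `parse_layer_range` is hypothetical, and treating the upper bound as inclusive is an assumption that this page does not confirm.

```python
def parse_layer_range(spec):
    """Expand a 'START-END' string into a list of layer indices.

    Assumption: both endpoints are included in the search.
    """
    start, end = (int(part) for part in spec.split("-"))
    if start > end:
        raise ValueError(f"invalid layer range: {spec!r}")
    return list(range(start, end + 1))


layers = parse_layer_range("10-20")
print(len(layers), layers[0], layers[-1])  # 11 10 20
```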

Optimize Strength

Find the optimal steering strength for a method.

python -m wisent optimize-steering optimize-strength \
  meta-llama/Llama-3.1-8B-Instruct \
  --task truthfulqa_mc1 \
  --method CAA \
  --layer 15 \
  --strength-range 0.1 2.0 \
  --strength-steps 10
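
Here --strength-range gives a minimum and maximum, and --strength-steps gives the number of values to try between them. The sketch below shows one plausible way to derive the candidate strengths, assuming evenly spaced values with both endpoints included; the actual spacing used by wisent is not stated on this page, and `strength_grid` is a hypothetical name.

```python
def strength_grid(low, high, steps):
    """Return `steps` evenly spaced strengths from low to high (inclusive).

    Assumption: linear spacing; the tool itself may space values differently.
    """
    if steps < 2:
        return [low]
    step = (high - low) / (steps - 1)
    return [round(low + i * step, 4) for i in range(steps)]


grid = strength_grid(0.1, 2.0, 10)
print(grid)
```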

Personalization Optimization

Optimize steering parameters for custom personality/trait steering.

Single Trait Personalization
python -m wisent optimize-steering personalization \
  meta-llama/Llama-3.1-8B-Instruct \
  --trait "evil villain personality" \
  --trait-name evil \
  --num-pairs 20 \
  --num-test-prompts 5 \
  --strength-range 0.5 5.0 \
  --output-dir ./personalization_optimization

Multi-Trait Personalization
python -m wisent optimize-steering multi-personalization \
  meta-llama/Llama-3.1-8B-Instruct \
  --trait "evil personality" \
  --trait "speaks with Italian accent" \
  --trait-name evil \
  --trait-name italian \
  --num-pairs 10 \
  --output-dir ./multi_personalization

Argument              Default                         Description
--trait               required                        Trait description to steer towards
--trait-name          auto                            Short name for the trait
--num-pairs           20                              Number of synthetic pairs to generate
--num-test-prompts    5                               Number of test prompts for evaluation
--layers              all                             Specific layers to test
--strength-range      0.5 5.0                         Min and max steering strength
--num-strength-steps  5                               Number of strength values to test
--output-dir          ./personalization_optimization  Directory for results and vectors
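
The multi-trait example repeats --trait and --trait-name once per trait. A minimal sketch of generating those arguments from a mapping follows; it assumes that names are matched to descriptions by position, which the example suggests but does not state, and the `traits` dictionary is purely illustrative.

```python
# Illustrative mapping of short trait names to trait descriptions.
traits = {
    "evil": "evil personality",
    "italian": "speaks with Italian accent",
}

args = []
# Emit all descriptions first, then all names, in the same order,
# so that the i-th --trait-name lines up with the i-th --trait.
for description in traits.values():
    args += ["--trait", description]
for name in traits:
    args += ["--trait-name", name]

print(args)
```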

Comprehensive Optimization

Run comprehensive steering optimization across multiple tasks and methods.

python -m wisent optimize-steering comprehensive \
  meta-llama/Llama-3.1-8B-Instruct \
  --tasks truthfulqa_mc1 mmlu \
  --methods CAA \
  --limit 100 \
  --max-time-per-task 20
