An end-to-end pipeline that generates steering vectors from synthetic contrastive pairs. In a single step, this command generates pairs for the specified trait, collects activations, and computes the steering vectors.
python -m wisent generate-vector-from-synthetic --trait TRAIT --output FILE [OPTIONS]
python -m wisent generate-vector-from-synthetic \
  --trait "helpfulness" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --output ./vectors/helpfulness.json

python -m wisent generate-vector-from-synthetic \
  --trait "responds with a formal academic tone" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --num-pairs 50 \
  --similarity-threshold 0.85 \
  --output ./vectors/formal_tone.json

python -m wisent generate-vector-from-synthetic \
  --trait "toxicity" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --pairs-cache-dir ./cache/ \
  --keep-intermediate \
  --intermediate-dir ./intermediate/ \
  --verbose \
  --output ./vectors/toxicity.json

python -m wisent generate-vector-from-synthetic \
  --trait "creativity" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --layers 8,12,16,20 \
  --num-pairs 30 \
  --output ./vectors/creativity.json
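
When the command is driven from a script, building the argument list programmatically avoids shell-quoting mistakes with multi-word traits. The following is a minimal sketch, not part of wisent: `build_command` is a hypothetical helper, and it uses only flags documented on this page.

```python
import shlex
import sys


def build_command(trait, output, model=None, num_pairs=None, layers=None, verbose=False):
    """Assemble an argv list for generate-vector-from-synthetic (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "wisent", "generate-vector-from-synthetic",
        "--trait", trait,
        "--output", output,
    ]
    if model is not None:
        cmd += ["--model", model]
    if num_pairs is not None:
        cmd += ["--num-pairs", str(num_pairs)]
    if layers is not None:
        # --layers takes comma-separated indices, e.g. 8,12,16,20
        cmd += ["--layers", ",".join(str(i) for i in layers)]
    if verbose:
        cmd.append("--verbose")
    return cmd


cmd = build_command(
    "responds with a formal academic tone",
    "./vectors/formal_tone.json",
    model="meta-llama/Llama-3.1-8B-Instruct",
    num_pairs=50,
    layers=[8, 12, 16, 20],
)
# Print a shell-safe rendering; the multi-word trait is quoted automatically.
print(shlex.join(cmd))
```

To execute the command rather than print it, pass the list to `subprocess.run(cmd, check=True)`; passing a list instead of a string means no shell is involved and the trait needs no manual escaping.
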
| Argument | Description |
|---|---|
| --trait | Trait to generate contrastive pairs for (e.g., 'helpfulness', 'toxicity') |
| --output | Output file path for the final steering vector (JSON) |
| Argument | Default | Description |
|---|---|---|
| --model | Llama-3.2-1B-Instruct | HuggingFace model name or path |
| --device | auto | Device to use (auto, cpu, cuda, mps) |
| Argument | Default | Description |
|---|---|---|
| --num-pairs | 20 | Number of contrastive pairs to generate |
| --similarity-threshold | 0.8 | Cosine similarity threshold for filtering pairs |
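
To make the role of a cosine similarity threshold concrete, here is a minimal sketch of greedy near-duplicate filtering. It rests on an assumption this page does not confirm: that pairs are compared through embedding vectors and that a pair is dropped when it is too similar to one already kept. The function names and the toy vectors are illustrative only; wisent's actual filtering may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def filter_near_duplicates(embeddings, threshold=0.8):
    """Return indices of items kept; an item is dropped if its similarity
    to any previously kept item exceeds the threshold (assumed behavior)."""
    kept = []
    for i, vec in enumerate(embeddings):
        if all(cosine(vec, embeddings[j]) <= threshold for j in kept):
            kept.append(i)
    return kept


# Toy 2-D "embeddings": the second is nearly parallel to the first.
toy = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
kept = filter_near_duplicates(toy, threshold=0.8)
print(kept)  # [0, 2]
```

Under this reading, raising the threshold (for example to 0.85) keeps more pairs, because fewer of them count as duplicates.
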
| Argument | Default | Description |
|---|---|---|
| --layers | all | Comma-separated layer indices or 'all' |
| --method | caa | Steering method to use |
| --normalize | true | L2-normalize steering vectors |
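
As an illustration of what the `caa` method and `--normalize` option plausibly compute, the sketch below assumes `caa` refers to contrastive activation addition in its usual formulation: the steering vector for a layer is the mean activation over positive examples minus the mean over negative examples, optionally scaled to unit L2 norm. The function name and toy activations are hypothetical, and this is not wisent's implementation.

```python
import math


def caa_vector(positive, negative, normalize=True):
    """Difference of mean activations (positive minus negative) for one layer.

    positive, negative: lists of equal-length activation vectors.
    """
    dim = len(positive[0])
    mean_pos = [sum(row[d] for row in positive) / len(positive) for d in range(dim)]
    mean_neg = [sum(row[d] for row in negative) / len(negative) for d in range(dim)]
    diff = [p - n for p, n in zip(mean_pos, mean_neg)]
    if normalize:
        norm = math.sqrt(sum(x * x for x in diff))
        if norm > 0:  # leave an all-zero vector untouched
            diff = [x / norm for x in diff]
    return diff


# Toy activations for a single layer with hidden size 2.
pos = [[1.0, 2.0], [3.0, 2.0]]
neg = [[0.0, 2.0], [0.0, 2.0]]
vec = caa_vector(pos, neg)
print(vec)  # [1.0, 0.0]
```

With normalization on, only the direction of the difference survives; the magnitude would then have to be supplied separately when the vector is applied.
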
| Argument | Description |
|---|---|
| --keep-intermediate | Keep intermediate files (pairs and enriched pairs) |
| --intermediate-dir | Directory for intermediate files (default: same as output) |
| --pairs-cache-dir | Directory to cache/load pairs for reuse |
| --force-regenerate | Force regeneration even if cached pairs exist |
| Argument | Description |
|---|---|
| --accept-low-quality-vector | Accept vectors that fail quality checks (convergence, SNR) |
| --verbose | Enable verbose output |
| --timing | Show timing information |