generate-vector

Generate steering vectors from contrastive pairs or from natural-language trait descriptions. Both single-property and multi-property vectors are supported.

Basic Usage

python -m wisent generate-vector --from-pairs FILE --output FILE [OPTIONS]

Examples

From Existing Pairs File

python -m wisent generate-vector \
  --from-pairs ./pairs/helpfulness.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --layer 15 \
  --output ./vectors/helpfulness.pt

From Trait Description

python -m wisent generate-vector \
  --from-description "responds more helpfully with detailed explanations" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --layer 15 \
  --num-pairs 30 \
  --save-pairs ./pairs/generated.json \
  --output ./vectors/helpfulness.pt

Multi-Property Steering

python -m wisent generate-vector \
  --multi-property \
  --property-files "helpfulness:./pairs/helpful.json:15" \
  --property-files "honesty:./pairs/honest.json:15" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --output ./vectors/multi_property.pt

Multi-Property from Descriptions

python -m wisent generate-vector \
  --multi-property \
  --property-descriptions "helpful:responds helpfully:15" \
  --property-descriptions "honest:admits uncertainty:15" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --output ./vectors/multi_property.pt

Arguments

Pair Source (single property)

Argument	Description
--from-pairs	Path to JSON file containing contrastive pairs
--from-description	Natural language description of the trait

Multi-Property Options

Argument	Description
--multi-property	Enable multi-property steering
--property-files	Property definitions from files (format: name:file:layer)
--property-descriptions	Property definitions from descriptions (format: name:desc:layer)
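
The name:file:layer and name:desc:layer formats above can be illustrated with a small parser. This is a minimal sketch, not the command's actual parsing code: PropertySpec and parse_property_spec are hypothetical names. It also assumes that a description or path may itself contain a colon, so it splits the name off the front and the layer off the back.

```python
from typing import NamedTuple


class PropertySpec(NamedTuple):
    """Hypothetical container for one --property-files / --property-descriptions value."""
    name: str
    source: str  # file path or description text
    layer: int


def parse_property_spec(spec: str) -> PropertySpec:
    """Split a 'name:source:layer' string into its three fields."""
    # Take the name from the front and the layer from the back, so a source
    # that itself contains ':' stays intact (assumption, see lead-in).
    name, first_sep, rest = spec.partition(":")
    source, last_sep, layer_text = rest.rpartition(":")
    if not first_sep or not last_sep or not name or not source:
        raise ValueError(f"expected name:source:layer, got {spec!r}")
    try:
        layer = int(layer_text)
    except ValueError:
        raise ValueError(f"layer must be an integer, got {layer_text!r}") from None
    return PropertySpec(name, source, layer)


if __name__ == "__main__":
    print(parse_property_spec("helpfulness:./pairs/helpful.json:15"))
    print(parse_property_spec("honest:admits uncertainty:15"))
```

Whether the real command tolerates colons inside a description is not stated here, so treat that behavior as an assumption of the sketch.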

Model & Method

Argument	Default	Description
--model	distilgpt2	Model name or path
--layer	0	Layer index at which to apply steering
--method	CAA	Steering method to use
--device	auto	Device to run on
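
For intuition about the default method: if CAA here refers to contrastive activation addition as it is commonly described, the steering vector is the mean difference between activations of the positive and negative sides of each pair at the chosen layer. The sketch below shows only that arithmetic on plain Python lists. The function name and input layout are assumptions for illustration, and the actual implementation may normalize or post-process differently.

```python
def mean_difference_vector(positive, negative):
    """Average (positive - negative) activation differences across pairs.

    positive, negative: lists of equal length; each item is one activation
    vector (a list of floats) taken at the same layer for one side of a pair.
    """
    if not positive or len(positive) != len(negative):
        raise ValueError("need the same non-zero number of positive and negative activations")
    dim = len(positive[0])
    totals = [0.0] * dim
    for pos, neg in zip(positive, negative):
        if len(pos) != dim or len(neg) != dim:
            raise ValueError("all activation vectors must have the same length")
        for i in range(dim):
            totals[i] += pos[i] - neg[i]
    # Divide the summed differences by the number of pairs.
    return [t / len(positive) for t in totals]


if __name__ == "__main__":
    pos = [[1.0, 2.0], [3.0, 4.0]]
    neg = [[0.0, 1.0], [1.0, 1.0]]
    print(mean_difference_vector(pos, neg))
```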

Activation Extraction

Argument	Default	Description
--prompt-construction	multiple_choice	Strategy for constructing prompts
--token-targeting	choice_token	Strategy for targeting tokens

Output

Argument	Description
--output	Output path for steering vector (required)
--num-pairs	Number of pairs to generate when using --from-description (default: 30)
--save-pairs	Save generated pairs to file when using --from-description

Prompt Construction Strategies

multiple_choice - Formats prompts as multiple choice questions
role_playing - Uses role-playing scenarios
direct_completion - Direct text completion format
instruction_following - Instruction-based formatting

Token Targeting Strategies

choice_token - Target the token representing the choice
continuation_token - Target continuation tokens
last_token - Target the last token
first_token - Target the first token
mean_pooling - Average across all tokens
max_pooling - Maximum activation across tokens
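
The four position-based strategies above can be sketched on a toy activation matrix with one row per token. This is an illustrative sketch only: pool_tokens is a hypothetical helper, not part of the wisent API. choice_token and continuation_token are left out because they depend on how the prompt was constructed.

```python
def pool_tokens(activations, strategy):
    """Reduce per-token activations (list of rows, one per token) to one vector."""
    if not activations:
        raise ValueError("activations must contain at least one token")
    if strategy == "last_token":
        return list(activations[-1])
    if strategy == "first_token":
        return list(activations[0])
    # Transpose so each tuple holds one hidden dimension across all tokens.
    columns = list(zip(*activations))
    if strategy == "mean_pooling":
        return [sum(col) / len(col) for col in columns]
    if strategy == "max_pooling":
        return [max(col) for col in columns]
    raise ValueError(f"unsupported strategy: {strategy!r}")


if __name__ == "__main__":
    acts = [[1.0, 4.0], [3.0, 2.0], [2.0, 0.0]]  # 3 tokens, hidden size 2
    for name in ("last_token", "first_token", "mean_pooling", "max_pooling"):
        print(name, pool_tokens(acts, name))
```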

Related Commands

generate-pairs - Generate contrastive pairs only
synthetic - End-to-end pair generation and vector training
multi-steer - Combine multiple vectors at inference
