generate-vector

Generate steering vectors from contrastive pairs or trait descriptions. Supports both single-property and multi-property steering vector creation.

Basic Usage
python -m wisent generate-vector --from-pairs FILE --output FILE [OPTIONS]

Examples

From Existing Pairs File
python -m wisent generate-vector \
  --from-pairs ./pairs/helpfulness.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --layer 15 \
  --output ./vectors/helpfulness.pt

From Trait Description
python -m wisent generate-vector \
  --from-description "responds more helpfully with detailed explanations" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --layer 15 \
  --num-pairs 30 \
  --save-pairs ./pairs/generated.json \
  --output ./vectors/helpfulness.pt

Multi-Property Steering
python -m wisent generate-vector \
  --multi-property \
  --property-files "helpfulness:./pairs/helpful.json:15" \
  --property-files "honesty:./pairs/honest.json:15" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --output ./vectors/multi_property.pt

Multi-Property from Descriptions
python -m wisent generate-vector \
  --multi-property \
  --property-descriptions "helpful:responds helpfully:15" \
  --property-descriptions "honest:admits uncertainty:15" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --output ./vectors/multi_property.pt
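
When the same pairs file is used at several layers, the commands above can be assembled programmatically. The following is a minimal sketch, not part of wisent itself: the helper name `build_generate_vector_cmd`, the layer values, and the output file naming are assumptions made for illustration, and only flags documented on this page are used.

```python
import shlex

def build_generate_vector_cmd(pairs_path, output_path, model=None, layer=None):
    """Assemble argv for a single-property run (hypothetical helper)."""
    cmd = ["python", "-m", "wisent", "generate-vector",
           "--from-pairs", pairs_path, "--output", output_path]
    if model is not None:
        cmd += ["--model", model]
    if layer is not None:
        cmd += ["--layer", str(layer)]
    return cmd

# One command per layer; the layer choices here are arbitrary examples.
commands = [
    build_generate_vector_cmd(
        "./pairs/helpfulness.json",
        f"./vectors/helpfulness_l{layer}.pt",
        model="meta-llama/Llama-3.1-8B-Instruct",
        layer=layer,
    )
    for layer in (10, 15, 20)
]

for cmd in commands:
    # Printed for inspection; pass cmd to subprocess.run(cmd, check=True) to execute.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.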

Arguments

Pair Source (single property)

Argument             Description
--from-pairs         Path to JSON file containing contrastive pairs
--from-description   Natural language description of the trait

Multi-Property Options

Argument                  Description
--multi-property          Enable multi-property steering
--property-files          Property definitions from files (format: name:file:layer)
--property-descriptions   Property definitions from descriptions (format: name:desc:layer)
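
The `name:file:layer` and `name:desc:layer` formats can be checked before a long run. The sketch below is an illustration of one way to split such a string, not wisent's actual parser: `PropertySpec` and `parse_property_spec` are hypothetical names, and how wisent treats extra colons inside a path or description is not stated here, so the sketch simply takes the name from the front and the layer from the back.

```python
from typing import NamedTuple

class PropertySpec(NamedTuple):
    name: str
    source: str  # file path or description text
    layer: int

def parse_property_spec(spec: str) -> PropertySpec:
    """Split 'name:source:layer' (hypothetical helper, not wisent's parser)."""
    # Peel the name off the front and the layer off the back so that
    # any colons inside the source part are left untouched.
    name, first_sep, rest = spec.partition(":")
    source, last_sep, layer = rest.rpartition(":")
    if not (first_sep and last_sep and name and source):
        raise ValueError(f"expected name:source:layer, got {spec!r}")
    return PropertySpec(name, source, int(layer))

from_file = parse_property_spec("helpfulness:./pairs/helpful.json:15")
from_desc = parse_property_spec("honest:admits uncertainty:15")
print(from_file)
print(from_desc)
```

A malformed value such as `helpfulness:./pairs/helpful.json` (no layer) raises `ValueError` in this sketch instead of failing later during extraction.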

Model & Method

Argument   Default      Description
--model    distilgpt2   Model name or path
--layer    0            Layer index to apply steering
--method   CAA          Steering method to use
--device   auto         Device to run on
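
Assuming CAA here refers to Contrastive Activation Addition, the core computation is the mean difference between activations of the positive and negative side of each pair at the chosen layer. The sketch below shows only that arithmetic on plain lists; the function name `caa_vector` and the toy activations are illustrative assumptions, and wisent's own implementation may normalize or otherwise post-process the result.

```python
def caa_vector(positive_acts, negative_acts):
    """Mean of (positive - negative) activation differences over all pairs."""
    if not positive_acts or len(positive_acts) != len(negative_acts):
        raise ValueError("need the same non-zero number of positive and negative activations")
    n_pairs = len(positive_acts)
    dim = len(positive_acts[0])
    return [
        sum(pos[i] - neg[i] for pos, neg in zip(positive_acts, negative_acts)) / n_pairs
        for i in range(dim)
    ]

# Toy activations: 2 contrastive pairs, hidden size 3 (made-up numbers).
pos = [[1.0, 2.0, 0.0], [3.0, 2.0, 1.0]]
neg = [[0.0, 2.0, 1.0], [1.0, 0.0, 1.0]]

vector = caa_vector(pos, neg)
print(vector)  # [1.5, 1.0, -0.5]
```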

Activation Extraction

Argument                Default           Description
--prompt-construction   multiple_choice   Strategy for constructing prompts
--token-targeting       choice_token      Strategy for targeting tokens

Output

Argument       Description
--output       Output path for steering vector (required)
--num-pairs    Number of pairs to generate when using --from-description (default: 30)
--save-pairs   Save generated pairs to file when using --from-description

Prompt Construction Strategies

  • multiple_choice - Formats prompts as multiple choice questions
  • role_playing - Uses role-playing scenarios
  • direct_completion - Direct text completion format
  • instruction_following - Instruction-based formatting

Token Targeting Strategies

  • choice_token - Target the token representing the choice
  • continuation_token - Target continuation tokens
  • last_token - Target the last token
  • first_token - Target the first token
  • mean_pooling - Average across all tokens
  • max_pooling - Maximum activation across tokens
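
The position-based and pooling strategies above can be pictured as different ways of reducing a sequence of per-token activation vectors to a single vector. The following sketch is an assumption about that reduction, not wisent's code: `select_activation` is a hypothetical function, pooling is taken element-wise per hidden dimension, and `choice_token` and `continuation_token` are left out because they depend on where the answer sits in the constructed prompt.

```python
def select_activation(token_acts, strategy):
    """Reduce per-token activation vectors to one vector (illustrative only)."""
    if not token_acts:
        raise ValueError("no token activations given")
    if strategy == "last_token":
        return list(token_acts[-1])
    if strategy == "first_token":
        return list(token_acts[0])
    # Transpose so each entry holds one hidden dimension across all tokens.
    columns = list(zip(*token_acts))
    if strategy == "mean_pooling":
        return [sum(col) / len(col) for col in columns]
    if strategy == "max_pooling":
        return [max(col) for col in columns]
    raise ValueError(f"strategy not covered by this sketch: {strategy}")

# Toy input: 3 tokens, hidden size 2 (made-up numbers).
acts = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.0]]

for name in ("last_token", "first_token", "mean_pooling", "max_pooling"):
    print(name, select_activation(acts, name))
```

On the toy input, `last_token` and `mean_pooling` happen to give the same vector, while `max_pooling` mixes values from different tokens, which is why the choice of strategy can change the resulting steering vector.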

Related Commands
