get-activations

Extract model activations from contrastive pairs. This command loads pairs, runs them through the model, and saves the layer activations for use in steering vector generation.
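
To make the link to steering vector generation concrete, here is a minimal sketch of one common way contrastive activations are turned into a steering vector: the difference between the mean positive and mean negative activation at a single layer. This illustrates the general technique only. It is not taken from Wisent's implementation, and the function names (mean_vector, difference_of_means) and the plain-list data layout are assumptions made for this example.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def difference_of_means(positive, negative):
    """Hypothetical helper: mean(positive) - mean(negative), one value per hidden dimension."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - q for p, q in zip(pos_mean, neg_mean)]


# Toy activations for one layer with hidden_size = 2 (made-up numbers).
positive = [[1.0, 2.0], [3.0, 4.0]]
negative = [[0.0, 1.0], [2.0, 1.0]]
steering = difference_of_means(positive, negative)
```

With real data, each inner list would be the activation vector that this command saved for one side of a pair at the chosen layer.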

Basic Usage
python -m wisent get-activations PAIRS_FILE --output FILE [OPTIONS]

Examples

Basic Activation Extraction
python -m wisent get-activations ./pairs/truthfulqa.json \
  --output ./activations/truthfulqa_enriched.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --layers 8,12,15

With Extraction Strategy
python -m wisent get-activations ./pairs/helpfulness.json \
  --output ./activations/helpfulness_enriched.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --extraction-strategy chat_last \
  --layers all

Raw Hidden States
python -m wisent get-activations ./pairs/bias.json \
  --output ./activations/bias_raw.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --raw \
  --verbose

Arguments

Input/Output

Argument      Default     Description
pairs_file    required    Path to JSON file containing contrastive pairs
--output      required    Output file path for pairs with activations (JSON)

Model Configuration

Argument    Default                  Description
--model     Llama-3.2-1B-Instruct    Model identifier (HuggingFace model name or path)
--device    auto                     Device to run on (cuda, cpu, mps)
--layers    all                      Comma-separated layer indices or 'all'
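
The --layers format described above (comma-separated indices or 'all') can be sketched as a small parser. This is an illustrative sketch, not Wisent's code: parse_layers is a hypothetical name, num_layers has to be supplied by the caller, and 0-based indexing is an assumption, since this page does not say whether layers are counted from 0 or 1.

```python
def parse_layers(spec, num_layers):
    """Hypothetical parser for a --layers value such as '8,12,15' or 'all'.

    Assumes 0-based layer indices; adjust the range check if the tool counts from 1.
    """
    spec = spec.strip()
    if spec.lower() == "all":
        return list(range(num_layers))
    layers = []
    for part in spec.split(","):
        index = int(part.strip())  # raises ValueError on non-numeric input
        if not 0 <= index < num_layers:
            raise ValueError(f"layer {index} is outside 0..{num_layers - 1}")
        if index not in layers:  # drop duplicates, keep the given order
            layers.append(index)
    return layers


selected = parse_layers("8,12,15", num_layers=32)
```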

Extraction Options

Argument                 Default      Description
--extraction-strategy    chat_mean    How to extract activation vectors from tokens
--raw                    false        Output raw hidden states [seq_len, hidden_size]
--batch-size             1            Batch size for processing
--limit                  all          Maximum number of pairs to process
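
As a rough illustration of how --limit and --batch-size could interact, the sketch below truncates the pair list first and then yields fixed-size batches. The name iter_batches and the order of operations (limit applied before batching) are assumptions for this example. The page does not describe the internal processing loop.

```python
def iter_batches(pairs, batch_size=1, limit=None):
    """Hypothetical helper: yield lists of at most batch_size pairs.

    limit=None mirrors the documented default of processing all pairs.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    selected = pairs if limit is None else pairs[:limit]
    for start in range(0, len(selected), batch_size):
        yield selected[start:start + batch_size]


# Five placeholder pairs, capped at three, processed two at a time.
batches = list(iter_batches(["p0", "p1", "p2", "p3", "p4"], batch_size=2, limit=3))
```

The last batch may be shorter than batch_size when the number of selected pairs is not a multiple of it.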

Extraction Strategies

Chat models:

  • chat_mean - Average across all response tokens
  • chat_first - First response token
  • chat_last - Last response token
  • chat_max_norm - Token with maximum norm
  • chat_weighted - Weighted average by position
  • role_play - Role-playing extraction
  • mc_balanced - Balanced multiple choice

Base models:

  • completion_last - Last token of completion
  • completion_mean - Mean of completion tokens
  • mc_completion - Multiple choice completion
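
The token-pooling strategies listed above can be read as different reductions of a [seq_len, hidden_size] matrix of response-token hidden states into a single [hidden_size] vector. The sketch below covers the five chat strategies whose descriptions map directly to a reduction. It is an interpretation of those one-line descriptions, not Wisent's implementation: pool_tokens is a hypothetical name, and the linearly increasing weights used for chat_weighted are an assumption, because the actual weighting scheme is not documented here.

```python
import math


def pool_tokens(hidden, strategy):
    """Reduce a [seq_len][hidden_size] list of token vectors to one vector."""
    if not hidden:
        raise ValueError("need at least one token vector")
    n, dim = len(hidden), len(hidden[0])
    if strategy == "chat_first":
        return list(hidden[0])
    if strategy == "chat_last":
        return list(hidden[-1])
    if strategy == "chat_max_norm":
        # Pick the token whose vector has the largest Euclidean norm.
        return list(max(hidden, key=lambda v: math.sqrt(sum(x * x for x in v))))
    if strategy == "chat_mean":
        weights = [1.0] * n
    elif strategy == "chat_weighted":
        # Assumption: later tokens count more, with weights 1, 2, ..., n.
        weights = [float(i + 1) for i in range(n)]
    else:
        raise ValueError(f"strategy not covered by this sketch: {strategy}")
    total = sum(weights)
    return [sum(w * v[d] for w, v in zip(weights, hidden)) / total for d in range(dim)]


# Three response tokens with hidden_size = 2 (made-up numbers).
hidden = [[1.0, 0.0], [3.0, 4.0], [2.0, 2.0]]
pooled = {name: pool_tokens(hidden, name)
          for name in ("chat_mean", "chat_first", "chat_last", "chat_max_norm", "chat_weighted")}
```

With --raw, no such reduction would apply: the full [seq_len, hidden_size] matrix is written out instead of a pooled vector.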
