Extract model activations from contrastive pairs. This command loads pairs, runs them through the model, and saves the layer activations for use in steering vector generation.
```bash
python -m wisent get-activations PAIRS_FILE --output FILE [OPTIONS]
```

```bash
python -m wisent get-activations ./pairs/truthfulqa.json \
  --output ./activations/truthfulqa_enriched.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --layers 8,12,15
```

```bash
python -m wisent get-activations ./pairs/helpfulness.json \
  --output ./activations/helpfulness_enriched.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --extraction-strategy chat_last \
  --layers all
```

```bash
python -m wisent get-activations ./pairs/bias.json \
  --output ./activations/bias_raw.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --raw \
  --verbose
```
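If you prefer to drive the command from a script, the sketch below writes a small pairs file and then calls the documented CLI. The pairs schema shown (the `prompt`, `positive`, and `negative` fields) and the file paths are assumptions for illustration only, not the confirmed wisent format; the flags match the tables below.

```python
import json
import os
import subprocess

os.makedirs("./pairs", exist_ok=True)
os.makedirs("./activations", exist_ok=True)

# Hypothetical contrastive pairs -- field names are assumptions,
# not the confirmed wisent pairs schema.
pairs = [
    {
        "prompt": "Is the Earth flat?",
        "positive": "No, the Earth is an oblate spheroid.",
        "negative": "Yes, the Earth is flat.",
    }
]

with open("./pairs/example.json", "w") as f:
    json.dump(pairs, f, indent=2)

# Invoke the documented CLI with flags from the argument tables.
subprocess.run(
    [
        "python", "-m", "wisent", "get-activations", "./pairs/example.json",
        "--output", "./activations/example_enriched.json",
        "--model", "meta-llama/Llama-3.1-8B-Instruct",
        "--layers", "8,12,15",
    ],
    check=True,
)
```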
| Argument | Default | Description |
|---|---|---|
| pairs_file | required | Path to JSON file containing contrastive pairs |
| --output | required | Output file path for pairs with activations (JSON) |
| --model | Llama-3.2-1B-Instruct | Model identifier (HuggingFace model name or path) |
| --device | auto | Device to run on (cuda, cpu, mps) |
| --layers | all | Comma-separated layer indices, or 'all' |
| --extraction-strategy | chat_mean | How to extract activation vectors from tokens |
| --raw | false | Output raw hidden states [seq_len, hidden_size] |
| --batch-size | 1 | Batch size for processing |
| --limit | all | Maximum number of pairs to process |
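Because the output is plain JSON, you can inspect the enriched pairs without relying on a particular schema. The sketch below (using the output path from the first example) just loads the file and reports its top-level structure:

```python
import json

# Load the enriched output written by get-activations (plain JSON).
with open("./activations/truthfulqa_enriched.json") as f:
    enriched = json.load(f)

# Report the top-level structure without assuming specific field names.
if isinstance(enriched, list):
    print("entries:", len(enriched))
    if enriched and isinstance(enriched[0], dict):
        print("first entry keys:", sorted(enriched[0].keys()))
elif isinstance(enriched, dict):
    print("top-level keys:", sorted(enriched.keys()))
```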
Valid --extraction-strategy values depend on the model type.

Chat models: e.g. chat_mean (default), chat_last

Base models:
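As a conceptual illustration only (not the tool's internal implementation), the difference between a mean-pooled strategy such as chat_mean, a last-token strategy such as chat_last, and the --raw output comes down to how a [seq_len, hidden_size] hidden-state matrix for one layer is reduced:

```python
import torch

seq_len, hidden_size = 12, 4096
hidden_states = torch.randn(seq_len, hidden_size)  # one layer, one side of a pair

raw = hidden_states                      # --raw: keep the full [seq_len, hidden_size] matrix
mean_pooled = hidden_states.mean(dim=0)  # chat_mean-style: average over tokens -> [hidden_size]
last_token = hidden_states[-1]           # chat_last-style: final token only -> [hidden_size]

print(raw.shape, mean_pooled.shape, last_token.shape)
```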