Analyze the linearity of concept representations in model activations. This command determines whether a concept is linearly separable across layers, helping you choose the best steering method and identify optimal layers for modification.
python -m wisent check-linearity PAIRS_FILE [OPTIONS]
python -m wisent check-linearity ./pairs/truthfulqa.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --verbose

python -m wisent check-linearity ./pairs/helpfulness.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --extraction-strategy chat_last \
  --linear-threshold 0.8 \
  --output ./linearity_results.json

python -m wisent check-linearity ./pairs/bias.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --layers 8,12,15,20 \
  --max-pairs 100 \
  --min-cohens-d 1.5

python -m wisent check-linearity ./pairs/refusal.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --optimization-steps 100 \
  --weak-threshold 0.6 \
  --verbose
| Argument | Description |
|---|---|
| pairs_file | Path to JSON file containing contrastive pairs |
| Argument | Default | Description |
|---|---|---|
| --model | Llama-3.2-1B-Instruct | Model to use for activation collection |
| --device | auto | Device to run model on (auto, cuda, mps, cpu) |
| --layers | auto-select | Comma-separated layer indices to test |
| Argument | Default | Description |
|---|---|---|
| --extraction-strategy | all | Extraction strategy to use; if not specified, multiple strategies are tested |
| --max-pairs | 50 | Maximum number of pairs to use for analysis |
| --optimization-steps | 50 | Optimization steps for geometry detection |
| Argument | Default | Description |
|---|---|---|
| --linear-threshold | 0.7 | Score threshold to declare LINEAR |
| --weak-threshold | 0.5 | Score threshold to declare WEAKLY_LINEAR |
| --min-cohens-d | 1.0 | Minimum Cohen's d for meaningful separation |
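
To make the three thresholds more concrete, here is a minimal Python sketch of how a score and an effect size might be interpreted. It is not Wisent's implementation. The `cohens_d` function uses the standard pooled-standard-deviation formula. The `classify` helper, the `>=` comparisons, and the `NONLINEAR` fallback label are assumptions made for illustration; only the `LINEAR` and `WEAKLY_LINEAR` labels and the default values come from the table above. How the tool combines the score with Cohen's d internally is not documented here, so the sketch keeps them separate.

```python
import math
from statistics import mean, variance


def cohens_d(pos, neg):
    """Cohen's d between two samples, using the pooled sample standard deviation."""
    n1, n2 = len(pos), len(neg)
    pooled_var = ((n1 - 1) * variance(pos) + (n2 - 1) * variance(neg)) / (n1 + n2 - 2)
    return (mean(pos) - mean(neg)) / math.sqrt(pooled_var)


def classify(score, linear_threshold=0.7, weak_threshold=0.5):
    """Map a linearity score to a label (hypothetical helper, not part of wisent)."""
    if score >= linear_threshold:
        return "LINEAR"
    if score >= weak_threshold:
        return "WEAKLY_LINEAR"
    return "NONLINEAR"  # assumed fallback label, not taken from the docs


# Toy 1-D projections of positive and negative examples onto a candidate direction.
positive = [2.0, 4.0, 6.0]
negative = [1.0, 3.0, 5.0]

d = cohens_d(positive, negative)
print(f"Cohen's d = {d:.2f}, meaningful at --min-cohens-d 1.0: {d >= 1.0}")
print(classify(0.75), classify(0.6), classify(0.3))
```

In this toy case the two groups overlap heavily, so the effect size falls below the default `--min-cohens-d` of 1.0. Raising `--linear-threshold` or `--min-cohens-d` makes the check stricter in the same way.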
| Argument | Description |
|---|---|
| --output | Output file path for results JSON |
| --verbose | Show detailed results for all configurations |
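
If you want to run the check from a script, for example over several pairs files, one option is to assemble the command line in Python and call it with `subprocess`. The sketch below uses only flags documented on this page. The `build_command` and `run_check` helpers are hypothetical names, and the structure of the JSON written by `--output` is not described here, so the sketch does not try to parse it.

```python
import subprocess
import sys


def build_command(pairs_file, model=None, layers=None, output=None, verbose=False):
    """Assemble a check-linearity invocation (hypothetical helper, not part of wisent)."""
    cmd = [sys.executable, "-m", "wisent", "check-linearity", pairs_file]
    if model:
        cmd += ["--model", model]
    if layers:
        # --layers expects comma-separated indices, e.g. 8,12,15,20
        cmd += ["--layers", ",".join(str(i) for i in layers)]
    if output:
        cmd += ["--output", output]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_check(pairs_file, **options):
    """Run the command and raise CalledProcessError if it exits with a non-zero status."""
    return subprocess.run(build_command(pairs_file, **options), check=True)


if __name__ == "__main__":
    # Print the command instead of running it; call run_check(...) to execute it.
    print(" ".join(build_command(
        "./pairs/bias.json",
        model="meta-llama/Llama-3.1-8B-Instruct",
        layers=[8, 12, 15, 20],
        output="./linearity_results.json",
    )))
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.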