check-linearity

Analyze the linearity of concept representations in model activations. This command determines whether a concept is linearly separable across layers, helping you choose the best steering method and identify optimal layers for modification.

Basic Usage
python -m wisent check-linearity PAIRS_FILE [OPTIONS]

Examples

Basic Linearity Check
python -m wisent check-linearity ./pairs/truthfulqa.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --verbose

With Specific Extraction Strategy
python -m wisent check-linearity ./pairs/helpfulness.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --extraction-strategy chat_last \
  --linear-threshold 0.8 \
  --output ./linearity_results.json

Analyze Specific Layers
python -m wisent check-linearity ./pairs/bias.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --layers 8,12,15,20 \
  --max-pairs 100 \
  --min-cohens-d 1.5

Full Geometry Detection
python -m wisent check-linearity ./pairs/refusal.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --optimization-steps 100 \
  --weak-threshold 0.6 \
  --verbose

Arguments

Required

Argument      Description
pairs_file    Path to JSON file containing contrastive pairs

Model Configuration

Argument    Default                  Description
--model     Llama-3.2-1B-Instruct    Model to use for activation collection
--device    auto                     Device to run model on (auto, cuda, mps, cpu)
--layers    auto-select              Comma-separated layer indices to test

Analysis Options

Argument                 Default    Description
--extraction-strategy    all        Extraction strategy (or test multiple if not specified)
--max-pairs              50         Maximum number of pairs to use for analysis
--optimization-steps     50         Optimization steps for geometry detection

Thresholds

Argument              Default    Description
--linear-threshold    0.7        Score threshold to declare LINEAR
--weak-threshold      0.5        Score threshold to declare WEAKLY_LINEAR
--min-cohens-d        1.0        Minimum Cohen's d for meaningful separation
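
To make the --min-cohens-d threshold more concrete, here is a minimal sketch of Cohen's d computed on activations projected onto a difference-of-means direction. It is an illustration only, not Wisent's implementation: the function names, the synthetic "activations", and the choice of projection direction are all assumptions made for the example.

```python
import math
import random
from statistics import mean, variance


def cohens_d(pos, neg):
    """Cohen's d between two 1-D samples, using the pooled sample variance."""
    n_pos, n_neg = len(pos), len(neg)
    pooled_var = (
        (n_pos - 1) * variance(pos) + (n_neg - 1) * variance(neg)
    ) / (n_pos + n_neg - 2)
    return (mean(pos) - mean(neg)) / math.sqrt(pooled_var)


def projection_cohens_d(pos_acts, neg_acts):
    """Project vectors onto the difference-of-means direction, then compute d.

    Assumption: the real tool may choose its direction differently.
    """
    dim = len(pos_acts[0])
    diff = [
        mean(v[i] for v in pos_acts) - mean(v[i] for v in neg_acts)
        for i in range(dim)
    ]
    norm = math.sqrt(sum(x * x for x in diff))
    direction = [x / norm for x in diff]

    def project(vec):
        return sum(a * b for a, b in zip(vec, direction))

    return cohens_d(
        [project(v) for v in pos_acts],
        [project(v) for v in neg_acts],
    )


# Synthetic stand-in for activations: the two classes differ along one axis.
rng = random.Random(0)
dim = 8
neg_acts = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(50)]
pos_acts = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(50)]
for vec in pos_acts:
    vec[0] += 3.0  # shift the positive class along the first coordinate

d = projection_cohens_d(pos_acts, neg_acts)
print(f"Cohen's d along the difference-of-means direction: {d:.2f}")
print("Meets a minimum of 1.0:", d >= 1.0)
```

A larger d means the two groups of projected activations overlap less, so raising --min-cohens-d asks for a cleaner separation before it counts as meaningful.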

Output

Argument     Description
--output     Output file path for results JSON
--verbose    Show detailed results for all configurations
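
If you want to script the command rather than type it by hand, one option is to assemble the argument list in Python. The sketch below uses only the flags documented above. The helper name build_check_linearity_command is hypothetical, and the call that actually runs the command is left commented out because it requires wisent to be installed.

```python
import shlex
import sys


def build_check_linearity_command(
    pairs_file,
    model=None,
    layers=None,
    max_pairs=None,
    output=None,
    verbose=False,
):
    """Build an argv list for `python -m wisent check-linearity`.

    Hypothetical helper: only flags listed in the tables above are emitted,
    and options left as None fall back to the CLI defaults.
    """
    cmd = [sys.executable, "-m", "wisent", "check-linearity", pairs_file]
    if model is not None:
        cmd += ["--model", model]
    if layers is not None:
        # --layers takes comma-separated layer indices
        cmd += ["--layers", ",".join(str(i) for i in layers)]
    if max_pairs is not None:
        cmd += ["--max-pairs", str(max_pairs)]
    if output is not None:
        cmd += ["--output", output]
    if verbose:
        cmd.append("--verbose")
    return cmd


cmd = build_check_linearity_command(
    "./pairs/bias.json",
    model="meta-llama/Llama-3.1-8B-Instruct",
    layers=[8, 12, 15, 20],
    max_pairs=100,
    output="./linearity_results.json",
)
print(shlex.join(cmd))

# To execute (requires wisent to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run avoids shell quoting issues with paths that contain spaces.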

Linearity Classifications

  • LINEAR - Concept is linearly separable; use CAA or simple directional methods
  • WEAKLY_LINEAR - Some linear structure; consider PRISM or hyperplane methods
  • NON_LINEAR - Complex geometry; use TITAN or PULSE for multi-direction steering
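
The relationship between a score, the two thresholds, and the three classifications can be sketched as follows. This is a hedged illustration rather than the tool's actual logic: the function name is hypothetical, and the use of an inclusive comparison (>=) at each threshold is an assumption. The default values match --linear-threshold and --weak-threshold above.

```python
def classify_linearity(score, linear_threshold=0.7, weak_threshold=0.5):
    """Map a linearity score to one of the three classifications.

    Assumption: a score exactly at a threshold counts as meeting it.
    """
    if score >= linear_threshold:
        return "LINEAR"
    if score >= weak_threshold:
        return "WEAKLY_LINEAR"
    return "NON_LINEAR"


# Steering suggestions taken from the classification list above.
SUGGESTED_METHODS = {
    "LINEAR": "CAA or simple directional methods",
    "WEAKLY_LINEAR": "PRISM or hyperplane methods",
    "NON_LINEAR": "TITAN or PULSE for multi-direction steering",
}

for score in (0.85, 0.6, 0.3):
    label = classify_linearity(score)
    print(f"score={score:.2f} -> {label}: {SUGGESTED_METHODS[label]}")
```

Raising --linear-threshold (for example to 0.8) makes the LINEAR label stricter, so a score of 0.75 would fall back to WEAKLY_LINEAR under this sketch.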
