check-linearity

Analyze the linearity of concept representations in model activations. This command tests whether a concept is linearly separable in the layers you analyze, which helps you choose a steering method and identify the best layers to modify.

Basic Usage

python -m wisent check-linearity PAIRS_FILE [OPTIONS]

Examples

Basic Linearity Check

python -m wisent check-linearity ./pairs/truthfulqa.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --verbose

With Specific Extraction Strategy

python -m wisent check-linearity ./pairs/helpfulness.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --extraction-strategy chat_last \
  --linear-threshold 0.8 \
  --output ./linearity_results.json

Analyze Specific Layers

python -m wisent check-linearity ./pairs/bias.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --layers 8,12,15,20 \
  --max-pairs 100 \
  --min-cohens-d 1.5

Full Geometry Detection

python -m wisent check-linearity ./pairs/refusal.json \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --optimization-steps 100 \
  --weak-threshold 0.6 \
  --verbose

Arguments

Required

Argument	Description
pairs_file	Path to JSON file containing contrastive pairs

Model Configuration

Argument	Default	Description
--model	Llama-3.2-1B-Instruct	Model to use for activation collection
--device	auto	Device to run model on (auto, cuda, mps, cpu)
--layers	auto-select	Comma-separated layer indices to test

Analysis Options

Argument	Default	Description
--extraction-strategy	all	Extraction strategy to use; when omitted, multiple strategies are tested
--max-pairs	50	Maximum number of pairs to use for analysis
--optimization-steps	50	Optimization steps for geometry detection

Thresholds

Argument	Default	Description
--linear-threshold	0.7	Score threshold to declare LINEAR
--weak-threshold	0.5	Score threshold to declare WEAKLY_LINEAR
--min-cohens-d	1.0	Minimum Cohen's d for meaningful separation
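
To make the thresholds concrete, here is a minimal Python sketch of how a linearity score and Cohen's d could be turned into a classification. It is not taken from the wisent source: the function names, the use of `>=` comparisons, and the idea that the score lies on the same 0 to 1 scale as the thresholds are assumptions for illustration. The Cohen's d formula itself is the standard pooled-standard-deviation definition.

```python
import math
from statistics import mean, variance


def cohens_d(pos, neg):
    """Standard Cohen's d (pooled standard deviation) for two 1-D samples."""
    n1, n2 = len(pos), len(neg)
    pooled_var = ((n1 - 1) * variance(pos) + (n2 - 1) * variance(neg)) / (n1 + n2 - 2)
    return (mean(pos) - mean(neg)) / math.sqrt(pooled_var)


def classify(score, linear_threshold=0.7, weak_threshold=0.5):
    """Hypothetical mapping from a score to a label.

    Defaults mirror --linear-threshold and --weak-threshold; the exact
    comparison the tool performs is an assumption.
    """
    if score >= linear_threshold:
        return "LINEAR"
    if score >= weak_threshold:
        return "WEAKLY_LINEAR"
    return "NON_LINEAR"


def is_meaningful(d, min_cohens_d=1.0):
    """Assumed check: separation counts only if |d| reaches --min-cohens-d."""
    return abs(d) >= min_cohens_d


if __name__ == "__main__":
    # Toy projections of positive and negative examples onto one direction.
    d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
    print(d, is_meaningful(d), classify(0.75))
```

Raising `--linear-threshold` or `--min-cohens-d` makes the LINEAR verdict harder to reach, which is the effect the third and second examples above rely on.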

Output

Argument	Description
--output	Output file path for results JSON
--verbose	Show detailed results for all configurations
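
When running the check over several pairs files, it can be convenient to assemble the command line from a script. The sketch below uses only flags documented on this page; the helper name `build_check_linearity_cmd` and the `./results/` output paths are hypothetical, and the script prints the commands rather than executing them.

```python
import shlex


def build_check_linearity_cmd(pairs_file, model=None, layers=None,
                              output=None, verbose=False):
    """Build an argument list for `python -m wisent check-linearity`.

    Hypothetical helper: only flags listed in the Arguments section are used.
    """
    cmd = ["python", "-m", "wisent", "check-linearity", pairs_file]
    if model:
        cmd += ["--model", model]
    if layers:
        # --layers expects comma-separated layer indices, e.g. 8,12,15,20
        cmd += ["--layers", ",".join(str(i) for i in layers)]
    if output:
        cmd += ["--output", output]
    if verbose:
        cmd.append("--verbose")
    return cmd


if __name__ == "__main__":
    for name in ["truthfulqa", "bias"]:
        cmd = build_check_linearity_cmd(
            f"./pairs/{name}.json",
            model="meta-llama/Llama-3.1-8B-Instruct",
            layers=[8, 12, 15, 20],
            output=f"./results/{name}_linearity.json",  # hypothetical path
        )
        # Print a copy-pasteable command; pass cmd to subprocess.run to execute.
        print(shlex.join(cmd))
```

Returning a list instead of a single string avoids shell-quoting problems if the command is later passed to `subprocess.run`.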

Linearity Classifications

LINEAR - Concept is linearly separable; use CAA or simple directional methods
WEAKLY_LINEAR - Some linear structure; consider PRISM or hyperplane methods
NON_LINEAR - Complex geometry; use TITAN or PULSE for multi-direction steering

Related Commands

diagnose-vectors - Diagnose steering vectors
diagnose-pairs - Diagnose contrastive pairs
geometry-search - Search for optimal geometry
