create-steering-vector

Create a full steering object from a file of enriched contrastive pairs, that is, pairs that already include activations. Unlike simpler vector generation, this command preserves method-specific components such as gates, intensity networks, and multiple steering directions.

Basic Usage

python -m wisent create-steering-vector ENRICHED_PAIRS_FILE --output FILE [OPTIONS]

Examples

Basic CAA Steering Object

python -m wisent create-steering-vector \
  ./enriched_pairs/helpfulness.json \
  --method caa \
  --output ./steering/helpfulness_caa.pt

TITAN Steering Object

python -m wisent create-steering-vector \
  ./enriched_pairs/harmfulness.json \
  --method titan \
  --titan-num-directions 5 \
  --titan-max-alpha 3.0 \
  --output ./steering/harmfulness_titan.pt

PULSE Steering Object

python -m wisent create-steering-vector \
  ./enriched_pairs/honesty.json \
  --method pulse \
  --pulse-gate-temperature 0.1 \
  --output ./steering/honesty_pulse.pt

PRISM Multi-Directional

python -m wisent create-steering-vector \
  ./enriched_pairs/bias.json \
  --method prism \
  --prism-num-directions 3 \
  --prism-optimization-steps 100 \
  --output ./steering/bias_prism.pt
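
When the same command has to be run for several traits or methods, it can help to assemble the argument list programmatically. The sketch below is only an illustration: `build_command` is a hypothetical helper, not part of wisent, and it assumes that every option is passed as `--flag value`, as in the examples above. It only builds the argument list; nothing is executed.

```python
import shlex

def build_command(pairs_file, output, method="caa", **options):
    """Assemble the argv list for create-steering-vector (hypothetical helper).

    Keyword options are mapped to flags by replacing underscores with
    hyphens, e.g. titan_num_directions=5 -> --titan-num-directions 5.
    """
    cmd = ["python", "-m", "wisent", "create-steering-vector",
           pairs_file, "--method", method]
    for name, value in options.items():
        cmd += ["--" + name.replace("_", "-"), str(value)]
    cmd += ["--output", output]
    return cmd

cmd = build_command(
    "./enriched_pairs/harmfulness.json",
    "./steering/harmfulness_titan.pt",
    method="titan",
    titan_num_directions=5,
    titan_max_alpha=3.0,
)
# Print a copy-pasteable shell line; pass `cmd` to subprocess.run(cmd, check=True)
# to actually execute it.
print(shlex.join(cmd))
```

Passing the list form to `subprocess.run` avoids shell quoting problems with paths that contain spaces.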

Arguments

Required Arguments

Argument	Description
enriched_pairs_file	Path to JSON file containing contrastive pairs with activations
--output	Output file path for steering object (.pt or .json)

Steering Method

Argument	Default	Description
--method	caa	Steering method: caa, hyperplane, mlp, prism, pulse, titan
--normalize	True	L2-normalize steering vectors
--no-normalize	-	Disable L2-normalization

MLP Parameters

Argument	Default	Description
--mlp-hidden-dim	256	Hidden dimension for MLP classifier
--mlp-num-layers	2	Number of hidden layers
--mlp-dropout	0.1	Dropout rate
--mlp-epochs	100	Training epochs
--mlp-learning-rate	0.001	Learning rate

Hyperplane Parameters

Argument	Default	Description
--hyperplane-max-iter	1000	Max iterations for logistic regression
--hyperplane-C	1.0	Regularization strength

PRISM Parameters

Argument	Default	Description
--prism-num-directions	3	Number of directions to discover per layer
--prism-optimization-steps	100	Optimization steps
--prism-learning-rate	0.01	Learning rate

PULSE Parameters

Argument	Default	Description
--pulse-sensor-layer	auto	Sensor layer index for gating
--pulse-condition-threshold	0.5	Condition threshold for gating
--pulse-gate-temperature	0.1	Gate temperature
--pulse-learn-threshold	True	Learn optimal threshold
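
To build intuition for how a condition threshold and a gate temperature can interact, here is a minimal sketch of a temperature-scaled sigmoid gate. This is an assumption about the general technique, not wisent's actual PULSE implementation; `soft_gate` and `sensor_score` are hypothetical names, and the defaults simply mirror the table above.

```python
import math

def soft_gate(sensor_score, threshold=0.5, temperature=0.1):
    """Temperature-scaled sigmoid gate (illustrative assumption only).

    Returns a value in (0, 1): near 0 when the score is well below the
    threshold, near 1 when well above. Smaller temperatures make the
    transition around the threshold sharper.
    """
    z = (sensor_score - threshold) / temperature
    # Numerically stable sigmoid: avoid exponentiating large positive values.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

for score in (0.1, 0.5, 0.9):
    sharp = soft_gate(score, temperature=0.1)
    smooth = soft_gate(score, temperature=0.5)
    print(f"score={score:.1f}  T=0.1 -> {sharp:.3f}  T=0.5 -> {smooth:.3f}")
```

Under this sketch, a score exactly at the threshold yields a gate of 0.5 regardless of temperature, while a lower temperature pushes scores on either side more quickly toward 0 or 1.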

TITAN Parameters

Argument	Default	Description
--titan-num-directions	5	Number of directions per layer
--titan-sensor-layer	auto	Sensor layer for gating
--titan-gate-hidden-dim	auto	Gate network hidden dimension
--titan-intensity-hidden-dim	auto	Intensity network hidden dimension
--titan-max-alpha	3.0	Maximum steering intensity
--titan-gate-temperature	0.5	Gate temperature

Display Options

Argument	Description
--verbose	Enable verbose output
--timing	Show timing information

Supported Methods

CAA - Contrastive Activation Addition (simple mean difference)
Hyperplane - Logistic regression-based separating hyperplane
MLP - Multi-layer perceptron classifier
PRISM - Multi-directional steering with multiple directions per layer
PULSE - Conditional gating with sensor layer detection
TITAN - Full adaptive steering with learned gate and intensity networks
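
As a concrete picture of the simplest method in the list, the following sketch computes a CAA-style vector as the difference between the mean positive and mean negative activations, with optional L2-normalization in the spirit of `--normalize`. It is a pure-Python illustration on toy data, not wisent's implementation; `mean_vector` and `caa_vector` are hypothetical names, and real activations would be per-layer tensors rather than short lists.

```python
import math

def mean_vector(rows):
    """Element-wise mean of equal-length activation vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]

def caa_vector(positive, negative, normalize=True):
    """Mean-difference steering vector, optionally scaled to unit L2 norm."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    diff = [p - q for p, q in zip(pos_mean, neg_mean)]
    if normalize:
        norm = math.sqrt(sum(x * x for x in diff))
        if norm > 0:  # leave an all-zero difference untouched
            diff = [x / norm for x in diff]
    return diff

# Toy activations: two positive and two negative examples, two dimensions each.
positive = [[1.0, 2.0], [3.0, 2.0]]
negative = [[-1.0, 2.0], [-3.0, 2.0]]

raw = caa_vector(positive, negative, normalize=False)
unit = caa_vector(positive, negative)
print(raw, unit)
```

On this toy data the two classes differ only along the first dimension, so the resulting vector points entirely along that axis.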

Related Commands

generate-vector - Generate vectors from contrastive pairs
verify-steering - Verify steering alignment
multi-steer - Combine multiple steering objects
