HPR - Householder Pseudo-Rotation: a steering method that transforms the activation space with Householder matrices for more stable steering.
HPR (Householder Pseudo-Rotation) starts exactly like CAA: it computes the difference between the mean activations of the positive and negative examples. It then converts this difference vector into a Householder matrix, H = I - 2vv^T, where v is the normalized steering vector and I is the identity matrix. H is an orthogonal matrix that reflects activations across the hyperplane orthogonal to v.
During inference, instead of adding the vector to the activations as CAA does, HPR multiplies them by this Householder matrix, rotating the model's internal representations toward the desired behavior. A beta parameter controls the rotation strength. The matrix multiplication is performed on flattened activations, which are then reshaped back to their original shape, and it targets the second-to-last token position.
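Because a Householder matrix is orthogonal, applying it in full preserves the norm of each activation, whereas adding a vector changes it. This approach therefore preserves the geometric structure of the activation space better than simple addition, making it more stable at larger steering strengths.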
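The steps described above can be illustrated with a minimal NumPy sketch. It is not the Wisent-Guard implementation: the function names (`steering_direction`, `householder`, `apply_hpr`) are hypothetical, the activations are random stand-ins with shape (sequence length, hidden size), and the way beta is applied (a linear blend between the original activation and its Householder image) is an assumption, since the text only says that beta controls the rotation strength.

```python
import numpy as np


def steering_direction(pos_acts, neg_acts):
    """Unit vector along the difference of mean activations (the CAA step)."""
    diff = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
    norm = np.linalg.norm(diff)
    if norm == 0.0:
        raise ValueError("positive and negative means coincide")
    return diff / norm


def householder(v):
    """Build H = I - 2 v v^T for a unit vector v."""
    return np.eye(v.shape[0]) - 2.0 * np.outer(v, v)


def apply_hpr(acts, H, beta=1.0, position=-2):
    """Transform the activation at `position` (default: second-to-last token).

    ASSUMPTION: beta linearly blends the original activation with H @ x.
    The real library may apply beta differently.
    """
    out = acts.copy()
    x = out[position]
    out[position] = (1.0 - beta) * x + beta * (H @ x)
    return out


# Random stand-ins for layer activations (illustrative data only).
rng = np.random.default_rng(0)
hidden = 8
pos = rng.normal(size=(16, hidden)) + 1.0   # "positive" examples
neg = rng.normal(size=(16, hidden)) - 1.0   # "negative" examples

v = steering_direction(pos, neg)
H = householder(v)

acts = rng.normal(size=(5, hidden))         # 5 tokens, `hidden` features
steered = apply_hpr(acts, H, beta=1.0)
```

With beta set to 1.0 the transformed activation keeps its original norm, because H is orthogonal; only its component along v changes sign. With the assumed blend, values of beta below 1.0 move the activation only part of the way and no longer preserve the norm exactly.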
python -m wisent_guard.cli tasks confidence_pairs.json --from-json --steering-mode --steering-method HPR --layer 16 --save-steering-vector confidence_hpr.pt
python -m wisent_guard.cli tasks accuracy_pairs.json --from-json --steering-mode --steering-method HPR --layer 14 --hpr-beta 0.8 --save-steering-vector accuracy_hpr.pt
python -m wisent_guard.cli tasks test_scenarios.json --from-json --steering-mode --steering-method HPR --layer 16 --load-steering-vector confidence_hpr.pt --steering-strength 2.0
python -m wisent_guard.cli tasks creativity_pairs.json --from-json --steering-mode --steering-method HPR --layer 13 --enable-token-steering --token-steering-strategy exponential_decay --token-decay-rate 0.7
python -m wisent_guard.cli tasks ethics_pairs.json --from-json --steering-mode --steering-method HPR --layer 17 --device cuda --max-new-tokens 100 --save-steering-vector ethics_hpr.pt
--hpr-beta: Rotation strength parameter (0.0-1.0, default 1.0)
--enable-token-steering: Enable position-based steering
--token-steering-strategy: last_only, second_to_last, first_only, all_equal, exponential_decay, exponential_growth, linear_decay, linear_growth
--token-decay-rate: Decay rate for exponential strategies (default 0.5)
--token-min-strength: Minimum strength for decay strategies (default 0.1)
--token-max-strength: Maximum strength for growth strategies (default 1.0)

For the complete implementation of the HPR steering method in Wisent-Guard, see: