Steering

Steering is the process of influencing the activations of a particular model so that it generates tokens that are better aligned with the user's preferences.

Steering Methods

Implementation Details

For a complete understanding of how steering works in Wisent-Guard, including the full implementation of activation modification, hook management, and steering strategies, explore the source code:

View steering.py on GitHub