Models process information as numbers: internal activations that change with every input. Our technology provides visibility into these activations, allowing you to identify when the model is producing undesired outputs, such as hallucinations, faulty code, or harmful content.
Furthermore, Wisent enables you to directly modify these activations to improve the model's performance and output quality.
We show the model contrastive pairs: two strings, one that exhibits a specific behavior and one that does not. The difference between the activations these strings produce inside the model isolates that behavior.
Neural activations that correspond to specific behaviors are mapped.
These patterns are surgically edited to enhance desired capabilities.
The model produces outputs with the specifically enhanced capabilities.
The next step is to analyze the collected activations to create either a classifier that reads the model's intent or a steering vector that actively controls its behavior.
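As a rough illustration of this step, the sketch below builds both artifacts from toy activations. It assumes a difference-of-means approach, which is one common way to derive a steering vector and is not necessarily the method Wisent uses. The function names (`steering_vector`, `make_classifier`) and the three-dimensional activations are hypothetical, and real activations would come from a chosen layer of the model.

```python
from statistics import fmean


def mean_vector(rows):
    """Element-wise mean of equally sized activation vectors."""
    return [fmean(column) for column in zip(*rows)]


def dot(a, b):
    """Dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def steering_vector(positive_acts, negative_acts):
    """Difference of class means (assumed technique).

    The result points from the 'negative' behavior toward the 'positive' one.
    """
    pos_mean = mean_vector(positive_acts)
    neg_mean = mean_vector(negative_acts)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


def make_classifier(positive_acts, negative_acts):
    """Build a simple linear classifier from the same contrastive data.

    An activation is labeled positive when its projection onto the
    steering direction exceeds the projection of the class midpoint.
    """
    direction = steering_vector(positive_acts, negative_acts)
    pos_mean = mean_vector(positive_acts)
    neg_mean = mean_vector(negative_acts)
    midpoint = [(p + n) / 2 for p, n in zip(pos_mean, neg_mean)]
    threshold = dot(direction, midpoint)

    def classify(activation):
        return dot(direction, activation) > threshold

    return classify


# Toy activations standing in for one layer's output on each string of a pair.
positive = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
negative = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

vector = steering_vector(positive, negative)
classify = make_classifier(positive, negative)
```

The same direction serves both purposes here: projecting onto it reads the behavior, and adding it to the activations pushes the model toward that behavior.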
At inference time, there are further choices to optimize: whether the steering should be conditional, how strong it should be, and whether to apply it to all tokens or only a select few. The effectiveness of this process also depends on the specific trait being targeted, as some concepts are easier to steer than others. For example, making a model more positive is simpler than improving its code quality.
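The three inference-time choices can be pictured as parameters of a single function. The sketch below is a hypothetical helper, not Wisent's API: `apply_steering`, `strength`, `token_positions`, and `gate` are names invented for illustration, and it assumes steering is done by adding a scaled vector to each selected token's activation.

```python
def apply_steering(hidden_states, vector, strength=1.0,
                   token_positions=None, gate=None):
    """Return a steered copy of per-token activations.

    hidden_states:   list of activation vectors, one per token.
    vector:          steering vector with the same dimension.
    strength:        scalar multiplier for the steering vector.
    token_positions: indices of tokens to steer; None means all tokens.
    gate:            optional callable(activation) -> bool; when given,
                     a token is steered only if the gate returns True.
    """
    if token_positions is None:
        selected = set(range(len(hidden_states)))
    else:
        selected = set(token_positions)

    steered = []
    for index, activation in enumerate(hidden_states):
        should_steer = index in selected and (gate is None or gate(activation))
        if should_steer:
            steered.append([x + strength * v for x, v in zip(activation, vector)])
        else:
            # Copy untouched activations so the input is never mutated.
            steered.append(list(activation))
    return steered


# Toy activations for a three-token sequence.
hidden = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
vector = [1.0, -1.0]

all_tokens = apply_steering(hidden, vector, strength=0.5)
last_only = apply_steering(hidden, vector, strength=2.0, token_positions=[2])
gated = apply_steering(hidden, vector, gate=lambda h: h[0] < 0.5)
```

In practice the gate could be a classifier trained on the same contrastive pairs, so that steering is applied only when the unwanted behavior is detected. Suitable values for the strength and the token selection are typically found by experiment for each trait.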
Discover the unique control and performance Wisent technology can deliver for your models.