A Python package for latent-space monitoring and guardrails, delivered to you by the Wisent team led by Lukasz Bartoszcze.
Wisent-Guard uses representation engineering to make your AI safer and more trustworthy. Unlock the true potential of your LLM with layer-level control. With our tools, you can cut hallucinations by 43% and harmful responses by 88%, all through the power of controlling intermediate representations: thoughts hidden deep inside the AI brain.

Model: a set of weights used to generate responses. At the moment, Wisent only works with open-source models. Each model has a distinct parameter size and special tokens that mark the beginning of the model response and the user query.

Representation: a high-level concept embedded within the weights of the neural network. To be honest, the exact definition of a representation can be a bit difficult to pin down. It can be really wide, like a representation of hallucination or good coding ability, or pretty narrow, like knowledge of a particular historical fact or the ability to perform a particular task. Representations are acquired in training through a process known as representation learning. Representation engineering, however, focuses on observing and changing representations at runtime.
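To make "intermediate representations" concrete, here is a minimal sketch of pulling one layer's hidden states out of a model with the Hugging Face transformers library. This is not the wisent-guard API, just the underlying idea; the model name is copied from the CLI examples below and the prompt is arbitrary.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a model and tokenizer (same model as the CLI examples below).
model_name = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)

# Run a prompt through the model and keep the per-layer hidden states.
inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple: the embedding output plus one tensor per layer,
# each of shape (batch, sequence_length, hidden_size). Index 15 matches the
# --layer 15 used in the examples below.
layer_15_activations = outputs.hidden_states[15]
print(layer_15_activations.shape)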
pip install wisent-guard
Run the MMLU benchmark with classification:
python -m wisent_guard.cli tasks mmlu --model meta-llama/Llama-3.1-8B-Instruct --layer 15 --limit 10 --classifier-type logistic --verbose
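Classification mode trains a lightweight probe on those activations and uses it to flag responses. The following is a rough sketch of that idea with a scikit-learn logistic regression, not wisent-guard's actual classifier; it reuses the model and tokenizer loaded in the sketch above, and the two toy prompts and their labels are invented purely for illustration.

import numpy as np
import torch
from sklearn.linear_model import LogisticRegression

def last_token_activation(text: str, layer: int = 15) -> np.ndarray:
    """Return the chosen layer's activation for the final token of `text`."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.hidden_states[layer][0, -1].float().numpy()

# Toy contrastive pair; in practice the labels come from a benchmark such as
# MMLU or from curated harmful/harmless prompt sets.
texts = ["Paris is the capital of France.", "Paris is the capital of Germany."]
labels = [0, 1]  # 0 = acceptable, 1 = undesired (e.g. hallucinated)

X = np.stack([last_token_activation(t) for t in texts])
probe = LogisticRegression(max_iter=1000).fit(X, labels)

# Score an activation: probability that it looks like the undesired class.
print(probe.predict_proba(X[:1])[:, 1])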
Run the HellaSwag benchmark with steering:
python -m wisent_guard.cli tasks hellaswag --model meta-llama/Llama-3.1-8B-Instruct --layer 15 --limit 5 --steering-mode --steering-strength 1.0 --verbose
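Steering goes one step further: instead of only classifying activations, it nudges them while the model generates. Below is a conceptual sketch of activation steering through a PyTorch forward hook, again not wisent-guard's implementation; it reuses the model and tokenizer from the first sketch, substitutes a random placeholder where a real steering vector (typically derived from contrastive activations) would go, and scales it with a strength factor analogous to --steering-strength.

import torch

steering_strength = 1.0  # analogous to --steering-strength in the CLI example
hidden_size = model.config.hidden_size

# Placeholder direction; a real steering vector is usually the difference
# between mean activations of desirable and undesirable examples.
steering_vector = torch.randn(hidden_size)
steering_vector = steering_vector / steering_vector.norm()

def steer(module, inputs, output):
    # Decoder layers return a tuple whose first element is the hidden states;
    # shift them along the steering direction and pass the rest through.
    hidden = output[0] + steering_strength * steering_vector.to(
        dtype=output[0].dtype, device=output[0].device
    )
    return (hidden,) + output[1:]

layer = model.model.layers[15]  # same layer index as the CLI examples
handle = layer.register_forward_hook(steer)

prompt = tokenizer("Tell me about the moon landing.", return_tensors="pt")
generated = model.generate(**prompt, max_new_tokens=40)
print(tokenizer.decode(generated[0], skip_special_tokens=True))

handle.remove()  # remove the hook so later generations are unsteered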