Python API

The Wisent Python API provides a high-level interface for steering AI models using representation engineering. It supports multiple modalities, including text, audio, video, robotics, and vision-language (multimodal) models.

Installation

Install Wisent with pip:

pip install wisent

Basic Usage

A typical workflow has four steps: create a Wisent instance, add contrastive pairs, train steering vectors, and generate output with steering applied.

Basic Text Steering Example

from wisent import Wisent

# Create a Wisent instance for text/LLM steering
wisent = Wisent.for_text("meta-llama/Llama-3-8B-Instruct")

# Add contrastive pairs for a trait
wisent.add_pair(
    positive="I'd be happy to help you with that.",
    negative="I refuse to help with that.",
    trait="helpfulness"
)

# Train steering vectors
wisent.train()

# Generate with steering applied
response = wisent.generate(
    "How do I cook pasta?",
    steer={"helpfulness": 1.5}
)
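
To give some intuition for what training on contrastive pairs can produce, the following toy sketch computes a steering vector as the difference between mean "positive" and mean "negative" activations, then adds it to a hidden state. Difference of means is one common representation-engineering technique; this page does not say which method Wisent uses internally, so treat the function names and the three-dimensional activations here as illustrative assumptions only.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_vector(positive: Sequence[Vector], negative: Sequence[Vector]) -> Vector:
    """Mean positive activation minus mean negative activation."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - q for p, q in zip(pos_mean, neg_mean)]


def apply_steering(hidden: Vector, vector: Vector, scale: float) -> Vector:
    """Shift a hidden state along the steering direction by `scale`."""
    return [h + scale * v for h, v in zip(hidden, vector)]


# Toy activations (hypothetical values); real models use far more dimensions.
pos_acts = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
neg_acts = [[0.0, 2.0, 1.0], [0.0, 2.0, 3.0]]

vec = steering_vector(pos_acts, neg_acts)
steered = apply_steering([0.0, 0.0, 0.0], vec, scale=1.5)
print(vec)      # [2.0, 0.0, -2.0]
print(steered)  # [3.0, 0.0, -3.0]
```

In this picture, the number passed in `steer` (1.5 for "helpfulness" above) plays the role of `scale`: larger values push the hidden state further along the trait direction.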

Factory Methods

Wisent provides factory methods for different modalities:

Method	Use Case
`Wisent.for_text(model_name)`	LLMs and text generation models
`Wisent.for_audio(model_name)`	Audio/speech models (e.g., Whisper)
`Wisent.for_video(model_name)`	Video understanding models
`Wisent.for_robotics(model)`	Robotics policy networks
`Wisent.for_multimodal(model_name)`	Vision-language models (VLMs)
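
When the modality is only known at runtime (for example, read from a config file), a small dispatch helper can pick the factory method by name. The sketch below relies only on the method names listed in the table; the helper `create_steerer` and the modality keys are hypothetical and not part of Wisent.

```python
from typing import Any

# Modality keys (our own choice) mapped to the factory method names in the table.
FACTORY_BY_MODALITY = {
    "text": "for_text",
    "audio": "for_audio",
    "video": "for_video",
    "robotics": "for_robotics",
    "multimodal": "for_multimodal",
}


def create_steerer(wisent_cls: Any, modality: str, model: Any) -> Any:
    """Call the factory method on `wisent_cls` that matches `modality`."""
    try:
        factory_name = FACTORY_BY_MODALITY[modality]
    except KeyError:
        raise ValueError(f"Unknown modality: {modality!r}") from None
    return getattr(wisent_cls, factory_name)(model)
```

With Wisent installed, a call would look like `create_steerer(Wisent, "text", "meta-llama/Llama-3-8B-Instruct")`.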

Core Methods

Method	Description
`add_pair(positive, negative, trait)`	Add a contrastive pair for a trait
`add_pairs(pairs, trait)`	Add multiple contrastive pairs at once
`train(traits, layers, aggregation)`	Train steering vectors from stored pairs
`generate(content, steer)`	Generate output with optional steering
`save_vectors(path)`	Save trained steering vectors to file
`load_vectors(path)`	Load trained steering vectors from file
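
This page does not show the expected format of the `pairs` argument to `add_pairs`. Until that is confirmed, one option is to loop over your own data and call `add_pair`, whose keyword arguments appear in the examples here. The helper name `add_pairs_one_by_one` and the `(positive, negative)` tuple layout below are assumptions made for this sketch.

```python
from typing import Any, Iterable, Tuple


def add_pairs_one_by_one(
    steerer: Any, pairs: Iterable[Tuple[str, str]], trait: str
) -> int:
    """Register each (positive, negative) tuple via add_pair; return the count."""
    count = 0
    for positive, negative in pairs:
        # Reject empty examples early; they carry no contrast signal.
        if not positive.strip() or not negative.strip():
            raise ValueError("Both sides of a contrastive pair must be non-empty")
        steerer.add_pair(positive=positive, negative=negative, trait=trait)
        count += 1
    return count
```

Here `steerer` is any object exposing `add_pair(positive=..., negative=..., trait=...)`, such as the `wisent` instance from the basic example.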

Multiple Traits

You can define and combine multiple steering traits:

Multiple Traits Example

from wisent import Wisent

wisent = Wisent.for_text("meta-llama/Llama-3-8B-Instruct")

# Add pairs for different traits
wisent.add_pair(
    positive="Let me explain this clearly...",
    negative="I guess maybe...",
    trait="confidence"
)

wisent.add_pair(
    positive="That's a great question!",
    negative="Ugh, another question...",
    trait="friendliness"
)

# Train all traits
wisent.train()

# Apply multiple traits with different strengths
response = wisent.generate(
    "Explain quantum computing",
    steer={
        "confidence": 1.5,
        "friendliness": 0.8
    }
)
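
One simple way to picture combining traits is as a weighted sum: each trait's vector is scaled by its strength and added to the hidden state. The sketch below shows that arithmetic with toy two-dimensional vectors. It is an assumption about how such a combination could work, not a description of Wisent's internals, and `combine_steering` is a hypothetical name.

```python
from typing import Dict, List


def combine_steering(
    hidden: List[float],
    vectors: Dict[str, List[float]],
    steer: Dict[str, float],
) -> List[float]:
    """Return hidden + sum over traits of scale * vector."""
    out = list(hidden)  # copy so the caller's list is not modified
    for trait, scale in steer.items():
        if trait not in vectors:
            raise KeyError(f"No steering vector for trait {trait!r}")
        for i, component in enumerate(vectors[trait]):
            out[i] += scale * component
    return out


# Toy vectors (hypothetical values) for two traits.
toy_vectors = {"confidence": [1.0, 0.0], "friendliness": [0.0, 2.0]}
combined = combine_steering(
    [0.5, 0.5], toy_vectors, {"confidence": 1.5, "friendliness": 0.5}
)
print(combined)  # [2.0, 1.5]
```

Because the contributions simply add up, large strengths on several traits at once can move the hidden state a long way, so it may be worth tuning one trait at a time.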

Saving & Loading Vectors

Save trained steering vectors to reuse them later:

Persistence Example

# Save trained vectors
wisent.save_vectors("my_steering_vectors.pt")

# Later, in a new session, load them back
from wisent import Wisent

wisent = Wisent.for_text("meta-llama/Llama-3-8B-Instruct")
wisent.load_vectors("my_steering_vectors.pt")

# Use immediately without retraining
response = wisent.generate(
    "Your prompt here",
    steer={"helpfulness": 1.0}
)
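
A common pattern built on these two methods is "load if cached, otherwise train and save". The sketch below uses only `load_vectors`, `train`, and `save_vectors` as named on this page; the helper `load_or_train` itself is hypothetical, and it assumes pairs have already been added when training is needed.

```python
import os
from typing import Any


def load_or_train(steerer: Any, path: str) -> bool:
    """Load vectors from `path` if the file exists; otherwise train and save.

    Returns True when vectors were loaded from disk, False when trained.
    """
    if os.path.exists(path):
        steerer.load_vectors(path)
        return True
    steerer.train()
    steerer.save_vectors(path)
    return False
```

For example, `load_or_train(wisent, "my_steering_vectors.pt")` trains on the first run and reuses the saved file afterwards.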

TraitConfig

The TraitConfig dataclass stores configuration for each steering trait:

TraitConfig Structure

from wisent import TraitConfig

# TraitConfig attributes:
# - name: str              # Unique identifier for the trait
# - description: str       # Human-readable description
# - steering_vectors       # Per-layer steering vectors (set after training)
# - default_scale: float   # Default steering strength (default: 1.0)
# - layers: List[str]      # Which layers to apply to

# Access trait info
trait_info = wisent.get_trait_info("helpfulness")
print(f"Trait: {trait_info.name}")
print(f"Default scale: {trait_info.default_scale}")
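
For readers who want to see the shape of such a record, here is a stand-in dataclass that mirrors the attributes listed above. It is not Wisent's actual definition: the class name `TraitConfigSketch`, the type chosen for `steering_vectors`, and every default other than `default_scale` are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TraitConfigSketch:
    """Illustrative stand-in for TraitConfig; not the library's class."""

    name: str                       # Unique identifier for the trait
    description: str = ""           # Human-readable description
    # Assumed layout: layer name -> vector; None until training has run.
    steering_vectors: Optional[Dict[str, List[float]]] = None
    default_scale: float = 1.0      # Default steering strength
    layers: List[str] = field(default_factory=list)  # Layers to apply to


cfg = TraitConfigSketch(name="helpfulness", description="Willingness to assist")
print(cfg.name, cfg.default_scale)  # helpfulness 1.0
```

Using `field(default_factory=list)` gives each instance its own `layers` list instead of sharing one mutable default.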

Introspection

Inspect the Wisent instance state:

Introspection Methods

# Check defined traits
print(wisent.traits)  # ['helpfulness', 'confidence', ...]

# Check if trained
print(wisent.is_trained)  # True/False

# Get available intervention points (layers)
print(wisent.get_intervention_points())

# Get recommended layers for steering
print(wisent.get_recommended_layers())
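
These properties can be used to fail early with a clear message before generating. The sketch below checks `is_trained` and `traits` and then calls `generate` as shown in the examples on this page; the wrapper `generate_checked` is a hypothetical helper, not part of Wisent.

```python
from typing import Any, Dict


def generate_checked(steerer: Any, prompt: str, steer: Dict[str, float]) -> Any:
    """Validate state and trait names, then delegate to steerer.generate."""
    if not steerer.is_trained:
        raise RuntimeError("Steering vectors are not trained or loaded yet")
    unknown = sorted(set(steer) - set(steerer.traits))
    if unknown:
        raise KeyError(f"Undefined traits: {unknown}")
    return steerer.generate(prompt, steer=steer)
```

A call such as `generate_checked(wisent, "How do I cook pasta?", {"helpfulness": 1.5})` then surfaces a misspelled trait name as an explicit error.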
