evaluate

Evaluates a single prompt with a steering vector applied and reports quality scores for the generated response. Useful for checking, one input at a time, how effective steering is.

Basic Usage

python -m wisent evaluate --vector FILE --prompt TEXT --model MODEL --trait NAME [OPTIONS]

Examples

Basic Evaluation

python -m wisent evaluate \
  --vector ./vectors/helpfulness.pt \
  --prompt "What is the best way to learn programming?" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --trait helpfulness

With Custom Strength

python -m wisent evaluate \
  --vector ./vectors/cynical.pt \
  --prompt "What do you think about the future of AI?" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --trait cynical \
  --trait-description "responds with cynical worldview" \
  --steering-strength 2.0

With Thresholds

python -m wisent evaluate \
  --vector ./vectors/honest.pt \
  --prompt "Tell me about your capabilities" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --trait honest \
  --trait-threshold 0.5 \
  --answer-threshold 0.7

JSON Output

python -m wisent evaluate \
  --vector ./vectors/creative.pt \
  --prompt "Write a short story opening" \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --trait creative \
  --json
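
For scripted use, the JSON form of the command can be driven from Python. The sketch below only uses the flags documented on this page; the helper names `build_evaluate_command` and `run_evaluate` are hypothetical, and it assumes that with `--json` the command writes a single JSON document to stdout. The keys of that document are not specified here, so the result is returned as a plain dictionary without interpreting it.

```python
import json
import subprocess
import sys


def build_evaluate_command(vector, prompt, model, trait,
                           steering_strength=None, as_json=True):
    """Assemble the argument list for `python -m wisent evaluate`.

    Hypothetical helper: it only arranges flags documented on this page.
    """
    cmd = [
        sys.executable, "-m", "wisent", "evaluate",  # same as `python -m wisent evaluate`
        "--vector", vector,
        "--prompt", prompt,
        "--model", model,
        "--trait", trait,
    ]
    if steering_strength is not None:
        cmd += ["--steering-strength", str(steering_strength)]
    if as_json:
        cmd.append("--json")
    return cmd


def run_evaluate(cmd):
    """Run the command and parse its stdout as JSON.

    Assumption: stdout contains only the JSON document when --json is set.
    """
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with prompts that contain spaces or quotation marks.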

Arguments

Required

Argument	Description
--vector	Path to steering vector file (.pt)
--prompt	Prompt to evaluate
--model	Model name or path
--trait	Trait name (e.g., 'catholic', 'cynical')

Optional Configuration

Argument	Default	Description
--device	auto	Device to run on
--steering-strength	2.0	Steering strength to apply
--max-new-tokens	100	Maximum new tokens to generate
--trait-description	trait name	Optional description of the trait

Threshold Parameters

Argument	Description
--trait-threshold	Minimum trait quality threshold (-1 to 1 scale)
--answer-threshold	Minimum answer quality threshold (0 to 1 scale)

Output Options

Argument	Description
--verbose	Enable verbose output
--json	Output results as JSON

Output Scores

Trait Score - How well the response exhibits the target trait (-1 to 1)
Answer Quality - Overall quality of the answer (0 to 1)
Generated Response - The steered model output
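
To make the relationship between the scores and the threshold parameters concrete, here is a minimal sketch of a pass/fail check. The function name `meets_thresholds` is hypothetical, and the inclusive comparison (`>=`) is an assumption based on the thresholds being described as minimums; the tool's own comparison may differ.

```python
def meets_thresholds(trait_score, answer_quality,
                     trait_threshold=None, answer_threshold=None):
    """Return True if both scores reach their minimum thresholds.

    Assumptions (not confirmed by the tool): comparisons are inclusive,
    and an unset threshold (None) is treated as "no minimum".
    """
    # Score ranges as documented: trait in [-1, 1], answer quality in [0, 1].
    if not -1.0 <= trait_score <= 1.0:
        raise ValueError("trait_score must be between -1 and 1")
    if not 0.0 <= answer_quality <= 1.0:
        raise ValueError("answer_quality must be between 0 and 1")

    trait_ok = trait_threshold is None or trait_score >= trait_threshold
    answer_ok = answer_threshold is None or answer_quality >= answer_threshold
    return trait_ok and answer_ok


# Thresholds taken from the "With Thresholds" example; the scores are illustrative.
print(meets_thresholds(0.6, 0.8, trait_threshold=0.5, answer_threshold=0.7))
```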

Related Commands

generate-vector - Create steering vectors
multi-steer - Combine multiple vectors
tasks - Run batch evaluation tasks
