inference-config

Manage the inference configuration used for text generation. Use this command to view, modify, and reset generation parameters such as temperature, top-p, and the maximum number of new tokens.

Basic Usage

python -m wisent inference-config [show|set|reset] [OPTIONS]

Examples

Show Current Config

python -m wisent inference-config show

Set Temperature

python -m wisent inference-config set --temperature 0.7

Set Multiple Parameters

python -m wisent inference-config set \
  --temperature 0.8 \
  --top-p 0.95 \
  --max-new-tokens 256 \
  --do-sample true
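
When the values come from a script rather than being typed by hand, the same command line can be assembled programmatically. The following is a minimal sketch, not part of wisent: the helper `build_set_command` and the `FLAGS` mapping are hypothetical names introduced here, and only the flag spellings are taken from the Set Arguments table. It builds the argument list without running anything.

```python
import shlex

# Flag spellings copied from the "Set Arguments" table.
# The mapping and the helper below are illustrative, not a wisent API.
FLAGS = {
    "do_sample": "--do-sample",
    "temperature": "--temperature",
    "top_p": "--top-p",
    "top_k": "--top-k",
    "max_new_tokens": "--max-new-tokens",
    "repetition_penalty": "--repetition-penalty",
    "no_repeat_ngram_size": "--no-repeat-ngram-size",
    "enable_thinking": "--enable-thinking",
}


def build_set_command(**params):
    """Return the argv list for `inference-config set` with the given values."""
    argv = ["python", "-m", "wisent", "inference-config", "set"]
    for name, value in params.items():
        if name not in FLAGS:
            raise ValueError(f"unknown parameter: {name}")
        # Booleans are written as lowercase true/false, as in the examples.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        argv += [FLAGS[name], text]
    return argv


cmd = build_set_command(
    temperature=0.8, top_p=0.95, max_new_tokens=256, do_sample=True
)
print(shlex.join(cmd))
```

The printed line matches the multi-parameter command above. To execute it, the list could be passed to `subprocess.run(cmd, check=True)` in an environment where wisent is installed.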

Enable Thinking Mode (Qwen3)

python -m wisent inference-config set --enable-thinking true

Reset to Defaults

python -m wisent inference-config reset

Subcommands

Subcommand	Description
show	Display current inference configuration
set	Update inference configuration values
reset	Reset configuration to default values

Set Arguments

Argument	Type	Description
--do-sample	bool	Enable sampling (true/false)
--temperature	float	Sampling temperature (e.g., 0.7)
--top-p	float	Top-p (nucleus) sampling (e.g., 0.9)
--top-k	int	Top-k sampling (e.g., 50)
--max-new-tokens	int	Maximum number of new tokens to generate
--repetition-penalty	float	Repetition penalty (e.g., 1.0)
--no-repeat-ngram-size	int	Size of n-grams that must not repeat in the output
--enable-thinking	bool	Enable thinking mode for Qwen3 models
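
To make the sampling parameters above more concrete, here is a small sketch of how temperature, top-k, and top-p conventionally reshape a next-token distribution. It assumes the usual definitions of these parameters in text-generation libraries; the source does not show how wisent applies them internally, and the function name `filtered_distribution` is hypothetical.

```python
import math


def filtered_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k and top-p."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature rescales logits before softmax; lower values sharpen
    # the distribution, higher values flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank tokens from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k keeps only the k most likely tokens (0 disables it in this sketch).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p keeps the smallest prefix of the remaining tokens whose
    # renormalized cumulative probability reaches top_p.
    mass = sum(probs[i] for i in ranked)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / mass
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.0, -1.0]
print(sorted(filtered_distribution(logits, top_p=0.9)))
print(sorted(filtered_distribution(logits, temperature=0.5, top_p=0.9)))
```

With these example logits, lowering the temperature concentrates probability on the leading tokens, so fewer tokens survive the same top-p threshold. Sampling then draws from the surviving tokens, which is why these settings only matter when --do-sample is enabled.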

Related Commands

generate-responses - Generate model responses
multi-steer - Interactive steering
