Configuration Guide¶
This document describes the OpenSage configuration system, including all configuration fields, their purposes, and how to write configuration files.
Overview¶
OpenSage uses TOML (Tom's Obvious, Minimal Language) format for configuration files. The configuration system supports:
- Template Variables: Use ${VAR_NAME} syntax for reusable values
- Nested Sections: Organize related settings into logical groups
- Environment Variable Support: Template variables can reference environment variables
- Type Safety: Automatic conversion to Python dataclasses with type checking
Configuration File Location¶
Configuration files are loaded in the following order:
- Default Configuration: src/<package>/templates/configs/default_config.toml (used when no config is specified)
- Custom Configuration: Path specified via the config_path parameter when creating AigiseSession
Configuration Structure¶
The configuration is organized into several main sections:
# Top-level template variables (optional)
VARIABLE_NAME = "value"
# Root-level fields
task_name = "my_task"
src_dir_in_sandbox = "/shared/code"
default_host = "127.0.0.1"
auto_cleanup = true
# Section-based configuration
[neo4j]
# Neo4j database configuration
[sandbox]
# Sandbox configuration
[llm]
# LLM model configuration
[history]
# History and tool response configuration
[plugins]
# Plugin configuration
[agent_ensemble]
# Agent ensemble configuration
[build]
# Build and execution configuration
[mcp]
# Model Context Protocol services configuration
Template Variables¶
OpenSage supports template variable expansion using ${VAR_NAME} syntax.
Rules:¶
- Top-level UPPERCASE variables automatically become template variables
- Variables can be referenced anywhere using ${VAR_NAME}
- Variables are expanded recursively throughout the configuration
- Undefined variables cause an error at load time
Example:¶
# Define template variables (UPPERCASE)
DEFAULT_IMAGE = "ubuntu:20.04"
MAIN_MODEL = "openai/gpt-4"
NEO4J_PASSWORD = "mypassword123"
# Use template variables
[sandbox.sandboxes.main]
image = "${DEFAULT_IMAGE}"
[llm.model_configs.main]
model_name = "${MAIN_MODEL}"
[neo4j]
password = "${NEO4J_PASSWORD}"
Configuration Sections¶
Root-Level Fields¶
These fields are defined at the top level of the configuration file:
| Field | Type | Description | Default |
|---|---|---|---|
| task_name | string | Name identifier for the current task/session | None |
| src_dir_in_sandbox | string | Path to source code directory within sandbox containers | "/shared/code" |
| agent_storage_path | string | Path where dynamically created agents are stored | None |
| default_host | string | Default hostname for services (used by Neo4j and MCP services) | None (falls back to 127.0.0.1) |
| auto_cleanup | boolean | Whether to automatically cleanup resources when session ends | true |
Example:
task_name = "vulnerability_analysis"
src_dir_in_sandbox = "/shared/code"
agent_storage_path = "/tmp/agents"
default_host = "localhost"
auto_cleanup = true
Neo4j Configuration¶
Configures the Neo4j graph database connection.
Section: [neo4j]
Sandbox Images & Requirements (Practical Notes)¶
Some sandboxes require Python tooling inside their Docker images. In the default configuration template (src/<package>/templates/configs/default_config.toml):
- sandbox.sandboxes.main
  - Built from src/<package>/templates/dockerfiles/main/Dockerfile
  - Provides python3 via /app/.venv/bin/python
  - Installs Python package neo4j (used by src/<package>/sandbox/initializers/main.py)
- sandbox.sandboxes.joern
  - Built from src/<package>/templates/dockerfiles/joern/Dockerfile
  - Provides python3 via /app/.venv/bin/python
  - Installs Python packages httpx and websockets (used by Joern query helper scripts)
These images install their Python dependencies with uv at build time (the Dockerfile creates /app/.venv and runs uv pip install ...), rather than at runtime inside a running container.
| Field | Type | Description | Default |
|---|---|---|---|
| user | string | Neo4j username | None |
| password | string | Neo4j password | None |
| bolt_port | integer | Neo4j Bolt protocol port | 7687 |
| neo4j_http_port | integer | Neo4j HTTP port | 7474 |
Note: The uri property is dynamically constructed as neo4j://{default_host}:{bolt_port}. If default_host is not set, it defaults to 127.0.0.1.
Example:
Sandbox Configuration¶
Configures sandbox environments (Docker containers or Kubernetes pods).
Section: [sandbox]
Top-Level Sandbox Settings¶
| Field | Type | Description | Default |
|---|---|---|---|
| default_image | string | Default Docker image for sandboxes | None |
| backend | string | Sandbox backend type: "native" (Docker) or "k8s" (Kubernetes) | "native" |
| project_relative_shared_data_path | string | Path relative to project root for shared data (will be mounted as /shared in containers) | None |
| absolute_shared_data_path | string | Absolute path for shared data | None |
| tolerations | list[dict] | Kubernetes tolerations applied to all pods | None |
Per-Sandbox Configuration¶
Each sandbox type is configured under [sandbox.sandboxes.<sandbox_type>]:
Common Sandbox Types:
- main: Primary analysis sandbox
- joern: Joern static analysis sandbox
- codeql: CodeQL analysis sandbox
- neo4j: Neo4j database container
- gdb_mcp: GDB debugger MCP service
- pdb_mcp: PDB debugger MCP service
- fuzz: Fuzzing environment
Container Configuration Fields:
| Field | Type | Description | Default |
|---|---|---|---|
| image | string | Docker image name/tag | None |
| container_id | string | Connect to existing container (instead of creating new) | None |
| timeout | integer | Container operation timeout in seconds | 300 |
| project_relative_dockerfile_path | string | Path to Dockerfile relative to project root | None |
| absolute_dockerfile_path | string | Absolute path to Dockerfile | None |
| command | string | Override container command (empty string = use Dockerfile default, None = use bash) | None |
| platform | string | Platform architecture (e.g., "linux/amd64") | None |
| network | string | Docker network name | None |
| privileged | boolean | Run container in privileged mode | false |
| security_opt | list[string] | Security options | [] |
| cap_add | list[string] | Additional capabilities | [] |
| gpus | string | GPU allocation (e.g., "all" or "device=GPU-UUID") | None |
| shm_size | string | Shared memory size (e.g., "2g") | None |
| mem_limit | string | Memory limit (e.g., "4g") | None |
| cpus | string | CPU limit (e.g., "2") | None |
| user | string | User to run as (e.g., "1000:1000") | None |
| working_dir | string | Working directory in container | None |
Build Configuration:
| Field | Type | Description |
|---|---|---|
| build_args | dict[string, string] | Docker build arguments |
| using_cached | boolean | Whether to use cached image (internal flag) |
Environment, Volumes, and Ports:
| Field | Type | Description |
|---|---|---|
| environment | dict[string, any] | Environment variables |
| volumes | list[string] | Volume mounts in format "/host:/container:ro" |
| mounts | list[string] | Docker mount specifications |
| ports | dict[string, int\|string] | Port mappings in format {"port/tcp" = host_port} |
| docker_args | list[string] | Raw arguments passed through to Docker CLI |
Extra Configuration:
| Field | Type | Description |
|---|---|---|
| extra | dict[string, any] | Additional custom configuration (e.g., initializer_timeout_sec) |
Kubernetes-Specific Fields:
| Field | Type | Description |
|---|---|---|
| pod_name | string | Connect to existing Pod instead of creating new |
| container_name | string | Name of container within the Pod |
Example:
[sandbox]
backend = "native"
project_relative_shared_data_path = "data/my_project.tar.gz"
[sandbox.sandboxes.main]
image = "ubuntu:20.04"
project_relative_dockerfile_path = "dockerfiles/main/Dockerfile"
timeout = 300
[sandbox.sandboxes.main.build_args]
BASE_IMAGE = "ubuntu:20.04"
[sandbox.sandboxes.main.environment]
PYTHONPATH = "/shared/code"
[sandbox.sandboxes.main.ports]
"8080/tcp" = 8080
[sandbox.sandboxes.main.extra]
initializer_timeout_sec = 1800
[sandbox.sandboxes.joern]
image = "aigise/joern"
project_relative_dockerfile_path = "dockerfiles/joern/Dockerfile"
command = ""
[sandbox.sandboxes.joern.environment]
JAVA_OPTS = "-Xmx16G -Xms4G"
[sandbox.sandboxes.joern.ports]
"8081/tcp" = 18087
LLM Configuration¶
Configures language models used by agents.
Section: [llm]
Models are configured under [llm.model_configs.<model_name>]:
Common Model Names:
- main: Primary model for agent reasoning
- summarize: Model for summarization and context compression
- flag_claims: Model for flag claims processing
Model Configuration Fields:
| Field | Type | Description | Default |
|---|---|---|---|
| model_name | string | Model identifier (e.g., "openai/gpt-4", "anthropic/claude-3") | Required |
| temperature | float | Sampling temperature (0.0-2.0) | None |
| max_tokens | integer | Maximum tokens in response | None |
| rpm | integer | Rate limit: requests per minute | None |
| tpm | integer | Rate limit: tokens per minute | None |
Example:
[llm]
[llm.model_configs.main]
model_name = "openai/gpt-4"
temperature = 0.7
max_tokens = 4096
rpm = 60
tpm = 60000
[llm.model_configs.summarize]
model_name = "openai/gpt-3.5-turbo"
temperature = 0.3
max_tokens = 2048
rpm = 30
tpm = 30000
History Configuration¶
Configures tool response handling and event history management.
Section: [history]
| Field | Type | Description | Default |
|---|---|---|---|
| max_tool_response_length | integer | Maximum length of a single tool response before special handling | 10000 |
| enable_quota_countdown | boolean | Show remaining LLM call quota after each tool response | false |
Events Compaction Configuration:
Section: [history.events_compaction]
| Field | Type | Description | Default |
|---|---|---|---|
| max_history_summary_length | integer | Character budget threshold for triggering compaction | 100000 |
| compaction_percent | integer | Percentage of history to compress (0-100) | 50 |
Example:
[history]
max_tool_response_length = 10000
enable_quota_countdown = true
[history.events_compaction]
max_history_summary_length = 100000
compaction_percent = 50
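One way to read the two compaction settings is sketched below. This is an interpretation, not the documented algorithm: it assumes the threshold is compared against the total character count of the history, that the percentage applies to the number of events, and that the oldest events are compacted first. `select_events_to_compact` is a hypothetical name.

```python
def select_events_to_compact(
    events: list[str],
    max_history_summary_length: int = 100000,
    compaction_percent: int = 50,
) -> list[str]:
    """Hypothetical helper: pick the events that would be summarized."""
    if not 0 <= compaction_percent <= 100:
        raise ValueError("compaction_percent must be within 0-100")
    # Assumption: the budget is measured in characters across all events.
    total_chars = sum(len(event) for event in events)
    if total_chars <= max_history_summary_length:
        return []  # under budget, nothing to compact
    # Assumption: the percentage applies to the event count, oldest first.
    count = len(events) * compaction_percent // 100
    return events[:count]


history = ["a" * 40, "b" * 40, "c" * 40, "d" * 40]  # 160 characters in total
to_compact = select_events_to_compact(
    history, max_history_summary_length=100, compaction_percent=50
)
print(len(to_compact))  # 2
```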
Plugins Configuration¶
Configures which plugins are enabled.
Section: [plugins]
| Field | Type | Description | Default |
|---|---|---|---|
| enabled | list[string] | List of enabled plugin names | [] |
Common Plugins:
- history_summarizer_plugin: Summarizes long conversation history
- tool_response_summarizer_plugin: Summarizes long tool responses
- quota_after_tool_plugin: Shows quota countdown after tools
Example:
[plugins]
enabled = [
"history_summarizer_plugin",
"tool_response_summarizer_plugin",
"quota_after_tool_plugin",
]
Agent Ensemble Configuration¶
Configures multi-agent ensemble execution.
Section: [agent_ensemble]
| Field | Type | Description | Default |
|---|---|---|---|
| thread_safe_tools | list[string] | List of tool names that are thread-safe (can be called in parallel) | [] |
| available_models_for_ensemble | list[string] or string | List of model names available for ensemble (can be comma-separated string) | [] |
Example:
[agent_ensemble]
thread_safe_tools = ["google_search", "read_file"]
available_models_for_ensemble = ["openai/gpt-4", "anthropic/claude-3"]
Or as comma-separated string:
[agent_ensemble]
thread_safe_tools = ["google_search", "read_file"]
available_models_for_ensemble = "openai/gpt-4,anthropic/claude-3"
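Both forms above describe the same list of models. A minimal sketch of how the two could be normalized to one representation follows. `normalize_models` is a hypothetical helper, and trimming whitespace and dropping empty entries are assumptions rather than documented behavior.

```python
from typing import Union


def normalize_models(value: Union[str, list[str], None]) -> list[str]:
    """Hypothetical helper: accept a list or a comma-separated string."""
    if value is None:
        return []  # documented default is an empty list
    if isinstance(value, str):
        # Assumption: surrounding whitespace and empty entries are ignored.
        return [item.strip() for item in value.split(",") if item.strip()]
    return list(value)


as_list = normalize_models(["openai/gpt-4", "anthropic/claude-3"])
as_string = normalize_models("openai/gpt-4,anthropic/claude-3")
print(as_list == as_string)  # True
```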
Build Configuration¶
Configures build and execution commands for target programs.
Section: [build]
| Field | Type | Description | Default |
|---|---|---|---|
| poc_dir | string | Directory path for proof-of-concept code | None |
| compile_command | string | Command to compile the target program | None |
| run_command | string | Command to run the target program | None |
| target_type | string | Type of target (e.g., "default", "binary") | None |
| target_binary | string | Path to target binary | None |
Example:
[build]
poc_dir = "/tmp/poc"
compile_command = "gcc -o target target.c"
run_command = "./target"
target_type = "binary"
target_binary = "/tmp/poc/target"
MCP Configuration¶
Configures Model Context Protocol (MCP) services.
Section: [mcp]
MCP services are configured under [mcp.services.<service_name>]:
Common Service Names:
- gdb_mcp: GDB debugger MCP service
- pdb_mcp: PDB debugger MCP service
MCP Service Configuration Fields:
| Field | Type | Description |
|---|---|---|
| sse_port | integer | Server-Sent Events (SSE) server port |
| sse_host | string | SSE server host (if None, uses default_host from root config) |
Note: The sse_host property dynamically uses default_host from the root configuration if not explicitly set.
Example:
[mcp]
[mcp.services.gdb_mcp]
sse_port = 1111
[mcp.services.pdb_mcp]
sse_port = 1112
sse_host = "localhost" # Optional, defaults to root config's default_host
Complete Example¶
Here's a complete configuration file example:
# Template Variables
DEFAULT_IMAGE = "ubuntu:20.04"
MAIN_MODEL = "openai/gpt-4"
NEO4J_PASSWORD = "secure_password"
TASK_NAME = "security_analysis"
# Root Configuration
task_name = "${TASK_NAME}"
src_dir_in_sandbox = "/shared/code"
default_host = "localhost"
auto_cleanup = true
# Neo4j Configuration
[neo4j]
user = "neo4j"
password = "${NEO4J_PASSWORD}"
bolt_port = 7687
neo4j_http_port = 7474
# Sandbox Configuration
[sandbox]
backend = "native"
project_relative_shared_data_path = "data/project.tar.gz"
[sandbox.sandboxes.main]
image = "${DEFAULT_IMAGE}"
project_relative_dockerfile_path = "dockerfiles/main/Dockerfile"
timeout = 300
[sandbox.sandboxes.main.environment]
PYTHONPATH = "/shared/code"
[sandbox.sandboxes.joern]
image = "aigise/joern"
project_relative_dockerfile_path = "dockerfiles/joern/Dockerfile"
command = ""
[sandbox.sandboxes.joern.ports]
"8081/tcp" = 18087
# LLM Configuration
[llm]
[llm.model_configs.main]
model_name = "${MAIN_MODEL}"
temperature = 0.7
max_tokens = 4096
[llm.model_configs.summarize]
model_name = "${MAIN_MODEL}"
temperature = 0.3
max_tokens = 2048
# History Configuration
[history]
max_tool_response_length = 10000
enable_quota_countdown = true
[history.events_compaction]
max_history_summary_length = 100000
compaction_percent = 50
# Plugins Configuration
[plugins]
enabled = [
"history_summarizer_plugin",
"tool_response_summarizer_plugin",
]
# Agent Ensemble Configuration
[agent_ensemble]
thread_safe_tools = ["google_search"]
available_models_for_ensemble = "${MAIN_MODEL}"
# Build Configuration
[build]
compile_command = "make"
run_command = "./target"
# MCP Configuration
[mcp]
[mcp.services.gdb_mcp]
sse_port = 1111
Loading Configuration in Code¶
Using Default Configuration¶
from aigise.session import AigiseSession
# Uses default config from src/<package>/templates/configs/default_config.toml
session = AigiseSession(aigise_session_id="my_session")
Using Custom Configuration¶
from aigise.session import AigiseSession
# Load custom configuration file
session = AigiseSession(
aigise_session_id="my_session",
config_path="/path/to/my_config.toml"
)
Accessing Configuration¶
# Access configuration through session
config = session.config
# Access specific sections
neo4j_config = config.neo4j
sandbox_config = config.sandbox
llm_config = config.llm
# Access nested configurations
main_sandbox = config.get_sandbox_config("main")
main_model = config.get_llm_config("main")
Best Practices¶
- Use Template Variables: Define reusable values as UPPERCASE template variables at the top
- Organize by Section: Group related settings into logical sections
- Document Custom Fields: Add comments for non-standard or custom configuration
- Version Control: Keep configuration files in version control, but exclude sensitive values (passwords, API keys)
- Environment-Specific Configs: Create separate config files for development, testing, and production
- Validate Early: Test configuration files before deploying to catch errors early
Troubleshooting¶
Template Variable Not Found¶
If you see KeyError: Template variable 'VAR_NAME' not found, ensure that:
- The variable is defined as an UPPERCASE top-level variable
- The variable name matches exactly (case-sensitive)
- There are no typos in ${VAR_NAME} references
Configuration Not Loading¶
- Verify the TOML file syntax is correct
- Check that the file path is correct (use absolute paths if relative paths don't work)
- Ensure all required fields are present (check error messages)
Dynamic Host Resolution¶
If default_host is not set, services like Neo4j and MCP will default to 127.0.0.1. Set default_host at the root level for Kubernetes deployments or remote services.
Related Documentation¶
- Getting Started - Initial setup guide
- Architecture - System architecture overview
- Core Concepts - Core concepts including sessions
- Sandboxes - Sandbox backends and configuration guide