Sandbox: Hardware-Isolated Code Execution for AI Agents

hexr.sandbox lets your agents execute arbitrary code — LLM-generated scripts, data analysis, shell commands — inside a hardware-isolated Firecracker microVM. Each execution gets a fresh VM that is destroyed immediately after the code completes. The sandbox intentionally has no access to your agent’s SPIFFE identity, Vault secrets, or cloud credentials, so even if prompt injection leads to code execution, the blast radius is contained.

Quick start

import hexr.sandbox

result = hexr.sandbox.exec("""
import pandas as pd
df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})
print(df.describe())
""")

print(result.stdout)

Output:

              x         y
count  3.000000  3.000000
mean   2.000000  5.000000
std    1.000000  1.000000
min    1.000000  4.000000
25%    1.500000  4.500000
50%    2.000000  5.000000
75%    2.500000  5.500000
max    3.000000  6.000000

API

hexr.sandbox.exec()

hexr.sandbox.exec(
    code: str,
    *,
    language: str = "python",
    timeout: int = 30,
    env_vars: dict | None = None,
    packages: list[str] | None = None
) -> ExecResult

code

string

required

The code to execute inside the microVM.

language

string

default: "python"

Execution language. Accepts "python" or "shell".

timeout

int

default: 30

Maximum execution time in seconds. The VM is forcibly killed after this limit.

env_vars

dict

default: None

Environment variables to set inside the VM.

result = hexr.sandbox.exec(
    "import os; print(os.environ['MY_VAR'])",
    env_vars={"MY_VAR": "hello"}
)

packages

list[str]

default: None

Python packages to install before the code runs. Because every execution gets a fresh VM, packages are installed again on each call.

result = hexr.sandbox.exec(
    "import numpy; print(numpy.random.rand(3))",
    packages=["numpy"]
)

ExecResult

result = hexr.sandbox.exec("print('hello')")

result.stdout       # "hello\n"
result.stderr       # ""
result.exit_code    # 0
result.duration_ms  # 1234
result.ok           # True (exit_code == 0)
result.output       # Alias for stdout
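
To make the relationships between these fields concrete, here is a minimal sketch of a result type with the same shape. ExecResultSketch is a hypothetical stand-in written for illustration, not the library's actual class. It only encodes what the comments above state: ok is derived from exit_code, and output mirrors stdout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecResultSketch:
    """Hypothetical stand-in mirroring the documented ExecResult fields."""

    stdout: str
    stderr: str
    exit_code: int
    duration_ms: int

    @property
    def ok(self) -> bool:
        # Documented as True when exit_code == 0.
        return self.exit_code == 0

    @property
    def output(self) -> str:
        # Documented as an alias for stdout.
        return self.stdout


# Example values are made up for illustration.
success = ExecResultSketch(stdout="hello\n", stderr="", exit_code=0, duration_ms=1234)
failure = ExecResultSketch(stdout="", stderr="boom\n", exit_code=1, duration_ms=987)
```

A stand-in like this can also be handy in unit tests for agent code that consumes sandbox results without calling the service.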

Async version

result = await hexr.sandbox.exec_async("print('hello')")
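
Because await is only valid inside a coroutine, the call above needs an event loop around it. The sketch below shows one way to run several snippets concurrently with asyncio.gather. fake_exec_async is a hypothetical stand-in so the example runs without the sandbox service. In real use you would pass hexr.sandbox.exec_async instead, assuming it is a regular coroutine function as the await above suggests.

```python
import asyncio


async def fake_exec_async(code: str) -> str:
    """Hypothetical stand-in for hexr.sandbox.exec_async."""
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return f"ran: {code}"


async def run_all(snippets, exec_async=fake_exec_async):
    # gather() runs the coroutines concurrently and returns results
    # in the same order as the inputs.
    return await asyncio.gather(*(exec_async(s) for s in snippets))


results = asyncio.run(run_all(["print(1)", "print(2)"]))
```

Since each execution gets its own VM, concurrent calls do not share state. Whether the service limits concurrency is not stated here.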

Check availability

Use is_enabled() to guard sandbox calls in environments where the service may not be running:

if hexr.sandbox.is_enabled():
    result = hexr.sandbox.exec("print('sandbox available')")
else:
    print("Sandbox not available in this environment")

Examples

Data analysis

result = hexr.sandbox.exec("""
import pandas as pd
import json

data = [
    {"name": "Alice", "score": 92},
    {"name": "Bob", "score": 85},
    {"name": "Charlie", "score": 78}
]
df = pd.DataFrame(data)
print(json.dumps({
    "mean": df['score'].mean(),
    "median": df['score'].median(),
    "std": df['score'].std()
}))
""", packages=["pandas"])

import json

# Parse the JSON summary that the sandboxed script printed to stdout
stats = json.loads(result.stdout)
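
Calling json.loads directly on stdout fails if the run itself failed or if anything else was printed first. A slightly more defensive approach is sketched below. parse_json_stdout is a hypothetical helper, not part of hexr.sandbox, and the SimpleNamespace object stands in for an ExecResult with made-up output.

```python
import json
from types import SimpleNamespace


def parse_json_stdout(result):
    """Return the JSON value printed on the last non-empty stdout line."""
    if not result.ok:
        raise RuntimeError(
            f"sandbox run failed (exit {result.exit_code}): {result.stderr}"
        )
    lines = [line for line in result.stdout.splitlines() if line.strip()]
    if not lines:
        raise ValueError("sandbox run produced no output")
    # Only the final line is parsed, so earlier log lines are ignored.
    return json.loads(lines[-1])


# Stand-in result with illustrative values; a real call returns an ExecResult.
fake_result = SimpleNamespace(
    ok=True,
    exit_code=0,
    stderr="",
    stdout='some earlier log line\n{"mean": 85.0, "median": 85.0, "std": 7.0}\n',
)
stats = parse_json_stdout(fake_result)
```

This relies on the convention that the sandboxed script prints its JSON payload last.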

Shell commands

result = hexr.sandbox.exec(
    "ls -la /tmp && whoami && cat /etc/os-release",
    language="shell"
)
print(result.stdout)

Error handling

result = hexr.sandbox.exec("1/0")  # ZeroDivisionError

if not result.ok:
    print(f"Exit code: {result.exit_code}")
    print(f"Error: {result.stderr}")

Security model

Code inside the sandbox runs in a Firecracker microVM with hardware-level isolation. It has no access to SPIFFE identity, cloud credentials, Vault secrets, or the Kubernetes cluster network.

Property	Detail
Isolation	Firecracker microVM (KVM-based) — hardware boundary
Network	No access to cluster services or internet (by default)
Identity	No SPIFFE socket mounted — code cannot impersonate the agent
Credentials	No cloud credentials available inside the VM
Lifecycle	Fresh VM per execution — no state persists between calls
Resource limits	Memory and CPU capped per execution

Even if sandboxed code is malicious (for example, prompt injection leads to unexpected code execution), it cannot:

Access Vault secrets
Call cloud APIs using agent credentials
Communicate with other agents
Read the SPIRE socket
Escape to the host
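
One way to spot-check some of these properties is to run a small probe inside the sandbox and inspect what it reports. The sketch below is illustrative only. The list of environment variable names is an assumption about where credentials commonly appear, and SPIFFE_ENDPOINT_SOCKET is the conventional variable for locating the Workload API socket. Nothing here is taken from the sandbox's own implementation, and a clean probe does not prove isolation.

```python
import os

# Assumed names of environment variables that commonly carry credentials.
CREDENTIAL_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "AZURE_CLIENT_SECRET",
    "VAULT_TOKEN",
)


def probe(environ=os.environ, path_exists=os.path.exists):
    """Report credential-like env vars and whether a SPIFFE socket is visible."""
    socket = environ.get("SPIFFE_ENDPOINT_SOCKET", "")
    socket_path = socket.removeprefix("unix://")  # requires Python 3.9+
    return {
        "credential_env_vars": sorted(
            name for name in CREDENTIAL_ENV_VARS if name in environ
        ),
        "spiffe_socket_present": bool(socket_path) and path_exists(socket_path),
    }


report = probe()
```

To use it, send the source of probe() plus a print(json.dumps(probe())) line through hexr.sandbox.exec and compare the report with what the same probe returns in the agent container.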

Architecture

Agent sends execute request

The agent container sends POST /execute in plaintext to the Envoy sidecar.

Envoy upgrades to mTLS

Envoy forwards POST /execute to the Sandbox Service (port 8092) over mTLS.

microVM boots and code runs

The Sandbox Service boots a Firecracker microVM and injects the code. The code executes inside the VM.

Results returned, VM destroyed

stdout, stderr, and exit_code are captured. The VM is immediately destroyed. The ExecResult is returned through Envoy to the agent.

Agent → Envoy (plaintext → mTLS) → Sandbox :8092 → Firecracker microVM → ExecResult → Agent
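
The boot, run, destroy lifecycle in the steps above can be modeled with a context manager that always tears the VM down, even when the code raises. This is a toy simulation under stated assumptions: FakeMicroVM, fresh_vm, and execute are hypothetical names and make no Firecracker calls. It only illustrates the "fresh VM per execution, destroyed afterwards" guarantee.

```python
from contextlib import contextmanager


class FakeMicroVM:
    """Hypothetical in-memory stand-in for a microVM."""

    def __init__(self):
        self.alive = True

    def run(self, code: str) -> dict:
        if not self.alive:
            raise RuntimeError("VM already destroyed")
        # A real VM would execute the code; this one just echoes its size.
        return {"stdout": f"ran {len(code)} chars", "stderr": "", "exit_code": 0}

    def destroy(self) -> None:
        self.alive = False


@contextmanager
def fresh_vm(registry: list):
    vm = FakeMicroVM()  # a new VM for every execution
    registry.append(vm)
    try:
        yield vm
    finally:
        vm.destroy()  # runs on success and on error


def execute(code: str, registry: list) -> dict:
    with fresh_vm(registry) as vm:
        return vm.run(code)


vms = []
first = execute("print('a')", vms)
second = execute("print('b')", vms)
```

After both calls, vms holds two distinct VM objects and neither is still alive, which mirrors the lifecycle row in the security table.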

The sandbox is built on SmolVM, a thin Firecracker wrapper licensed under Apache-2.0.
