AI agents frequently need to run code they’ve generated — for data analysis, calculations, or web scraping. Executing that code directly on the host is dangerous: a malformed or malicious script can escape the process, access the network, or consume unbounded resources. Hexr Sandbox runs each execution in a Firecracker microVM that is created fresh, given no network access, and destroyed immediately after the code finishes. This guide shows you how to use hexr.sandbox.exec for both direct code execution and the common LLM-generate-then-execute pattern.

Steps

1. Execute code in a microVM

Import exec from hexr.sandbox and pass a code string. The result includes stdout, stderr, and exit code:
data_analyst.py
from hexr import hexr_agent
from hexr.sandbox import exec

@hexr_agent(name="data-analyst", tenant="acme-corp")
def main():
    result = exec("""
import pandas as pd
import numpy as np

data = pd.DataFrame({
    'revenue': [100, 200, 150, 300, 250],
    'costs': [80, 150, 100, 200, 180],
})
data['profit'] = data['revenue'] - data['costs']
print(data.describe().to_string())
print(f"\\nTotal profit: ${data['profit'].sum()}")
""")

    print(result.stdout)
Expected output
         revenue  costs  profit
count     5.00   5.00    5.00
mean    200.00 142.00   58.00
...
Total profit: $290
2. Combine with an LLM to generate and execute code

The most powerful pattern: ask an LLM to generate the code, then execute it safely in a sandboxed microVM:
code_agent.py
from hexr import hexr_agent, hexr_llm
from hexr.sandbox import exec

@hexr_agent(name="code-agent", tenant="acme-corp")
def main():
    # Ask LLM to generate analysis code
    code = hexr_llm(
        provider="openai",
        model="gpt-4o",
        prompt="Write Python code to calculate the first 20 Fibonacci numbers",
    )

    # Execute safely in a microVM
    result = exec(code, language="python", timeout=10)

    if result.exit_code == 0:
        print(f"Output: {result.stdout}")
    else:
        print(f"Error: {result.stderr}")
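One practical wrinkle with this pattern: chat models often wrap generated code in a Markdown fence (```python ... ```), and passing the fenced text straight to exec would fail to parse. The guide doesn't specify whether hexr_llm strips fences for you, so a small host-side helper is a safe precaution. A minimal sketch (extract_code is a hypothetical helper, not part of the SDK):

```python
import re

def extract_code(llm_output: str) -> str:
    """Return the body of the first Markdown code fence in `llm_output`,
    or the raw text unchanged if no fence is present."""
    match = re.search(r"```(?:\w+)?\n(.*?)```", llm_output, re.DOTALL)
    return match.group(1) if match else llm_output
```

With this in place, the execution line becomes `result = exec(extract_code(code), language="python", timeout=10)` and works whether or not the model added a fence.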
3. Choose a language

The sandbox supports Python, JavaScript, and Bash:
multi_language.py
# Python
result = exec("print('Hello from Python')", language="python")

# JavaScript
result = exec("console.log('Hello from Node.js')", language="javascript")

# Shell
result = exec("echo 'Hello from Bash' && uname -a", language="bash")
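If your agent receives filenames rather than explicit language choices, a lookup from file suffix to the `language` parameter keeps call sites uniform. A small sketch (language_for is a hypothetical helper; the three values mirror the runtimes listed above):

```python
from pathlib import Path

# Suffixes for the three supported runtimes: Python, JavaScript, Bash.
LANGUAGE_BY_SUFFIX = {".py": "python", ".js": "javascript", ".sh": "bash"}

def language_for(filename: str) -> str:
    """Map a file suffix to the sandbox's `language` parameter."""
    suffix = Path(filename).suffix
    try:
        return LANGUAGE_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"no sandbox runtime for {filename!r}") from None
```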
4. Set resource limits

Control how much CPU time and memory the execution can use:
resource_limits.py
result = exec(
    code="...",
    language="python",
    timeout=60,       # Max 60 seconds
    memory_mb=512,    # Max 512MB RAM
)

Security guarantees

Each execution runs in a dedicated Firecracker microVM that is created fresh and destroyed immediately after the code exits. The microVM has no network access and a read-only root filesystem.
Threat                      Protection
Code escapes to host        Firecracker KVM isolation
Network exfiltration        No network access by default
Disk persistence            Read-only rootfs, destroyed after execution
Resource exhaustion         CPU, memory, and time limits enforced
Cross-agent interference    Separate microVM per execution
The sandbox has no network access by default. If your generated code attempts to make outbound HTTP requests, it will fail. This is intentional — it prevents data exfiltration from LLM-generated code.
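Since outbound requests will fail inside the VM anyway, you can reject obviously network-dependent code on the host before spending a microVM on it. A best-effort sketch for Python payloads (uses_network is a hypothetical helper; static checks like this are easy to evade, so the sandbox's network isolation remains the real guarantee):

```python
import ast

# Top-level module names that imply network access.
NETWORK_MODULES = {"socket", "requests", "urllib", "http", "aiohttp"}

def uses_network(code: str) -> bool:
    """Best-effort static check: does the code import a networking module?"""
    tree = ast.parse(code)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            if any(a.name.split(".")[0] in NETWORK_MODULES for a in node.names):
                return True
        elif isinstance(node, ast.ImportFrom) and node.module:
            if node.module.split(".")[0] in NETWORK_MODULES:
                return True
    return False
```

Skipping execution when `uses_network(code)` is true gives the agent a faster, clearer error than a failed request inside the VM would.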

Next steps

LLM observability

Trace the LLM calls that generate sandbox code alongside the execution results.

Browser agent

Combine sandboxed code execution with browser automation for richer data pipelines.

Agent-to-agent communication

Delegate code generation to a specialized agent, then execute the result safely.

SDK reference

Full reference for hexr.sandbox.exec and supported runtimes.