Browser: Headless Chromium in a Hardware-Isolated microVM

hexr.browser gives your agents a full headless Chromium browser running inside a Firecracker microVM. You can navigate to any URL, interact with page elements, take screenshots, and extract text — all with the same hardware-level isolation as hexr.sandbox. Because the browser runs in a fresh VM per request with no access to your agent’s SPIFFE identity or cloud credentials, it is safe to use with LLM-generated URLs and instructions.

Quick start

import hexr.browser

result = hexr.browser.browse(
    "https://news.ycombinator.com",
    actions=[
        {"type": "extract_text", "selector": ".titleline > a"}
    ]
)

print(f"Page title: {result.title}")
for item in result.action_results:
    print(item)

API

hexr.browser.browse()

hexr.browser.browse(
    url: str,
    *,
    actions: list[dict] | None = None,
    timeout: int = 60,
    viewport_width: int = 1280,
    viewport_height: int = 720
) -> BrowseResult

url (str, required)

The URL to navigate to.

actions (list[dict], default: None)

Sequence of browser actions to perform after the initial page load. See Actions for supported types.

timeout (int, default: 60)

Maximum total execution time in seconds.

viewport_width (int, default: 1280)

Browser viewport width in pixels.

viewport_height (int, default: 720)

Browser viewport height in pixels.

BrowseResult

result.url              # Final URL after any redirects
result.title            # Page title
result.text             # Full page text content
result.screenshot       # Base64-encoded PNG screenshot
result.action_results   # Results from each action in the sequence
result.duration_ms      # Total execution time in milliseconds
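
As a rough illustration of consuming these fields, the sketch below builds a one-line log summary from `url`, `title`, `text`, and `duration_ms`. The `summarize` helper and the `demo` stand-in object are hypothetical and not part of hexr; the sketch also assumes `text` is a string (or empty) and `duration_ms` is a number, which the listing above suggests but does not state.

```python
from types import SimpleNamespace


def summarize(result, max_chars=80):
    """One-line summary built only from the documented BrowseResult fields."""
    # Collapse runs of whitespace so the summary stays on one line.
    text = " ".join((result.text or "").split())
    if len(text) > max_chars:
        text = text[: max_chars - 3] + "..."
    seconds = result.duration_ms / 1000
    return f"{result.title!r} <{result.url}> in {seconds:.2f}s: {text}"


# Stand-in object for illustration only; a real result would come from browse().
demo = SimpleNamespace(
    url="https://example.com/",
    title="Example Domain",
    text="Example   Domain\nThis domain is for use in examples.",
    duration_ms=1234,
)
print(summarize(demo))
```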

Actions

Action type	Fields	Description
`navigate`	`value` (URL)	Navigate to a new URL
`click`	`selector`	Click an element matching the CSS selector
`type`	`selector`, `value`	Type text into an input field
`screenshot`	—	Capture a full-page screenshot
`extract_text`	`selector`	Extract text from all matching elements
`wait`	`timeout` (ms)	Wait for a duration before the next action
`scroll`	`value` (pixels)	Scroll the page by a number of pixels
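
Because actions are plain dictionaries, a typo in a field name is easy to miss until the request runs. The following is a minimal sketch of a client-side check, assuming the fields listed in the table above are the required ones for each type. `REQUIRED_FIELDS` and `validate_actions` are hypothetical helpers, not part of hexr, and the service may accept or require fields this sketch does not know about.

```python
# Required fields per action type, transcribed from the Actions table.
REQUIRED_FIELDS = {
    "navigate": {"value"},
    "click": {"selector"},
    "type": {"selector", "value"},
    "screenshot": set(),
    "extract_text": {"selector"},
    "wait": {"timeout"},
    "scroll": {"value"},
}


def validate_actions(actions):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for i, action in enumerate(actions):
        kind = action.get("type")
        if kind not in REQUIRED_FIELDS:
            problems.append(f"action {i}: unknown type {kind!r}")
            continue
        missing = REQUIRED_FIELDS[kind] - set(action)
        if missing:
            problems.append(f"action {i} ({kind}): missing {sorted(missing)}")
    return problems
```

Running `validate_actions(actions)` before calling `browse()` and refusing to proceed when the list is non-empty keeps malformed LLM-generated action sequences from consuming a VM request.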

Examples

Web research agent

from urllib.parse import quote

from hexr import hexr_agent, hexr_llm
import hexr.browser
import openai

@hexr_agent(name="web-researcher", tenant="acme-corp")
def research(topic: str) -> str:
    # Browse and extract content; percent-encode the topic so it stays a single path segment
    result = hexr.browser.browse(
        f"https://en.wikipedia.org/wiki/{quote(topic, safe='')}",
        actions=[
            {"type": "extract_text", "selector": "#mw-content-text p"}
        ]
    )
    
    # Analyze with LLM
    client = hexr_llm(openai.OpenAI())
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Summarize the following content"},
            {"role": "user", "content": result.text[:4000]}
        ]
    )
    
    return response.choices[0].message.content

Form submission

result = hexr.browser.browse(
    "https://example.com/login",
    actions=[
        {"type": "type", "selector": "#username", "value": "agent@hexr.dev"},
        {"type": "type", "selector": "#password", "value": "token-from-vault"},
        {"type": "click", "selector": "#submit"},
        {"type": "wait", "timeout": 3000},
        {"type": "screenshot"},
        {"type": "extract_text", "selector": ".dashboard-content"}
    ]
)

# Screenshot is base64-encoded PNG
import base64
with open("screenshot.png", "wb") as f:
    f.write(base64.b64decode(result.screenshot))
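
The decode-and-write step above can be wrapped in a small helper that fails loudly when the data is missing or is not a PNG. This is a sketch only: `save_screenshot` is a hypothetical name, and it assumes `result.screenshot` is empty or `None` when no screenshot was captured, which the documentation does not confirm.

```python
import base64
from pathlib import Path

# Every valid PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_screenshot(screenshot_b64, path):
    """Decode a base64 PNG and write it to path; return the bytes written."""
    if not screenshot_b64:
        raise ValueError("no screenshot data (was a screenshot action included?)")
    # validate=True rejects characters outside the base64 alphabet.
    data = base64.b64decode(screenshot_b64, validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG")
    return Path(path).write_bytes(data)
```

With this in place, the last three lines of the example reduce to `save_screenshot(result.screenshot, "screenshot.png")`.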

Visual analysis with GPT-4o

Pass a screenshot directly to a multimodal model for visual analysis:

import openai
import hexr.browser
from hexr import hexr_llm

result = hexr.browser.browse(
    "https://dashboard.example.com",
    actions=[{"type": "screenshot"}]
)

client = hexr_llm(openai.OpenAI())
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what you see in this dashboard"},
            {"type": "image_url", "image_url": {
                "url": f"data:image/png;base64,{result.screenshot}"
            }}
        ]
    }]
)
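
When several screenshots go to the model, the message construction above is worth factoring out. The helper below is a minimal sketch that builds the same message shape used in the example; `image_message` is a hypothetical name, and the `mime` parameter is an added convenience that defaults to the PNG format the screenshot field is documented to use.

```python
def image_message(prompt, screenshot_b64, mime="image/png"):
    """Build one user message pairing a text prompt with an inline image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                # Data URL: the base64 payload is embedded directly in the request.
                "image_url": {"url": f"data:{mime};base64,{screenshot_b64}"},
            },
        ],
    }
```

The result can be passed as `messages=[image_message("Describe what you see in this dashboard", result.screenshot)]`.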

Async version

# Call from inside a coroutine (for example, one started with asyncio.run)
result = await hexr.browser.browse_async(
    "https://example.com",
    actions=[{"type": "screenshot"}]
)
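
The main reason to reach for the async variant is to browse several pages concurrently. The sketch below shows one way to do that with a concurrency cap. `browse_many` is a hypothetical helper, not part of hexr; it takes the browse coroutine function as an argument, so it makes no assumptions about hexr beyond the call shape shown above (a URL plus keyword arguments). The default limit of 4 is arbitrary.

```python
import asyncio


async def browse_many(browse, urls, *, limit=4, **kwargs):
    """Call `browse(url, **kwargs)` for each URL, at most `limit` at a time.

    Results come back in the same order as `urls`. A failed call yields its
    exception object in that position instead of raising.
    """
    semaphore = asyncio.Semaphore(limit)

    async def one(url):
        # The semaphore caps how many browse calls are in flight at once.
        async with semaphore:
            return await browse(url, **kwargs)

    return await asyncio.gather(*(one(u) for u in urls), return_exceptions=True)
```

Inside a coroutine this might be called as `results = await browse_many(hexr.browser.browse_async, urls, actions=[{"type": "screenshot"}])`, with each entry checked via `isinstance(entry, Exception)` before use.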

Security model

The browser runs inside the same Firecracker microVM environment as hexr.sandbox:

No SPIFFE identity inside the browser VM — it cannot impersonate your agent
No credential access — cloud APIs and Vault are unreachable from inside the VM
No cluster network — cannot reach Gateway, other agents, or internal services
Fresh VM per request — no persistent state, cookies, or session data carried between calls
Hardware isolation — KVM boundary, not just container namespaces

This is a managed browser running on your own Kubernetes infrastructure with Firecracker isolation — not a third-party cloud browser service. Your browsing data never leaves your cluster.
