hexr.browser gives your agents a full headless Chromium browser running inside a Firecracker microVM. You can navigate to any URL, interact with page elements, take screenshots, and extract text — all with the same hardware-level isolation as hexr.sandbox. Because the browser runs in a fresh VM per request with no access to your agent’s SPIFFE identity or cloud credentials, it is safe to use with LLM-generated URLs and instructions.

Quick start

import hexr.browser

result = hexr.browser.browse(
    "https://news.ycombinator.com",
    actions=[
        {"type": "extract_text", "selector": ".titleline > a"}
    ]
)

print(f"Page title: {result.title}")
for item in result.action_results:
    print(item)

API

hexr.browser.browse()

hexr.browser.browse(
    url: str,
    *,
    actions: list[dict] | None = None,
    timeout: int = 60,
    viewport_width: int = 1280,
    viewport_height: int = 720
) -> BrowseResult

url (str, required): The URL to navigate to.
actions (list[dict], default: None): Sequence of browser actions to perform after the initial page load. See Actions for supported types.
timeout (int, default: 60): Maximum total execution time in seconds.
viewport_width (int, default: 1280): Browser viewport width in pixels.
viewport_height (int, default: 720): Browser viewport height in pixels.
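
Note the unit mismatch: timeout on browse() is in seconds, while the wait action takes milliseconds. The sketch below is one way to sanity-check an action list before submitting it. The helper name wait_budget_ok is hypothetical and not part of hexr, and it assumes that explicit waits count toward the overall timeout, which the "maximum total execution time" wording suggests but does not confirm.

```python
def wait_budget_ok(actions, timeout_s=60):
    """Return True if the explicit waits fit inside the overall timeout.

    Assumption: "wait" actions carry a "timeout" field in milliseconds,
    and the browse() timeout is given in seconds.
    """
    total_wait_ms = sum(
        a.get("timeout", 0) for a in actions if a.get("type") == "wait"
    )
    return total_wait_ms < timeout_s * 1000


actions = [
    {"type": "click", "selector": "#submit"},
    {"type": "wait", "timeout": 3000},
    {"type": "screenshot"},
]

print(wait_budget_ok(actions, timeout_s=60))  # 3000 ms against 60000 ms
print(wait_budget_ok(actions, timeout_s=2))   # 3000 ms against 2000 ms
```

A check like this only covers explicit waits. Page loads and other actions also take time, so leave headroom.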

BrowseResult

result.url              # Final URL after any redirects
result.title            # Page title
result.text             # Full page text content
result.screenshot       # Base64-encoded PNG screenshot
result.action_results   # Results from each action in the sequence
result.duration_ms      # Total execution time in milliseconds
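
As a rough illustration of working with these fields, the sketch below decodes the screenshot and formats a one-line summary. The helpers decode_screenshot and summarize are hypothetical, and the SimpleNamespace object is only a stand-in for a real BrowseResult so the snippet runs without hexr. Treating a missing screenshot as None or an empty string is an assumption.

```python
import base64
from types import SimpleNamespace

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def decode_screenshot(result):
    """Decode result.screenshot to PNG bytes, or return None if absent."""
    if not getattr(result, "screenshot", None):
        return None
    data = base64.b64decode(result.screenshot)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("screenshot is not PNG data")
    return data


def summarize(result):
    """Format the final URL, title, and duration as a single line."""
    secs = result.duration_ms / 1000
    return f"{result.title} ({result.url}) loaded in {secs:.1f}s"


# Stand-in for a BrowseResult, for illustration only
fake = SimpleNamespace(
    url="https://example.com/",
    title="Example Domain",
    text="...",
    screenshot=base64.b64encode(PNG_SIGNATURE + b"rest").decode("ascii"),
    action_results=[],
    duration_ms=1234,
)

print(summarize(fake))
png_bytes = decode_screenshot(fake)
```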

Actions

| Action type | Fields | Description |
|---|---|---|
| navigate | value (URL) | Navigate to a new URL |
| click | selector | Click an element matching the CSS selector |
| type | selector, value | Type text into an input field |
| screenshot | (none) | Capture a full-page screenshot |
| extract_text | selector | Extract text from all matching elements |
| wait | timeout (ms) | Wait for a duration before the next action |
| scroll | value (pixels) | Scroll the page by a number of pixels |
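
Because actions are plain dicts, a typo in a field name is easy to miss until the request runs. The following sketch checks an action list against the fields listed above before you call browse(). The names REQUIRED_FIELDS and validate_actions are hypothetical, and the check reflects only what this page documents, so the service may accept or require fields not shown here.

```python
# Required fields per action type, as listed in the Actions table
REQUIRED_FIELDS = {
    "navigate": {"value"},
    "click": {"selector"},
    "type": {"selector", "value"},
    "screenshot": set(),
    "extract_text": {"selector"},
    "wait": {"timeout"},
    "scroll": {"value"},
}


def validate_actions(actions):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for i, action in enumerate(actions):
        kind = action.get("type")
        if kind not in REQUIRED_FIELDS:
            problems.append(f"action {i}: unknown type {kind!r}")
            continue
        missing = REQUIRED_FIELDS[kind] - set(action)
        for field in sorted(missing):
            problems.append(f"action {i} ({kind}): missing {field!r}")
    return problems


print(validate_actions([
    {"type": "type", "selector": "#username"},  # missing "value"
    {"type": "screenshot"},
]))
```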

Examples

Web research agent

from urllib.parse import quote

from hexr import hexr_agent, hexr_llm
import hexr.browser
import openai

@hexr_agent(name="web-researcher", tenant="acme-corp")
def research(topic: str) -> str:
    # Browse and extract content; URL-encode the topic so spaces and
    # punctuation in it still yield a valid URL
    result = hexr.browser.browse(
        f"https://en.wikipedia.org/wiki/{quote(topic)}",
        actions=[
            {"type": "extract_text", "selector": "#mw-content-text p"}
        ]
    )
    
    # Analyze with LLM
    client = hexr_llm(openai.OpenAI())
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Summarize the following content"},
            {"role": "user", "content": result.text[:4000]}
        ]
    )
    
    return response.choices[0].message.content

Form submission

result = hexr.browser.browse(
    "https://example.com/login",
    actions=[
        {"type": "type", "selector": "#username", "value": "agent@hexr.dev"},
        {"type": "type", "selector": "#password", "value": "token-from-vault"},
        {"type": "click", "selector": "#submit"},
        {"type": "wait", "timeout": 3000},
        {"type": "screenshot"},
        {"type": "extract_text", "selector": ".dashboard-content"}
    ]
)

# Screenshot is base64-encoded PNG
import base64
with open("screenshot.png", "wb") as f:
    f.write(base64.b64decode(result.screenshot))

Visual analysis with GPT-4o

Pass a screenshot directly to a multimodal model for visual analysis:
from hexr import hexr_llm
import hexr.browser
import openai

result = hexr.browser.browse(
    "https://dashboard.example.com",
    actions=[{"type": "screenshot"}]
)

client = hexr_llm(openai.OpenAI())
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what you see in this dashboard"},
            {"type": "image_url", "image_url": {
                "url": f"data:image/png;base64,{result.screenshot}"
            }}
        ]
    }]
)

Async version

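import asyncio
import hexr.browser

async def main():
    # await is only valid inside a coroutine
    return await hexr.browser.browse_async(
        "https://example.com",
        actions=[{"type": "screenshot"}]
    )

result = asyncio.run(main())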
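
The async variant is most useful when you need several pages at once. Below is a minimal sketch of bounded concurrent browsing. browse_many is a hypothetical helper, and fake_browse stands in for hexr.browser.browse_async so the snippet runs without hexr. Whether the service enforces its own concurrency limits is not covered here.

```python
import asyncio


async def browse_many(urls, browse_fn, max_concurrent=3):
    """Run browse_fn(url) for each URL, at most max_concurrent at a time."""
    gate = asyncio.Semaphore(max_concurrent)

    async def one(url):
        async with gate:
            return await browse_fn(url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(one(u) for u in urls))


async def fake_browse(url):
    # Stand-in for hexr.browser.browse_async, for illustration only
    await asyncio.sleep(0)
    return f"fetched {url}"


results = asyncio.run(
    browse_many(["https://a.example", "https://b.example"], fake_browse)
)
print(results)
```

In real use you would pass hexr.browser.browse_async as browse_fn, wrapping it with functools.partial if every call needs the same actions.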

Security model

The browser runs inside the same Firecracker microVM environment as hexr.sandbox:
  • No SPIFFE identity inside the browser VM — it cannot impersonate your agent
  • No credential access — cloud APIs and Vault are unreachable from inside the VM
  • No cluster network — cannot reach Gateway, other agents, or internal services
  • Fresh VM per request — no persistent state, cookies, or session data carried between calls
  • Hardware isolation — KVM boundary, not just container namespaces
This is a managed browser running on your own Kubernetes infrastructure with Firecracker isolation — not a third-party cloud browser service. Your browsing data never leaves your cluster.