AI agents introduce attack vectors that traditional application security doesn’t address. An agent that can call cloud APIs, execute code, and communicate with other agents creates a larger attack surface than a stateless web service. Hexr’s security architecture is designed specifically around these agent-specific threats — with multiple mitigations for each attack chain so that no single failure exposes your systems.

Agent-specific threats

1. Prompt injection → credential theft

The attack: Adversarial content in a retrieved document, web page, or tool response tricks your LLM into calling hexr_tool() with attacker-controlled parameters, attempting to exfiltrate data or escalate privileges.

How Hexr protects you:
  • LLM Guard scans all prompts before they reach the LLM, blocking known injection patterns
  • OPA policies restrict which services each agent role can access — even a successful injection can only reach the agent’s authorized services
  • Credential scoping means a manipulated agent cannot access services outside its declared resources list
  • Short-lived credentials — any credential obtained through a successful attack expires in 15 minutes
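Taken together, scoping and short lifetimes mean an injected tool call has to clear two independent checks. The sketch below illustrates that layering in Python; the role names, resource strings, and function are hypothetical stand-ins, not Hexr's actual API.

```python
import time

# Illustrative only: a declared-resources map like the "resources list"
# described above, keyed by agent role (names are invented).
AGENT_RESOURCES = {
    "research-agent": {"s3:read-reports", "search:query"},
}

CREDENTIAL_TTL_SECONDS = 15 * 60  # short-lived: expires in 15 minutes

def authorize_tool_call(agent_role, requested_resource, credential_issued_at):
    """Deny calls outside the agent's declared resources or with stale credentials."""
    allowed = AGENT_RESOURCES.get(agent_role, set())
    if requested_resource not in allowed:
        return False  # a successful injection stays inside the declared scope
    if time.time() - credential_issued_at > CREDENTIAL_TTL_SECONDS:
        return False  # a stolen credential is useless after 15 minutes
    return True
```

Note that both checks fail closed: an unknown role has an empty resource set, so nothing is authorized by default.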

2. Tool confusion → unauthorized access

The attack: An agent is manipulated into calling the wrong tool or accessing a resource outside its intended scope.

How Hexr protects you:
  • SPIFFE identity — OPA verifies the specific process identity for every call, not just the pod or tenant
  • Gateway validation — tool calls are validated against the agent’s registered capabilities before execution
  • Audit trail — every tool call is logged with the full SPIFFE context, so you can trace exactly what was called and why
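Conceptually, the gateway combines an identity-keyed capability check with an audit record for every decision. This is a minimal sketch of that pattern, assuming invented SPIFFE IDs and tool names; it is not Hexr's real interface.

```python
# Hypothetical capability registry keyed by full SPIFFE process identity,
# mirroring the "registered capabilities" check described above.
REGISTERED_CAPABILITIES = {
    "spiffe://example.org/ns/agents/sa/billing-agent": {"invoice.read", "invoice.create"},
}

audit_log = []

def validate_tool_call(spiffe_id, tool_name):
    """Allow only registered tools for this identity, and log every attempt."""
    allowed = tool_name in REGISTERED_CAPABILITIES.get(spiffe_id, set())
    # Every call is recorded with its SPIFFE context, allowed or not.
    audit_log.append({"spiffe_id": spiffe_id, "tool": tool_name, "allowed": allowed})
    return allowed
```

Because the registry is keyed by the full SPIFFE ID rather than a pod or tenant label, two agents in the same namespace can still have disjoint tool sets.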

3. Agent-to-agent manipulation

The attack: A compromised agent sends malicious tasks to other agents in your system, attempting to propagate the compromise or exfiltrate data through a trusted channel.

How Hexr protects you:
  • mTLS — all agent-to-agent communication requires mutual TLS with SPIFFE certificates; an attacker cannot fake another agent’s identity
  • OPA policies — you can control which agents are permitted to communicate with which other agents
  • Task validation — the A2A sidecar validates message schema before forwarding tasks to the target agent
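The schema-validation step can be pictured as a strict shape check on each task message before it is forwarded. The field names and types below are invented for illustration; a real A2A sidecar would validate against the actual protocol schema.

```python
# Hypothetical required shape of a task message (illustrative, not the
# real A2A schema): exact field set, with expected types.
REQUIRED_FIELDS = {"task_id": str, "sender": str, "payload": dict}

def validate_task(message):
    """Reject malformed or unexpected task messages before forwarding."""
    if not isinstance(message, dict):
        return False
    if set(message) != set(REQUIRED_FIELDS):
        return False  # no missing fields, and no smuggled extra fields
    return all(isinstance(message[k], t) for k, t in REQUIRED_FIELDS.items())
```

Rejecting extra fields, not just missing ones, matters here: a compromised agent cannot piggyback unexpected data onto an otherwise valid task.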

4. Secret exfiltration

The attack: Agent code or LLM-generated output leaks stored secrets, API keys, or sensitive data.

How Hexr protects you:
  • Vault scoping — secrets are stored per-agent and per-role; an agent cannot read secrets that belong to a different process identity
  • PII scanner — LLM Guard catches secrets and sensitive data in LLM responses before they reach your agent’s output
  • No environment variables — secrets are never injected as env vars, eliminating a common leakage vector
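To make the output-scanning layer concrete, here is a toy redaction pass over LLM output. The two patterns are just examples (an AWS-style access key ID and a PEM private-key header); a production scanner like LLM Guard uses far broader pattern sets plus ML-based detectors.

```python
import re

# Illustrative secret shapes only -- not an exhaustive or production list.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                 # AWS access key ID shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private-key header
]

def redact_output(text):
    """Mask anything that looks like a credential before it leaves the agent."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Scanning happens on the response path, so even a prompt-injected request that succeeds in eliciting a secret from the model is caught before the agent emits it.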

5. Code execution escape

The attack: LLM-generated code executed by your agent escapes its execution environment and accesses the host system or other tenants.

How Hexr protects you:
  • Firecracker microVM — LLM-generated code runs with hardware-level isolation in a Firecracker microVM, not in a container or subprocess
  • No network — the sandbox has no outbound connectivity; generated code cannot exfiltrate data over the network
  • Resource limits — CPU and memory limits prevent resource exhaustion attacks
  • Destroyed after use — the microVM is destroyed after each execution, with no persistent state
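As a rough conceptual approximation only: Hexr's actual isolation is a Firecracker microVM, which a short snippet cannot reproduce. The sketch below shows just the resource-limit and destroy-after-use ideas using OS-level limits on a throwaway subprocess (Unix-specific; the function name and limits are invented).

```python
import resource
import subprocess
import sys

def run_untrusted(code, cpu_seconds=2, memory_bytes=512 * 1024 * 1024):
    """Run generated code in a fresh process with CPU/memory caps, then discard it."""
    def set_limits():
        # Applied in the child before exec: hard caps on CPU time and memory.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))

    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=30, preexec_fn=set_limits,
    )
    # The process exits here -- like the microVM, nothing persists between runs.
    return proc.returncode, proc.stdout
```

The real design is stronger on every axis: a microVM gives hardware-level isolation and no network at all, whereas OS resource limits only bound consumption within a shared kernel.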

Threat matrix

| Threat | OWASP GenAI category | Hexr security layer | Severity |
|---|---|---|---|
| Prompt injection | LLM01 | LLM Guard | Critical |
| Credential theft | LLM07 | Credential scoping + OPA | Critical |
| Data exfiltration | LLM06 | PII scanner + vault scoping | High |
| Agent impersonation | | SPIFFE mTLS | High |
| Cross-tenant data access | | Kubernetes namespace isolation | High |
| Code execution escape | | Firecracker sandbox | High |
| Denial of service | | OPA rate limiting | Medium |
| Model manipulation | LLM03 | Separate LLM Guard service | Medium |
The OWASP Top 10 for LLM Applications categorizes the most common risks in LLM-based systems. Hexr’s multi-layer architecture addresses all ten categories — see the compliance frameworks page for the complete mapping.