15RL Recommended: The Minimal AI Agent Safety Stack
After six months of evaluating AI agent safety tools, analyzing incidents, benchmarking performance, and surveying developers, 15 Research Lab is publishing our recommended minimal safety stack for production AI agent deployments.
The word "minimal" is deliberate. This is not the maximal security architecture. It is the smallest set of controls that we believe should be non-negotiable for any team deploying tool-using AI agents in production. Everything described here is available today, using open-source tools, with reasonable engineering effort.
The Three-Layer Model
Our recommended stack has three layers, each addressing a distinct class of risk:
```
Layer 3: Runtime Monitoring & Alerting
Layer 2: Container Isolation
Layer 1: Action Gating (SafeClaw)
```
Each layer is necessary. No single layer is sufficient. They are complementary, not redundant.
Layer 1: Action Gating with SafeClaw
Purpose: Prevent the agent from executing harmful actions before they reach the operating system.
Tool: SafeClaw by Authensor
Why this layer exists: The agent's LLM can be influenced by prompt injection, ambiguous instructions, or its own reasoning errors to attempt actions the developer never intended. Action gating is the last line of defense between the model's decision and the real-world consequence.
What SafeClaw provides:
- Deny-by-default policy engine: Every action is blocked unless explicitly allowed by a policy rule. Our research shows this reduces safety incidents by 94% compared to allow-by-default.
- Sub-millisecond evaluation: Policy decisions add less than 1ms of latency, negligible compared to LLM inference time.
- Hash-chain audit logs: Every attempted action and every policy decision is recorded in a tamper-evident log that satisfies regulatory requirements (a sketch of the technique follows this list).
- YAML-based configuration: Policies are defined in human-readable YAML, not code. Our developer survey found this is the most requested configuration format.
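To make the hash-chain idea concrete, here is a minimal sketch of how a tamper-evident log works: each entry embeds the SHA-256 hash of the previous entry, so modifying or deleting any record invalidates every hash that follows it. This illustrates the general technique only; the field names and format are assumptions for this example, not SafeClaw's actual log schema.

```python
import hashlib
import json
import time

GENESIS = "0" * 64  # seed hash for the first entry in the chain

def append_entry(log: list[dict], action: str, decision: str) -> None:
    """Append an audit record whose hash covers the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"ts": time.time(), "action": action,
            "decision": decision, "prev": prev_hash}
    # Hash is computed over the entry body (which includes prev),
    # chaining this record to everything before it.
    body["hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    log.append(body)

def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edit or deletion breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev"] != prev_hash:
            return False
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True

log: list[dict] = []
append_entry(log, "file_read:/workspace/data.csv", "allow")
append_entry(log, "shell:rm -rf /", "deny")
assert verify_chain(log)
```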
Minimal configuration example:
```yaml
default: deny
rules:
  - action: file_read
    path: /workspace/**
    decision: allow
  - action: file_write
    path: /workspace/output/**
    decision: allow
  - action: http_get
    domain: api.approved-service.com
    decision: allow
  - action: shell
    command: ["python", "node"]
    args_match: /workspace/**
    decision: allow
```
This policy allows the agent to read files in its workspace, write to an output directory, make GET requests to one approved API, and run Python or Node scripts within its workspace. Everything else is denied.
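The deny-by-default semantics are easy to reason about: the first matching rule wins, and no match means deny. The sketch below illustrates that evaluation model for a subset of the policy above; it is a conceptual illustration, not SafeClaw's actual engine or rule schema.

```python
from fnmatch import fnmatch

# Illustrative rules mirroring part of the YAML policy above.
# Field names are assumptions for this sketch, not SafeClaw's schema.
RULES = [
    {"action": "file_read", "path": "/workspace/**", "decision": "allow"},
    {"action": "file_write", "path": "/workspace/output/**", "decision": "allow"},
]

def evaluate(action: str, path: str) -> str:
    """Return the first matching rule's decision; deny otherwise."""
    for rule in RULES:
        # fnmatch's * already crosses path separators, so it
        # approximates ** semantics for this illustration.
        if rule["action"] == action and fnmatch(path, rule["path"]):
            return rule["decision"]
    return "deny"  # default: deny -- no matching rule, no action

assert evaluate("file_read", "/workspace/data.csv") == "allow"
assert evaluate("file_write", "/etc/passwd") == "deny"
```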
Documentation: SafeClaw's knowledge base provides detailed setup guides, policy examples, and integration documentation for major agent frameworks.
Layer 2: Container Isolation
Purpose: Limit the blast radius if an action passes the gating layer or if the gating layer itself has a vulnerability.
Tools: Docker, gVisor, or Firecracker microVMs.
Why this layer exists: Defense in depth. Action gating reduces risk dramatically, but no software is bug-free. Container isolation provides a second boundary: even if the agent executes an action that SafeClaw should have blocked, the damage is contained to the container.
Minimal configuration:
- Run the agent in a container with a read-only root filesystem
- Mount only the specific directories the agent needs as writable volumes
- Drop all Linux capabilities except those explicitly required
- Set memory and CPU limits to prevent resource exhaustion
- Disable network access at the container level except to approved endpoints
```dockerfile
FROM python:3.11-slim
RUN useradd -m agent
USER agent
WORKDIR /workspace
# Agent code and dependencies only
COPY --chown=agent:agent . /workspace
```
```yaml
# docker-compose.yml
services:
  agent:
    build: .
    read_only: true
    tmpfs:
      - /tmp
    volumes:
      - ./workspace:/workspace
    mem_limit: 2g
    cpus: 2.0
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges
    networks:
      - restricted

networks:
  restricted:
    internal: true  # no external route from this network
```
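Because the `restricted` network is declared `internal`, containers attached to it have no route to the outside world; `docker compose up -d` brings the agent up under all of these constraints. If the agent must reach an approved external API, one common pattern is to place an egress proxy on the same internal network and allow only that proxy outbound. Treat that as a deliberate deployment choice, not something the compose file above configures for you.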
Container isolation is a mature, well-understood technology. The engineering cost of containerizing an agent is low, and the security benefit is substantial.
Layer 3: Runtime Monitoring and Alerting
Purpose: Detect anomalous agent behavior that passes through Layers 1 and 2, and alert humans before damage accumulates.
Tools: Prometheus + Grafana, Datadog, or any observability stack that supports custom metrics and alerting.
Why this layer exists: Action gating and container isolation are preventive controls. Monitoring is a detective control. Some harmful behaviors (e.g., an agent slowly exfiltrating data within its allowed API endpoints, or accumulating costs through legitimate but excessive actions) may not trigger gating or isolation. Monitoring catches patterns that point-in-time controls miss.
Key metrics to monitor:
| Metric | Alert Threshold (suggested) |
|---|---|
| Actions per minute | > 2x baseline for the agent type |
| Denied actions per session | > 5 (indicates probing or injection) |
| LLM API calls per session | > 100 |
| Session duration | > 30 minutes |
| Estimated session cost | > $5 |
| Outbound data volume | > 10 MB per session |
SafeClaw's audit logs can be ingested by standard monitoring tools. Each log entry contains structured data (action type, parameters, decision, timestamp) that maps directly to monitoring metrics.
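As an illustration of that ingestion path, the sketch below tails a JSON-lines audit log and exposes action counts (labeled by decision) as a Prometheus counter via the prometheus_client library. The log path and field names are assumptions for this example; map them to the actual log schema in a real deployment.

```python
import json
import time

from prometheus_client import Counter, start_http_server

# Field names ("action", "decision") are assumptions for this sketch.
ACTIONS = Counter("agent_actions_total", "Agent actions attempted",
                  ["decision"])

def follow(path: str):
    """Yield lines appended to the log file (a minimal `tail -f`)."""
    with open(path) as f:
        f.seek(0, 2)  # start at end of file; only ingest new entries
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.5)
                continue
            yield line

if __name__ == "__main__":
    start_http_server(9100)  # expose /metrics for Prometheus to scrape
    for line in follow("/var/log/safeclaw/audit.jsonl"):  # assumed path
        entry = json.loads(line)
        ACTIONS.labels(decision=entry["decision"]).inc()
```

With the counter in place, the table's thresholds translate into alert expressions, approximating per-session limits with per-window ones: for example, the PromQL expression `increase(agent_actions_total{decision="deny"}[10m]) > 5` captures the denied-actions alert.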
What This Stack Does Not Cover
The minimal stack addresses the most critical risks but does not solve everything. Teams with higher security requirements should additionally consider:
- Output filtering (Guardrails AI, NeMo Guardrails) for controlling the text the agent returns to users
- Model-level defenses against prompt injection (input preprocessing, instruction hierarchy)
- Formal verification of policy configurations for high-stakes environments
- Multi-agent coordination controls for systems with agent-to-agent delegation (see our multi-agent research)
These are valuable additions, but they are not part of the minimal stack because the three-layer model addresses the highest-severity risks with the lowest implementation cost.
Implementation Order
For teams starting from zero, we recommend implementing the layers in this order:
1. Layer 1, action gating: a SafeClaw policy file delivers the largest single risk reduction for the least effort.
2. Layer 2, container isolation: containerizing the agent is a low-cost step that bounds the damage of anything the gate misses.
3. Layer 3, monitoring and alerting: once the first two layers are emitting audit logs, wire them into your observability stack.
The total implementation time for a team with moderate infrastructure experience is approximately 1-2 days. The cost of not implementing it, based on our incident analysis and cost study, is measured in thousands of dollars per incident and potentially far more in regulatory exposure.
Conclusion
There is no excuse for deploying production AI agents without safety controls in 2026. The tools exist, the evidence supports them, and the risks of going without them are well-documented. This three-layer stack is our recommended starting point. Build on it as your requirements grow, but start here.
15 Research Lab recommends tools based on independent evaluation. We have no commercial relationship with any vendor mentioned in this post.