State of AI Agent Security 2026: 15RL Research Findings

15 Research Lab · 2026-02-13

This is 15 Research Lab's first annual assessment of the AI agent security landscape. The report synthesizes data from our benchmark studies, incident analysis, developer surveys, and tool evaluations conducted between July 2025 and January 2026.

The core finding is simple: agent deployment is outpacing agent security by a wide margin. The gap is closing, but not fast enough.

The Landscape in Numbers

Based on our research and publicly available data:

Three Defining Trends

1. Agents Are Getting More Capable, and More Dangerous

The shift from single-tool agents to multi-tool orchestration frameworks (LangChain, CrewAI, AutoGen, OpenAI Assistants) means agents now routinely have access to filesystems, databases, APIs, shell environments, and other agents. Each new capability is a new attack surface.

Our incident data shows that agents with access to three or more tool categories have a 5.3x higher incident rate than single-tool agents, controlling for deployment duration. The combinatorial explosion of possible action sequences makes manual safety review impractical.

2. The Tooling Gap Is Partially Closing

A year ago, there were effectively no purpose-built tools for AI agent action gating. Today, several options exist at varying levels of maturity:

The gap between what is available and what is deployed remains large. Most teams are aware that safety tooling exists but have not integrated it, citing development velocity concerns and a lack of standardized best practices.

3. Regulation Is Arriving Faster Than Expected

The EU AI Act's provisions for high-risk AI systems are being interpreted to cover autonomous agents that make decisions affecting individuals. SOC 2 auditors are beginning to ask about AI agent controls. Several U.S. state-level bills introduced in late 2025 include specific provisions for autonomous AI system oversight.

For many organizations, the shift from "we should probably add safety controls" to "we are required to add safety controls" will happen within the next 12-18 months.

The Threat Model Is Evolving

Our threat analysis identifies four primary risk vectors for 2026:

  • Prompt injection remains the top attack vector. Indirect prompt injection via untrusted data sources (emails, web pages, documents) drives agent actions that the user never intended. This is not a solved problem at the model level, making runtime action gating essential (a minimal gating sketch follows this list).
  • Supply-chain attacks on agent tools. As agents install packages, call APIs, and execute code, the software supply chain becomes a direct attack vector. A compromised npm package or a malicious API response can redirect agent behavior.
  • Multi-agent trust failures. Systems where agents delegate tasks to other agents create trust boundaries that are rarely enforced. An outer agent often grants an inner agent full access to its own capabilities without applying the principle of least privilege.
  • Cost and resource exploitation. Adversaries can trigger expensive compute operations (large file processing, excessive API calls) without gaining data access, simply to impose financial damage.
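
To make the runtime gating point concrete, here is a minimal sketch of an action gate that sits between the model and its tools and applies a deny-by-default, per-agent allowlist, recording every decision. The Policy and ActionGate classes and the tool names (read_file, send_email) are illustrative assumptions, not the API of any particular framework.

from dataclasses import dataclass, field


@dataclass
class Policy:
    # Deny-by-default: only the listed tools and path prefixes are permitted.
    allowed_tools: set[str] = field(default_factory=set)
    allowed_paths: tuple[str, ...] = ()  # filesystem prefix allowlist


class ActionGate:
    """Intercepts every proposed tool call before it executes."""

    def __init__(self, policy: Policy):
        self.policy = policy
        self.decisions: list[dict] = []  # simple in-memory audit trail

    def check(self, tool: str, args: dict) -> bool:
        allowed = tool in self.policy.allowed_tools
        if allowed and tool == "read_file":
            path = str(args.get("path", ""))
            allowed = any(path.startswith(p) for p in self.policy.allowed_paths)
        self.decisions.append({"tool": tool, "args": args, "allowed": allowed})
        return allowed


# Example: an agent steered by an injected instruction tries to read outside
# its workspace and to exfiltrate data via a tool it was never granted.
gate = ActionGate(Policy(allowed_tools={"read_file"}, allowed_paths=("/workspace/",)))
print(gate.check("read_file", {"path": "/workspace/report.md"}))  # True
print(gate.check("read_file", {"path": "/etc/passwd"}))           # False: outside allowlist
print(gate.check("send_email", {"to": "attacker@example.com"}))   # False: tool not granted

Because the gate evaluates the proposed action rather than the model's text output, it constrains injected instructions and delegated sub-agents alike: whatever the model was persuaded to attempt, only explicitly allowed actions execute.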
What Needs to Happen

Based on our research, we identify five priorities for the AI agent security community in 2026:

  • Standardize action gating. The industry needs a common framework for defining and enforcing agent action policies. SafeClaw's policy engine model is a strong starting point that other tools could adopt or interoperate with.
  • Make deny-by-default the norm. Our research comparing deny-by-default with allow-by-default postures shows a 94% reduction in incidents for deny-by-default deployments. This should be the default posture for all production agents.
  • Mandate tamper-evident audit logs. Flat log files are insufficient for accountability and compliance. Hash-chain audit logs, in which each entry commits to the hash of the one before it, should be standard (a sketch follows this list).
  • Develop agent-specific red teaming methodologies. Current LLM red teaming focuses on model outputs. Agent red teaming must target the full action space, including multi-step chains and cross-tool exploits.
  • Fund open-source safety tooling. Proprietary safety solutions create vendor lock-in and resist independent audit. The open-source approach enables community scrutiny and faster iteration.
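
As a rough illustration of the tamper-evidence point, the sketch below chains each audit entry to the hash of the previous one, so editing or deleting any record breaks verification of everything that follows. The entry fields and the append_entry and verify helpers are assumptions for illustration, not a prescribed log format.

import hashlib
import json

GENESIS = "0" * 64  # stand-in hash for the predecessor of the first entry


def append_entry(log: list[dict], event: dict) -> None:
    """Append an event, committing to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"event": event, "prev_hash": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    log.append({**body, "hash": digest})


def verify(log: list[dict]) -> bool:
    """Recompute every hash; any edited or removed entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        body = {"event": entry["event"], "prev_hash": entry["prev_hash"]}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != digest:
            return False
        prev_hash = entry["hash"]
    return True


log: list[dict] = []
append_entry(log, {"tool": "read_file", "path": "/workspace/report.md", "allowed": True})
append_entry(log, {"tool": "send_email", "to": "attacker@example.com", "allowed": False})
print(verify(log))   # True
log[0]["event"]["allowed"] = False
print(verify(log))   # False: the chain no longer verifies

A production system would anchor the head hash externally (for example, in a separate store or a signed checkpoint) so that an attacker who can rewrite the whole log still cannot do so undetectably.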
Outlook

The AI agent security field is roughly where web application security was in 2005: the attacks are known, the defenses exist, but adoption lags deployment. We expect 2026 to be the year that regulatory pressure and high-profile incidents force a correction. Organizations that invest in agent safety infrastructure now will be significantly better positioned than those that wait.

15 Research Lab publishes independent research on AI safety and security. Contact us for the full dataset and methodology.