15RL Framework: Incident Response for AI Agent Failures

15 Research Lab · 2026-02-13

Introduction

Traditional incident response frameworks (NIST SP 800-61, SANS, ITIL) were designed for infrastructure and application failures. AI agent incidents differ in ways that call for adapted procedures: failures may be non-deterministic, root cause analysis has to examine the agent's reasoning chain, remediation is largely policy-based, and the agent may be actively making the situation worse during the response window. 15 Research Lab developed this incident response framework specifically for AI agent failures, incorporating lessons from 42 reconstructed incidents.

Agent Incident Characteristics

AI agent incidents differ from traditional incidents in four key ways:

The agent may be actively causing damage during the response window — unlike a crashed server, a malfunctioning agent continues to take actions

Root cause analysis requires understanding the agent's reasoning, not just its technical behavior

The same configuration may not produce the same failure — non-deterministic behavior means the incident may not be reproducible

Remediation involves policy changes, not just patches or configuration fixes

The 15RL Agent Incident Response Process

Phase 1: Detection

Objective: Identify that an agent incident is occurring.

Detection sources:

Safety policy violations flagged by the action gating system
Anomalous agent behavior detected by monitoring (unusual action patterns, elevated error rates, unexpected resource access)
Cost alerts triggered by spending anomalies
User reports of unexpected agent behavior
Audit log analysis revealing policy-violating actions

Critical action: Upon detection, immediately assess whether the agent is still running and actively causing damage.
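As a rough illustration of how the detection sources above might be combined, the following Python sketch checks a session's recent actions for policy violations, an unusual action rate, and an elevated error rate. The `AgentAction` record, the `detect_incident` function, and all thresholds are hypothetical and chosen only for illustration; they are not part of the 15RL framework or of any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class AgentAction:
    timestamp: float               # seconds since session start (assumed unit)
    tool: str                      # name of the tool the agent invoked
    policy_violation: bool = False # flagged by the action gating system
    error: bool = False            # the tool call failed


def detect_incident(actions, window_s=60.0, max_actions=30, max_error_rate=0.5):
    """Return the reasons an incident is suspected (empty list if none)."""
    reasons = []
    if any(a.policy_violation for a in actions):
        reasons.append("policy_violation")
    if actions:
        latest = actions[-1].timestamp
        # Only consider actions inside the most recent time window.
        recent = [a for a in actions if latest - a.timestamp <= window_s]
        if len(recent) > max_actions:
            reasons.append("action_rate")
        # Require a few samples so one failed call does not trip the alarm.
        if len(recent) >= 4 and sum(a.error for a in recent) / len(recent) > max_error_rate:
            reasons.append("error_rate")
    return reasons
```

Any non-empty result would lead straight to the critical action above: checking whether the agent is still running.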

Phase 2: Containment

Objective: Stop the agent from causing additional damage.

Immediate containment (within minutes):

Halt the agent session — terminate the active session immediately

Revoke agent credentials — disable API keys, tokens, and service accounts used by the agent

Isolate affected systems — if the agent has modified files, databases, or configurations, prevent further access to affected resources

Preserve evidence — snapshot the agent's audit logs, context state, and any artifacts before they can be overwritten or expire
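One way to encode the immediate containment steps is as an ordered runbook in which a failure in one step does not block the others. This is a minimal sketch: `run_containment` and the step names are hypothetical, and the callables stand in for whatever session, credential, and storage APIs a given deployment actually exposes.

```python
from typing import Callable


def run_containment(steps: list[tuple[str, Callable[[], None]]]) -> dict[str, str]:
    """Run every containment step in order and record each outcome.

    A failing step is recorded and skipped so that later steps still run.
    """
    results: dict[str, str] = {}
    for name, step in steps:
        try:
            step()
            results[name] = "ok"
        except Exception as exc:  # keep going: partial containment beats none
            results[name] = f"failed: {exc}"
    return results


# Hypothetical usage; each lambda stands in for a real call in your environment.
if __name__ == "__main__":
    outcome = run_containment([
        ("halt_session", lambda: None),
        ("revoke_credentials", lambda: None),
        ("isolate_systems", lambda: None),
        ("preserve_evidence", lambda: None),
    ])
    print(outcome)
```

Recording a per-step outcome gives responders an immediate view of which steps still need manual follow-up.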

Extended containment (within hours):

Disable the agent configuration — prevent the agent from being relaunched with the same (flawed) configuration

Notify affected parties — inform users, customers, or stakeholders who may be affected

Engage compliance/legal if the incident involves regulated data or regulatory obligations

Phase 3: Analysis

Objective: Understand what happened, why, and what damage occurred.

Step 3a: Impact Assessment

What systems were affected?
What data was accessed, modified, or exposed?
What is the blast radius (number of affected users, systems, records)?
Is there ongoing risk (exposed credentials, modified configurations)?
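The blast radius question can be answered mechanically once audit entries record which resource each action touched. The sketch below assumes a hypothetical entry shape with `kind` (for example `user`, `system`, or `record`), `resource`, and `effect` fields; real audit logs would need to be mapped into something similar first.

```python
from collections import defaultdict


def blast_radius(entries, effects=("accessed", "modified", "exposed")):
    """Count distinct affected resources per kind, limited to the given effects."""
    affected = defaultdict(set)
    for entry in entries:
        if entry["effect"] in effects:
            # A set de-duplicates resources touched more than once.
            affected[entry["kind"]].add(entry["resource"])
    return {kind: len(resources) for kind, resources in affected.items()}
```

Passing a narrower `effects` tuple, such as `("modified",)`, separates data that was changed from data that was only read.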

Step 3b: Timeline Reconstruction

Using audit logs, reconstruct the complete sequence of agent actions from the beginning of the session to containment. Identify the first anomalous action, which typically occurs 5-15 actions before the incident is detected.
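Timeline reconstruction can be sketched as sorting the audit entries and scanning for the first one that matches an anomaly predicate. The entry fields (`ts`, `tool`, `path`), both function names, and the predicate itself are assumptions made for this example; what counts as anomalous depends on the deployment.

```python
def reconstruct_timeline(entries):
    """Order audit entries by timestamp (logs may arrive out of order)."""
    return sorted(entries, key=lambda entry: entry["ts"])


def first_anomalous_action(timeline, is_anomalous, detected_at):
    """Return (index, lag) for the first anomalous action at or before detection.

    lag is the number of actions the agent took after that action, up to and
    including the detection time. Returns None if nothing anomalous is found.
    """
    in_window = [e for e in timeline if e["ts"] <= detected_at]
    for index, entry in enumerate(in_window):
        if is_anomalous(entry):
            return index, len(in_window) - index - 1
    return None
```

The returned lag is the quantity the paragraph above refers to: how many further actions the agent took before anyone noticed.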

Step 3c: Root Cause Determination

Classify the root cause into one of five categories:

| Root Cause Category | Description | Frequency in 15RL Data |
|---|---|---|
| Missing policy | No policy covered the dangerous action | 38% |
| Misconfigured policy | Policy existed but was too permissive | 26% |
| Prompt injection | External input manipulated agent behavior | 19% |
| Tool definition error | Tool schema allowed dangerous parameters | 12% |
| Model behavior change | Model update changed agent behavior | 5% |

Step 3d: Reasoning Chain Analysis

If prompt context is available in audit logs, analyze the agent's reasoning path. What information did the agent have? What decision did it make? Was the decision reasonable given the information, or did the agent misinterpret its instructions?

Phase 4: Remediation

Objective: Fix the root cause and prevent recurrence.

Policy remediation:

For missing policies: Add new policy rules covering the identified gap
For misconfigured policies: Tighten policy parameters based on the specific failure
For prompt injection: Add input validation and context isolation controls
For tool definition errors: Restrict tool schemas to minimum necessary parameters

Verification:

Test the remediated policy against the incident scenario to confirm it would have prevented the failure
Run the full regression test suite to verify the fix does not create new gaps
Document the policy change with explicit reference to the incident
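The first two verification steps can be automated by replaying recorded actions through the policy. In this sketch the policy format (a list of deny rules with `tool` and `path_prefix` keys) and the `evaluate` and `replay` helpers are assumptions made for illustration; a real action gating system will have its own rule language.

```python
def evaluate(policy, action):
    """Return "deny" if any rule matches the action, otherwise "allow"."""
    for rule in policy:
        same_tool = rule["tool"] == action["tool"]
        if same_tool and action["path"].startswith(rule["path_prefix"]):
            return "deny"
    return "allow"


def replay(policy, actions):
    """Evaluate each recorded action against the policy, in order."""
    return [evaluate(policy, action) for action in actions]


# Hypothetical incident: the agent overwrote a system file (missing policy).
incident_actions = [
    {"tool": "read_file", "path": "/workspace/notes.txt"},
    {"tool": "write_file", "path": "/etc/hosts"},
]
# Regression cases: legitimate actions that must remain allowed.
regression_actions = [
    {"tool": "write_file", "path": "/workspace/report.md"},
    {"tool": "read_file", "path": "/etc/hostname"},
]

old_policy = []  # no rule covered the dangerous write
new_policy = [{"tool": "write_file", "path_prefix": "/etc/"}]
```

Replaying `incident_actions` against `new_policy` should deny the write to `/etc/hosts`, while `regression_actions` should still be allowed; if either check fails, the remediation is too loose or too broad.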

Phase 5: Recovery

Objective: Restore normal operations and repair damage.

Restore modified files, databases, and configurations from backups
Rotate any potentially exposed credentials
Re-enable the agent with the remediated configuration
Monitor the agent closely for 24-48 hours post-recovery
Close the incident with a documented post-mortem

Tooling Requirements

Effective incident response requires:

Comprehensive audit logs that capture full action details and reasoning context
Session control to halt agents immediately
Policy management to update and deploy policy changes quickly

SafeClaw supports the containment and analysis phases through its session management and audit logging capabilities. Hash-chained audit logs provide tamper-evident records essential for timeline reconstruction, and the configurable policy engine enables rapid remediation through policy updates. The SafeClaw knowledge base includes guidance on leveraging these capabilities for incident response.
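To show why hash chaining makes a log tamper-evident, here is a generic sketch using SHA-256 from Python's standard library. It is not SafeClaw's actual log format or API, which this document does not describe; the entry layout, the `GENESIS` constant, and the function names are all assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with a canonical payload encoding."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain, payload):
    """Append a payload to the chain, linking it to the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    })


def verify_chain(chain):
    """Return the index of the first broken entry, or None if the chain is intact."""
    prev_hash = GENESIS
    for index, entry in enumerate(chain):
        expected = entry_hash(prev_hash, entry["payload"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return index
        prev_hash = entry["hash"]
    return None
```

Because each hash covers the previous one, editing or deleting an earlier entry invalidates every entry after it, which is what lets a reconstructed timeline be trusted.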

Recommendations

Have a documented agent incident response plan before you need one

Practice containment procedures — the first 5 minutes determine the blast radius

Preserve audit logs immediately — they are your primary forensic evidence

Classify root causes accurately — the remediation strategy depends on the category

Conduct post-mortems for every incident, even minor ones — they reveal systemic issues

Conclusion

AI agent incidents will occur — the question is how quickly and effectively you respond. This framework provides a structured approach that accounts for the unique characteristics of agent failures. Organizations that invest in incident response preparation will contain incidents faster, learn from them more effectively, and build more resilient agent deployments.

This framework is based on 15RL's analysis of 42 real-world agent incidents. It is provided as open guidance for the community.