15RL Framework: AI Agent Safety Maturity Model

15 Research Lab · 2026-02-13

Abstract

Organizations deploying AI agents need a structured way to assess their current safety posture and plan improvements. 15 Research Lab developed the AI Agent Safety Maturity Model (AASMM) — a five-level framework that describes progressive stages of safety capability from ad-hoc practices to optimized, metrics-driven safety programs. This model draws on our research across 150+ agent deployments and adapts established maturity model methodology (CMMI, SAMM) for the specific challenges of AI agent safety.

Why a Maturity Model?

Maturity models serve two functions: they help organizations understand where they are, and they provide a roadmap for where to go next. In AI agent safety, this is particularly valuable because:

The field is new, and organizations lack benchmarks for comparison
Safety investment competes with feature development for resources
"Good enough" safety is undefined without a framework to reference
Leadership needs quantifiable progress indicators

The Five Maturity Levels

Level 1: Initial (Ad-Hoc)

Characteristics: No formal safety practices. Agents are deployed with default configurations. Safety incidents are handled reactively. No dedicated safety ownership.

Typical practices: Basic API rate limits. Manual intervention when problems are noticed. No structured logging. Agents run with whatever permissions are convenient.

Risk profile: High. The organization relies entirely on model-level safety and luck.

Assessment indicators: No safety policies documented. No audit logs. No incident response procedures. Agent permissions not reviewed after initial setup.

Level 2: Developing (Repeatable)

Characteristics: Basic safety measures are in place and applied consistently to new deployments. There is awareness of agent safety as a concern, and some documented practices exist.

Typical practices: Deny-by-default action policies for critical operations. Basic audit logging. Defined permissions for each agent. Cost limits on agent sessions.

Risk profile: Moderate. The most common and most severe incidents are prevented, but gaps remain in monitoring and response.

Assessment indicators: Written safety policies exist. Audit logs are generated (though not regularly reviewed). Agent permissions are documented. At least one person is responsible for agent safety.
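As an illustration of two Level 2 practices, deny-by-default action policies and per-session cost limits, the following is a minimal Python sketch. It is not taken from any particular tool: the `ActionPolicy` class, the action names, and the 5.00 USD budget are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ActionPolicy:
    """Hypothetical deny-by-default policy: an action runs only if explicitly allowed."""

    allowed_actions: set[str] = field(default_factory=set)
    max_session_cost_usd: float = 5.0  # assumed per-session budget, for illustration

    def is_allowed(self, action: str, session_cost_usd: float) -> bool:
        # Deny everything once the session has reached its cost limit.
        if session_cost_usd >= self.max_session_cost_usd:
            return False
        # Anything not on the allowlist is denied, including unknown actions.
        return action in self.allowed_actions


policy = ActionPolicy(allowed_actions={"read_file", "search_docs"})
print(policy.is_allowed("read_file", session_cost_usd=1.20))    # True
print(policy.is_allowed("delete_file", session_cost_usd=1.20))  # False: not allowlisted
print(policy.is_allowed("read_file", session_cost_usd=5.00))    # False: budget exhausted
```

The key design choice is that the empty policy permits nothing: permissions have to be granted explicitly per agent, which also yields the documented permission list that the Level 2 assessment indicators ask for.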

Level 3: Defined (Standardized)

Characteristics: Safety practices are standardized across the organization. Every agent deployment follows a defined safety checklist. Incident response procedures exist and have been tested. Regular safety reviews are conducted.

Typical practices: Comprehensive action gating with configurable policies. Structured, hash-chained audit logging. Regular log review and anomaly detection. Formal incident response procedures. Safety requirements in agent deployment checklists.

Risk profile: Low-Moderate. Known risk categories are addressed. The organization can detect and respond to incidents effectively.

Assessment indicators: Standardized deployment checklists. Regular safety audits. Incident response procedures tested at least annually. Safety metrics tracked and reported.
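Hash-chained audit logging, named among the Level 3 practices, can be sketched as follows. This is one common construction and not a description of any specific product: each entry stores the SHA-256 hash of the previous entry, so altering an earlier record invalidates the chain from that point on. The function names, entry fields, and the all-zero genesis value are assumptions made for the example.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    # Canonical JSON (sorted keys) so the same event always hashes the same way.
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> None:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"event": event, "prev": prev_hash, "hash": _entry_hash(prev_hash, event)})


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash in order; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: list[dict] = []
append_entry(audit_log, {"agent": "agent-1", "action": "read_file", "decision": "allow"})
append_entry(audit_log, {"agent": "agent-1", "action": "delete_file", "decision": "deny"})
print(verify_chain(audit_log))  # True

audit_log[0]["event"]["decision"] = "deny"  # simulate tampering with an old record
print(verify_chain(audit_log))  # False
```

A chain like this only makes tampering detectable, not impossible: someone who can rewrite the whole log can also recompute every hash, so in practice the latest hash would need to be stored or published somewhere the log writer cannot modify.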

Level 4: Managed (Quantitative)

Characteristics: Safety is measured quantitatively. Metrics drive decision-making. Safety testing is integrated into CI/CD pipelines. The organization proactively identifies and addresses emerging risks.

Typical practices: Automated safety testing for every agent configuration change. Continuous monitoring with statistical anomaly detection. Safety metrics reported to leadership regularly. Red-team testing of agent deployments. Quantitative risk assessments for new use cases.

Risk profile: Low. The organization has high confidence in its ability to prevent, detect, and respond to agent safety incidents.

Assessment indicators: Safety metrics dashboards. Automated safety testing in deployment pipelines. Regular red-team exercises. Quantitative risk assessments for all new deployments.
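The framework does not prescribe a particular method for statistical anomaly detection. As one simple possibility, the sketch below flags a metric value that lies more than a chosen number of standard deviations from its historical mean (a z-score test). The metric (denied actions per hour), the sample data, and the threshold of 3.0 are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(baseline: list[float], observed: float, z_threshold: float = 3.0) -> bool:
    """Flag `observed` if it is more than `z_threshold` standard deviations from the baseline mean."""
    if len(baseline) < 2:
        return False  # not enough history to estimate spread
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # Perfectly flat history: any deviation at all is unusual.
        return observed != mu
    return abs(observed - mu) / sigma > z_threshold


# Hypothetical history: denied actions per hour for one agent.
hourly_denied_actions = [2, 3, 1, 2, 4, 3, 2, 3]
print(is_anomalous(hourly_denied_actions, 3))   # False: within normal variation
print(is_anomalous(hourly_denied_actions, 25))  # True: far outside the baseline
```

A z-score assumes roughly stable, bell-shaped behavior; metrics with strong daily cycles or heavy tails would likely need a different baseline, such as per-hour-of-day statistics or median-based measures.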

Level 5: Optimizing (Continuous Improvement)

Characteristics: Safety practices are continuously improved based on metrics, incident analysis, and emerging research. The organization contributes to the broader safety ecosystem through knowledge sharing and tool development.

Typical practices: All Level 4 practices plus: machine-learning-driven anomaly detection, automated policy adjustment based on behavioral analysis, contribution to open-source safety tooling, participation in industry safety working groups.

Risk profile: Minimal. The organization operates at the frontier of agent safety practice.

Assessment Guide

Organizations can assess their current level by evaluating five capability dimensions:

[Table: capability dimensions assessed against Levels 1 to 5; table body not preserved in the source]

Advancing Through the Levels

The jump from Level 1 to Level 2 requires the most fundamental change: adopting deny-by-default policies and basic audit logging. SafeClaw can accelerate this transition — its deny-by-default policy engine and hash-chained audit logging provide Level 2 capabilities out of the box and support advancement to Level 3 through its configurable policy framework. Organizations at Level 1 can review the SafeClaw knowledge base for practical guidance on implementing foundational safety controls.

The transition from Level 2 to Level 3 requires organizational commitment: standardizing practices, formalizing incident response, and establishing regular review cadences.

Levels 4 and 5 require significant investment in automation, analytics, and organizational culture — these are appropriate goals for organizations where AI agents are core to operations.

Recommendations

Honestly assess your current maturity level using the capability dimensions table

Target one level above your current position — jumping multiple levels at once is rarely successful

Invest in tooling that supports your target level, not just your current level

Track progress quarterly using the assessment indicators

Share your maturity journey with the community to help establish industry benchmarks
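The quarterly tracking recommended above could be supported by a small self-assessment script. The sketch below encodes the assessment indicators listed for Levels 2 to 4 as identifiers (the identifier names are paraphrases invented for this example) and assumes a staged interpretation in which a level counts as reached only when all of its indicators, and those of every lower level, are met. The framework text does not state this aggregation rule, so treat it as an assumption. Level 5 is omitted because no assessment indicators are listed for it.

```python
# Indicators paraphrased from the level descriptions; names are hypothetical.
INDICATORS: dict[int, list[str]] = {
    2: [
        "written_safety_policies",
        "audit_logs_generated",
        "agent_permissions_documented",
        "safety_owner_assigned",
    ],
    3: [
        "standardized_deployment_checklists",
        "regular_safety_audits",
        "incident_response_tested_annually",
        "safety_metrics_tracked",
    ],
    4: [
        "safety_metrics_dashboards",
        "automated_safety_testing_in_pipelines",
        "regular_red_team_exercises",
        "quantitative_risk_assessments",
    ],
}


def assess_level(met: set[str]) -> int:
    """Return the highest level whose indicators, and all lower levels', are met."""
    level = 1  # Level 1 is the starting point when nothing is in place
    for candidate in sorted(INDICATORS):
        if all(indicator in met for indicator in INDICATORS[candidate]):
            level = candidate
        else:
            break  # staged model: a gap at this level blocks higher levels
    return level


def missing_for_next_level(met: set[str]) -> list[str]:
    """List the indicators still needed to reach the next level (empty at Level 4)."""
    next_level = assess_level(met) + 1
    return [i for i in INDICATORS.get(next_level, []) if i not in met]


q1_status = {
    "written_safety_policies",
    "audit_logs_generated",
    "agent_permissions_documented",
    "safety_owner_assigned",
    "regular_safety_audits",
}
print(assess_level(q1_status))            # 2
print(missing_for_next_level(q1_status))  # the three unmet Level 3 indicators
```

Recording the output of `missing_for_next_level` each quarter gives a concrete gap list for the "one level above your current position" target.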

Conclusion

The AI Agent Safety Maturity Model provides a structured framework for understanding and improving organizational safety practices. Most organizations today operate at Level 1 or Level 2 — and that is a normal starting point for a nascent field. What matters is not where you start, but whether you have a plan to advance.

The AASMM framework is freely available for organizational use. 15RL welcomes feedback and case studies from organizations applying the model.