Research: Effectiveness of Secrets Detection in AI Agent Pipelines

15 Research Lab · 2026-02-13

Abstract

Secrets detection tools — designed to catch accidentally committed credentials in source code repositories — are increasingly deployed in AI agent pipelines to prevent credential exposure. 15 Research Lab evaluated the effectiveness of five popular secrets detection tools against a corpus of agent-generated credential exposures. Our findings indicate that traditional secrets scanners miss a significant percentage of agent-specific exposure patterns, highlighting the need for agent-aware detection mechanisms.

Background

Traditional secrets detection operates by scanning text (typically source code or configuration files) for patterns matching known credential formats: AWS access keys, GitHub tokens, database connection strings, and similar. These tools were designed for the CI/CD use case: catching developers who accidentally commit credentials.
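This line-oriented, pattern-based approach can be sketched in a few lines of Python. The rules below are illustrative only: the AWS access key ID and GitHub personal access token prefixes are publicly documented, the connection-string rule is simplified, and production scanners ship hundreds of such patterns.

```python
import re

# Illustrative rules only; real scanners ship far larger rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "postgres_uri": re.compile(r"postgres(?:ql)?://\S+:\S+@\S+"),
}

def scan_line(line: str) -> list[str]:
    """Return the names of credential patterns matched in one line of text."""
    return [name for name, rx in PATTERNS.items() if rx.search(line)]

def scan_text(text: str) -> list[tuple[int, str]]:
    """Line-by-line scan over a file body, mirroring how CI/CD scanners operate."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name in scan_line(line):
            hits.append((lineno, name))
    return hits
```

The line-by-line structure is the key design assumption: it works well for source files, which is exactly the assumption that breaks down for agent output.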

AI agent pipelines present a different challenge. Agents expose credentials through diverse channels — tool call parameters, conversation context, generated code, log entries, and API requests — many of which traditional scanners are not designed to process.
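To make the channel difference concrete, here is a hypothetical tool call record (the tool name, fields, and token are invented for illustration) in which a synthetic credential sits inside a nested JSON parameter rather than a committed file, plus a helper that walks every string value in such a structure:

```python
# A hypothetical agent tool call record (tool name and fields invented for
# illustration). The synthetic token lives in a nested JSON parameter, not
# in a committed file, so a repository-oriented scanner never sees it.
tool_call = {
    "tool": "http_request",
    "arguments": {
        "url": "https://api.internal.example/v1/export",
        "headers": {"Authorization": "Bearer svc_3f9a7c21d4"},  # synthetic
    },
}

def iter_string_values(obj):
    """Yield every string value in a nested JSON-like structure."""
    if isinstance(obj, str):
        yield obj
    elif isinstance(obj, dict):
        for v in obj.values():
            yield from iter_string_values(v)
    elif isinstance(obj, list):
        for v in obj:
            yield from iter_string_values(v)

# Scanning agent output means walking structures like this, not files.
strings = list(iter_string_values(tool_call))
```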

Methodology

We constructed a corpus of 500 agent-generated credential exposures across five categories:

  • Standard format credentials (AWS keys, GitHub tokens, API keys with recognizable prefixes)
  • Non-standard format credentials (custom tokens, internal API keys, base64-encoded secrets)
  • Partial credentials (fragments of keys in separate tool calls, split across messages)
  • Contextual credentials (connection strings embedded in natural language, credentials in error messages)
  • Encoded credentials (URL-encoded, base64, hex-encoded secrets in tool parameters)

We evaluated five secrets detection tools against this corpus.
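The five categories can be illustrated by deriving variants of a single synthetic secret. The sketch below uses AWS's documented example access key ID and an invented connection string; no real credentials appear.

```python
import base64
import urllib.parse

SECRET = "AKIAIOSFODNN7EXAMPLE"  # AWS's documented example key ID, not a real key
CONN = "postgres://svc:S3cr3t!@db.internal:5432/app"  # invented connection string

variants = {
    "standard": SECRET,
    "base64": base64.b64encode(SECRET.encode()).decode(),
    "hex": SECRET.encode().hex(),
    "url_encoded": urllib.parse.quote(CONN, safe=""),
    # Split across two separate tool calls or messages:
    "partial": [SECRET[:10], SECRET[10:]],
    # Embedded in natural language, e.g. an error message:
    "contextual": f"the key {SECRET} was rejected by the upstream API",
}
```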

Results

Standard Format Credentials

All five tools performed well on standard-format credentials, with detection rates ranging from 87% to 96%. This is expected — these tools are optimized for the most common credential formats.

Non-Standard Format Credentials

Detection rates dropped significantly: 34% to 58%. Custom tokens without recognizable prefixes and internal API keys that do not match known patterns consistently evaded detection.

Partial Credentials

This category proved most challenging. Detection rates ranged from 8% to 19%. When an agent splits a credential across multiple tool calls or messages, no tested tool correlated the fragments into a detected exposure.
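This failure mode is easy to reproduce: a standard pattern matches the whole key but not its halves, so per-message scanning finds nothing while a session-wide view would. A toy demonstration, using a simplified AWS key pattern and AWS's documented example key ID:

```python
import re

AWS_RX = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

key = "AKIAIOSFODNN7EXAMPLE"      # AWS's documented example key ID
fragments = [key[:10], key[10:]]  # emitted in two separate tool calls

# Scanning each message independently finds nothing...
per_message_hits = [bool(AWS_RX.search(f)) for f in fragments]

# ...but a correlated view over the whole session would match.
session_hit = bool(AWS_RX.search("".join(fragments)))
```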

Contextual Credentials

Credentials embedded in natural language context were detected at rates of 41% to 63%. Tools that relied purely on regex pattern matching performed worst; tools with entropy analysis performed better but still missed credentials obscured by surrounding text.
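Entropy analysis works because random-looking tokens have a flatter character distribution than prose. A minimal Shannon entropy check (the example strings and any threshold choice are illustrative) looks like this:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits per character of the string's empirical character distribution."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A random-looking token scores higher than English prose of similar length,
# which is what lets entropy-based tools flag unfamiliar credential formats.
token = "q9Zr2LxVbN8mKpW4TgHdYcE7"
prose = "please update the deployment notes"
```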

Encoded Credentials

Detection rates for encoded credentials ranged from 22% to 47%. Base64-encoded credentials were detected more often than URL-encoded or hex-encoded variants.
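A decode-then-rescan pass closes much of this gap: try plausible decodings of each string and re-apply the patterns to every result. A sketch, again using a simplified AWS key pattern:

```python
import base64
import re
import urllib.parse

AWS_RX = re.compile(r"AKIA[0-9A-Z]{16}")

def candidate_decodings(s: str):
    """Yield the string itself plus plausible decodings of it."""
    yield s
    yield urllib.parse.unquote(s)
    try:
        yield base64.b64decode(s, validate=True).decode("utf-8", "ignore")
    except ValueError:  # binascii.Error subclasses ValueError
        pass
    try:
        yield bytes.fromhex(s).decode("utf-8", "ignore")
    except ValueError:
        pass

def scan_with_decoding(s: str) -> bool:
    """True if any decoding of the string matches a credential pattern."""
    return any(AWS_RX.search(d) for d in candidate_decodings(s))
```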

Summary Table

| Credential Type | Best Tool | Worst Tool | Average Detection |
|---|---|---|---|
| Standard Format | 96% | 87% | 92% |
| Non-Standard Format | 58% | 34% | 44% |
| Partial | 19% | 8% | 13% |
| Contextual | 63% | 41% | 51% |
| Encoded | 47% | 22% | 34% |
| Overall | 69% | 48% | 57% |

The 57% overall detection rate means that traditional secrets scanners miss over 40% of agent-generated credential exposures.

Why Traditional Tools Fall Short

The core issue is that traditional secrets detection was designed for a fundamentally different data format. Source code and configuration files are structured, predictable, and can be scanned line by line. AI agent output is unstructured, context-dependent, and spread across multiple channels (tool calls, messages, logs, generated files).

Specific gaps include:

  • No correlation across messages or tool calls, so credentials split into fragments go undetected
  • Pattern rules tuned to well-known prefixes, which miss custom and internal token formats
  • No decoding pass, so base64-, hex-, and URL-encoded secrets slip through
  • Line-oriented scanning that struggles with credentials embedded in natural language or error messages

Improving Detection for Agent Pipelines

Our research suggests three approaches to improving secrets detection effectiveness in agent environments:

1. Agent-Aware Scanning: Build or configure scanners that understand agent output formats — tool call JSON structures, conversation context formats, and log schemas. This alone improved detection by 18% in our testing.

2. Pre-Execution Interception: Rather than scanning agent output after generation, intercept and scan tool call parameters before they are executed. This catches credentials before they leave the agent boundary. SafeClaw implements this pre-execution approach as part of its action gating — tool calls are evaluated (including for credential content) before they reach the target system. This is fundamentally more effective than post-hoc scanning because it prevents exposure rather than detecting it after the fact. Details on credential-aware policy configuration are available in the SafeClaw knowledge base.

3. Multi-Channel Correlation: Implement detection that correlates data across all agent output channels — conversation, tool calls, logs, and generated files — to catch partial and distributed credential exposures.
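A pre-execution gate (approach 2) can be sketched as follows. This is an illustrative design, not SafeClaw's actual API: the class and function names are invented, and the two patterns are a small sample of what a real gate would check.

```python
import re

# Hypothetical pre-execution gate; invented names, not SafeClaw's API.
CREDENTIAL_RXS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),        # AWS access key ID
    re.compile(r"ghp_[A-Za-z0-9]{36}"),     # GitHub personal access token
]

class BlockedToolCall(Exception):
    """Raised when a tool call is stopped before execution."""

def iter_strings(obj):
    """Yield every string value in a nested JSON-like argument structure."""
    if isinstance(obj, str):
        yield obj
    elif isinstance(obj, dict):
        for v in obj.values():
            yield from iter_strings(v)
    elif isinstance(obj, list):
        for v in obj:
            yield from iter_strings(v)

def gate(tool_name: str, arguments: dict, execute):
    """Run `execute` only if no argument string matches a credential pattern."""
    for s in iter_strings(arguments):
        if any(rx.search(s) for rx in CREDENTIAL_RXS):
            raise BlockedToolCall(f"credential-like value in call to {tool_name}")
    return execute(tool_name, arguments)
```

Because the check runs before the call leaves the agent boundary, a blocked call is never sent; this is the property that post-hoc scanning cannot provide.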

Recommendations

  • Do not rely solely on traditional secrets scanners for AI agent pipelines
  • Implement pre-execution credential scanning at the tool call level
  • Configure scanners for agent-specific output formats including tool call JSON and conversation context
  • Monitor for encoded and partial credential exposure which traditional tools consistently miss
  • Combine pre-execution interception with post-hoc scanning for defense in depth

Conclusion

Traditional secrets detection tools provide valuable but incomplete coverage for AI agent pipelines. Their effectiveness drops from 92% for standard credentials to just 13% for partial credential exposures. Organizations relying on these tools for agent safety have a significant blind spot that requires agent-aware detection mechanisms to address.

All credentials in our test corpus were synthetic. No real credentials were exposed during this research.