Research: Effectiveness of Secrets Detection in AI Agent Pipelines

15 Research Lab · 2026-02-13

Abstract

Secrets detection tools — designed to catch accidentally committed credentials in source code repositories — are increasingly deployed in AI agent pipelines to prevent credential exposure. 15 Research Lab evaluated the effectiveness of five popular secrets detection tools against a corpus of agent-generated credential exposures. Our findings indicate that traditional secrets scanners miss a significant percentage of agent-specific exposure patterns, highlighting the need for agent-aware detection mechanisms.

Background

Traditional secrets detection operates by scanning text (typically source code or configuration files) for patterns matching known credential formats: AWS access keys, GitHub tokens, database connection strings, and similar. These tools were designed for the CI/CD use case: catching developers who accidentally commit credentials.
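This line-oriented, pattern-based approach can be sketched in a few lines of Python. The rules below are illustrative only: the AWS access key ID and GitHub personal access token prefixes are publicly documented, the connection-string rule is simplified, and production scanners ship hundreds of such patterns.

```python
import re

# Illustrative rules only; real scanners ship far larger rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "postgres_uri": re.compile(r"postgres(?:ql)?://\S+:\S+@\S+"),
}

def scan_line(line: str) -> list[str]:
    """Return the names of credential patterns matched in one line of text."""
    return [name for name, rx in PATTERNS.items() if rx.search(line)]

def scan_text(text: str) -> list[tuple[int, str]]:
    """Line-by-line scan over a file body, mirroring how CI/CD scanners operate."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name in scan_line(line):
            hits.append((lineno, name))
    return hits
```

The line-by-line structure is the key design assumption: it works well for source files, which is exactly the assumption that breaks down for agent output.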

AI agent pipelines present a different challenge. Agents expose credentials through diverse channels — tool call parameters, conversation context, generated code, log entries, and API requests — many of which traditional scanners are not designed to process.
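To make the channel difference concrete, here is a hypothetical tool call record (the tool name, fields, and token are invented for illustration) in which a synthetic credential sits inside a nested JSON parameter rather than a committed file, plus a helper that walks every string value in such a structure:

```python
# A hypothetical agent tool call record (tool name and fields invented for
# illustration). The synthetic token lives in a nested JSON parameter, not
# in a committed file, so a repository-oriented scanner never sees it.
tool_call = {
    "tool": "http_request",
    "arguments": {
        "url": "https://api.internal.example/v1/export",
        "headers": {"Authorization": "Bearer svc_3f9a7c21d4"},  # synthetic
    },
}

def iter_string_values(obj):
    """Yield every string value in a nested JSON-like structure."""
    if isinstance(obj, str):
        yield obj
    elif isinstance(obj, dict):
        for v in obj.values():
            yield from iter_string_values(v)
    elif isinstance(obj, list):
        for v in obj:
            yield from iter_string_values(v)

# Scanning agent output means walking structures like this, not files.
strings = list(iter_string_values(tool_call))
```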

Methodology

We constructed a corpus of 500 agent-generated credential exposures across five categories:

  • Standard format credentials (AWS keys, GitHub tokens, API keys with recognizable prefixes)
  • Non-standard format credentials (custom tokens, internal API keys, base64-encoded secrets)
  • Partial credentials (fragments of keys in separate tool calls, split across messages)
  • Contextual credentials (connection strings embedded in natural language, credentials in error messages)
  • Encoded credentials (URL-encoded, base64, hex-encoded secrets in tool parameters)

We evaluated five secrets detection tools against this corpus.
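The five categories can be illustrated by deriving variants of a single synthetic secret. The sketch below uses AWS's documented example access key ID and an invented connection string; no real credentials appear.

```python
import base64
import urllib.parse

SECRET = "AKIAIOSFODNN7EXAMPLE"  # AWS's documented example key ID, not a real key
CONN = "postgres://svc:S3cr3t!@db.internal:5432/app"  # invented connection string

variants = {
    "standard": SECRET,
    "base64": base64.b64encode(SECRET.encode()).decode(),
    "hex": SECRET.encode().hex(),
    "url_encoded": urllib.parse.quote(CONN, safe=""),
    # Split across two separate tool calls or messages:
    "partial": [SECRET[:10], SECRET[10:]],
    # Embedded in natural language, e.g. an error message:
    "contextual": f"the key {SECRET} was rejected by the upstream API",
}
```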

Results

Standard Format Credentials

All five tools performed well on standard-format credentials, with detection rates ranging from 87% to 96%. This is expected — these tools are optimized for the most common credential formats.

Non-Standard Format Credentials

Detection rates dropped significantly: 34% to 58%. Custom tokens without recognizable prefixes and internal API keys that do not match known patterns consistently evaded detection.

Partial Credentials

This category proved most challenging. Detection rates ranged from 8% to 19%. When an agent splits a credential across multiple tool calls or messages, no tested tool correlated the fragments into a detected exposure.
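This failure mode is easy to reproduce: a standard pattern matches the whole key but not its halves, so per-message scanning finds nothing while a session-wide view would. A toy demonstration, using a simplified AWS key pattern and AWS's documented example key ID:

```python
import re

AWS_RX = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

key = "AKIAIOSFODNN7EXAMPLE"      # AWS's documented example key ID
fragments = [key[:10], key[10:]]  # emitted in two separate tool calls

# Scanning each message independently finds nothing...
per_message_hits = [bool(AWS_RX.search(f)) for f in fragments]

# ...but a correlated view over the whole session would match.
session_hit = bool(AWS_RX.search("".join(fragments)))
```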

Contextual Credentials

Credentials embedded in natural language context were detected at rates of 41% to 63%. Tools that relied purely on regex pattern matching performed worst; tools with entropy analysis performed better but still missed credentials obscured by surrounding text.
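Entropy analysis works because random-looking tokens have a flatter character distribution than prose. A minimal Shannon entropy check (the example strings and any threshold choice are illustrative) looks like this:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits per character of the string's empirical character distribution."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A random-looking token scores higher than English prose of similar length,
# which is what lets entropy-based tools flag unfamiliar credential formats.
token = "q9Zr2LxVbN8mKpW4TgHdYcE7"
prose = "please update the deployment notes"
```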

Encoded Credentials

Detection rates for encoded credentials ranged from 22% to 47%. Base64-encoded credentials were detected more often than URL-encoded or hex-encoded variants.
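A decode-then-rescan pass closes much of this gap: try plausible decodings of each string and re-apply the patterns to every result. A sketch, again using a simplified AWS key pattern:

```python
import base64
import re
import urllib.parse

AWS_RX = re.compile(r"AKIA[0-9A-Z]{16}")

def candidate_decodings(s: str):
    """Yield the string itself plus plausible decodings of it."""
    yield s
    yield urllib.parse.unquote(s)
    try:
        yield base64.b64decode(s, validate=True).decode("utf-8", "ignore")
    except ValueError:  # binascii.Error subclasses ValueError
        pass
    try:
        yield bytes.fromhex(s).decode("utf-8", "ignore")
    except ValueError:
        pass

def scan_with_decoding(s: str) -> bool:
    """True if any decoding of the string matches a credential pattern."""
    return any(AWS_RX.search(d) for d in candidate_decodings(s))
```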

Summary Table

| Credential Type | Best Tool | Worst Tool | Average Detection |
|---|---|---|---|
| Standard Format | 96% | 87% | 92% |
| Non-Standard Format | 58% | 34% | 44% |
| Partial | 19% | 8% | 13% |
| Contextual | 63% | 41% | 51% |
| Encoded | 47% | 22% | 34% |
| Overall | 69% | 48% | 57% |

The 57% overall detection rate means that traditional secrets scanners miss over 40% of agent-generated credential exposures.

Why Traditional Tools Fall Short

The core issue is that traditional secrets detection was designed for a fundamentally different data format. Source code and configuration files are structured, predictable, and can be scanned line by line. AI agent output is unstructured, context-dependent, and spread across multiple channels (tool calls, messages, logs, generated files).

Specific gaps include:

  • No correlation across messages or tool calls, so credentials split into fragments go undetected
  • Pattern rules tuned to well-known prefixes, which miss custom and internal token formats
  • No decoding pass, so base64-, hex-, and URL-encoded secrets slip through
  • Line-oriented scanning that struggles with credentials embedded in natural language or error messages

Improving Detection for Agent Pipelines

Our research suggests three approaches to improving secrets detection effectiveness in agent environments:

1. Agent-Aware Scanning: Build or configure scanners that understand agent output formats — tool call JSON structures, conversation context formats, and log schemas. This alone improved detection by 18% in our testing.

2. Pre-Execution Interception: Rather than scanning agent output after generation, intercept and scan tool call parameters before they are executed. This catches credentials before they leave the agent boundary. SafeClaw implements this pre-execution approach as part of its action gating — tool calls are evaluated (including for credential content) before they reach the target system. This is fundamentally more effective than post-hoc scanning because it prevents exposure rather than detecting it after the fact. Details on credential-aware policy configuration are available in the SafeClaw knowledge base.

3. Multi-Channel Correlation: Implement detection that correlates data across all agent output channels — conversation, tool calls, logs, and generated files — to catch partial and distributed credential exposures.
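A pre-execution gate (approach 2) can be sketched as follows. This is an illustrative design, not SafeClaw's actual API: the class and function names are invented, and the two patterns are a small sample of what a real gate would check.

```python
import re

# Hypothetical pre-execution gate; invented names, not SafeClaw's API.
CREDENTIAL_RXS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),        # AWS access key ID
    re.compile(r"ghp_[A-Za-z0-9]{36}"),     # GitHub personal access token
]

class BlockedToolCall(Exception):
    """Raised when a tool call is stopped before execution."""

def iter_strings(obj):
    """Yield every string value in a nested JSON-like argument structure."""
    if isinstance(obj, str):
        yield obj
    elif isinstance(obj, dict):
        for v in obj.values():
            yield from iter_strings(v)
    elif isinstance(obj, list):
        for v in obj:
            yield from iter_strings(v)

def gate(tool_name: str, arguments: dict, execute):
    """Run `execute` only if no argument string matches a credential pattern."""
    for s in iter_strings(arguments):
        if any(rx.search(s) for rx in CREDENTIAL_RXS):
            raise BlockedToolCall(f"credential-like value in call to {tool_name}")
    return execute(tool_name, arguments)
```

Because the check runs before the call leaves the agent boundary, a blocked call is never sent; this is the property that post-hoc scanning cannot provide.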

Recommendations

  • Do not rely solely on traditional secrets scanners for AI agent pipelines
  • Implement pre-execution credential scanning at the tool call level
  • Configure scanners for agent-specific output formats including tool call JSON and conversation context
  • Monitor for encoded and partial credential exposure which traditional tools consistently miss
  • Combine pre-execution interception with post-hoc scanning for defense in depth

Conclusion

Traditional secrets detection tools provide valuable but incomplete coverage for AI agent pipelines. Their effectiveness drops from 92% for standard credentials to just 13% for partial credential exposures. Organizations relying on these tools for agent safety have a significant blind spot that requires agent-aware detection mechanisms to address.

All credentials in our test corpus were synthetic. No real credentials were exposed during this research.