Research: Cross-Agent Contamination in Multi-Tenant Systems

15 Research Lab · 2026-02-13

Abstract

Multi-tenant AI agent platforms, where multiple users or organizations share infrastructure, face a risk that single-tenant deployments do not: contamination between agent sessions. Data, instructions, or behavioral patterns from one agent session can leak into another, causing privacy violations, security breaches, and unpredictable agent behavior. 15 Research Lab identified and tested five contamination vectors across six multi-tenant agent platforms and documented the conditions under which cross-agent contamination occurs.

Contamination Defined

Cross-agent contamination occurs when information or behavior from Agent Session A influences Agent Session B without explicit authorization. This differs from traditional multi-tenant data leakage because the contamination pathway often runs through the AI model itself, not just shared infrastructure.

Contamination Vectors

Vector 1: Shared Context Window Residuals

When multiple agent sessions use the same model instance or share a context management system, residual information from one session can appear in another. In our testing across six platforms:

2 platforms exhibited direct context leakage between sequential sessions using the same model endpoint
1 platform cached system prompts that were served to incorrect sessions under high load
3 platforms showed no detectable context leakage

Where leakage occurred, it was intermittent and load-dependent: it appeared primarily when the platform was under stress, which makes it difficult to detect through routine testing.
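One plausible cause of the cached-prompt failure described above is a cache key that omits the tenant or session. The sketch below is a minimal illustration under that assumption, not a description of any tested platform. The class name `TenantScopedPromptCache` and its methods are hypothetical.

```python
from typing import Dict, Optional, Tuple


class TenantScopedPromptCache:
    """Hypothetical in-memory cache for system prompts.

    Entries are keyed by (tenant_id, session_id), so a lookup can never
    return a prompt that was stored under a different tenant or session.
    """

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}

    def put(self, tenant_id: str, session_id: str, prompt: str) -> None:
        self._store[(tenant_id, session_id)] = prompt

    def get(self, tenant_id: str, session_id: str) -> Optional[str]:
        # A miss returns None; there is no fallback to a shared entry.
        return self._store.get((tenant_id, session_id))


cache = TenantScopedPromptCache()
cache.put("tenant-a", "s1", "You are Tenant A's assistant.")
print(cache.get("tenant-b", "s1"))  # None
```

Returning `None` on a miss forces the caller to rebuild the prompt for the correct session rather than reuse whatever happens to be cached for the endpoint.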

Vector 2: Shared Tool State

When agents share tool instances (database connections, file system mounts, API clients), state changes made by one agent are visible to the next. A file created by Agent A persists in the workspace and is discovered by Agent B. A database row inserted by Agent A is queried by Agent B.

This vector affected 4 of 6 platforms. In two cases, the shared state was intentional (a feature, not a bug) but created unintended cross-tenant visibility.
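The leftover-file case can be illustrated with a per-session scratch directory that is deleted when the session ends. This is a minimal sketch using the standard library only. The helper name `tenant_workspace` is hypothetical, and a real platform would likely add filesystem permissions or sandboxing on top.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def tenant_workspace(tenant_id: str) -> Iterator[Path]:
    """Yield a private scratch directory and delete it on exit.

    Hypothetical helper: tenant_id is used only as a readable prefix and
    is assumed to contain no path separators.
    """
    with tempfile.TemporaryDirectory(prefix=f"agent-{tenant_id}-") as tmp:
        yield Path(tmp)


with tenant_workspace("tenant-a") as ws_a:
    leftover = ws_a / "records.tmp"
    leftover.write_text("synthetic data")

with tenant_workspace("tenant-b") as ws_b:
    # Tenant B starts from an empty directory of its own.
    visible = list(ws_b.iterdir())
```

After the first `with` block exits, `leftover` no longer exists on disk, so there is nothing for the next session to discover.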

Vector 3: Cached Embeddings and Retrieval

Platforms using retrieval-augmented generation (RAG) with shared vector databases exhibited contamination when embeddings from one tenant's documents appeared in another tenant's retrieval results. Tenant isolation in vector databases is not yet standard practice, and an unfiltered similarity search can surface cross-tenant results.

We observed cross-tenant retrieval contamination in 3 of 4 platforms using shared RAG infrastructure.
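One way to picture the fix is to apply the tenant filter before ranking, so that other tenants' chunks are never candidates. The sketch below uses a toy in-memory index and plain cosine similarity. `Chunk` and `search` are hypothetical names, and production vector databases expose their own filtering mechanisms, which are not shown here.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    text: str
    embedding: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(chunks: List[Chunk], tenant_id: str,
           query: Sequence[float], k: int = 3) -> List[Chunk]:
    # Filter by tenant BEFORE ranking: a closer match owned by another
    # tenant must never enter the candidate set.
    own = [c for c in chunks if c.tenant_id == tenant_id]
    own.sort(key=lambda c: cosine(c.embedding, query), reverse=True)
    return own[:k]


index = [
    Chunk("company-x", "X financial strategy", (1.0, 0.0)),
    Chunk("company-y", "Y product roadmap", (0.6, 0.8)),
]
# The query is closest to Company X's chunk, but only Y's data is eligible.
hits = search(index, "company-y", query=(1.0, 0.0))
```

Filtering after ranking (taking the top k and then dropping foreign results) would also hide other tenants' text, but it can return fewer than k results and still lets foreign data influence the ranking step.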

Vector 4: Shared Safety Policy Evaluation

When a centralized safety system evaluates actions for multiple agent sessions, the evaluation context from one session can influence decisions for another. In one platform, a policy violation by Agent A caused the safety system to enter a heightened scrutiny mode that affected Agent B's approval latency.
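The heightened-scrutiny behavior can be kept from crossing tenants by scoping that state to the tenant that triggered it. The following is a minimal sketch of that idea. `PerTenantScrutiny` and its threshold are hypothetical, and it models only the scrutiny flag, not the approval latency observed in testing.

```python
from collections import defaultdict
from typing import DefaultDict


class PerTenantScrutiny:
    """Track policy violations per tenant instead of globally."""

    def __init__(self, threshold: int = 1) -> None:
        self._threshold = threshold
        self._violations: DefaultDict[str, int] = defaultdict(int)

    def record_violation(self, tenant_id: str) -> None:
        self._violations[tenant_id] += 1

    def is_heightened(self, tenant_id: str) -> bool:
        # .get() avoids creating an entry for tenants with no violations.
        return self._violations.get(tenant_id, 0) >= self._threshold


scrutiny = PerTenantScrutiny()
scrutiny.record_violation("tenant-a")
print(scrutiny.is_heightened("tenant-a"))  # True
print(scrutiny.is_heightened("tenant-b"))  # False
```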

Vector 5: Log and Monitoring Contamination

Shared logging infrastructure can expose information across tenants when log access controls are insufficient. In our audit, 2 platforms stored multi-tenant agent logs in shared log streams without tenant-level access restrictions.
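Tenant-level segregation of a shared log stream can be sketched with Python's standard `logging` module by attaching a filter to each tenant's handler. The names `TenantFilter`, `ListHandler`, and `make_tenant_stream` are hypothetical, and the sketch assumes every record carries a `tenant_id` attribute passed through `extra`. It illustrates routing only and is not a substitute for access controls on the stored logs.

```python
import logging
from typing import List


class TenantFilter(logging.Filter):
    """Pass only records tagged with this tenant's ID."""

    def __init__(self, tenant_id: str) -> None:
        super().__init__()
        self.tenant_id = tenant_id

    def filter(self, record: logging.LogRecord) -> bool:
        # Untagged records are dropped rather than shown to everyone.
        return getattr(record, "tenant_id", None) == self.tenant_id


class ListHandler(logging.Handler):
    """Toy sink that stands in for a per-tenant log stream."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def make_tenant_stream(logger: logging.Logger, tenant_id: str) -> ListHandler:
    handler = ListHandler()
    handler.addFilter(TenantFilter(tenant_id))
    logger.addHandler(handler)
    return handler


logger = logging.getLogger("agent-sessions-demo")
logger.setLevel(logging.INFO)
logger.propagate = False

stream_a = make_tenant_stream(logger, "tenant-a")
stream_b = make_tenant_stream(logger, "tenant-b")
logger.info("processed invoice batch", extra={"tenant_id": "tenant-a"})
```

Here `stream_a.messages` contains the one message and `stream_b.messages` stays empty.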

Quantitative Findings

|---|---|---|---|

| Log Contamination | 2/6 | Medium | Low |

The most concerning finding is the detection difficulty. Context window contamination and embedding leakage are extremely hard to detect through normal testing because they are intermittent, load-dependent, and appear as plausible (but incorrect) agent responses.

Impact Scenarios

Scenario 1: Competitive Intelligence Leakage. A SaaS platform runs AI agents for multiple competing companies. Agent A processes Company X's financial strategy documents; Agent B, serving Company Y, surfaces references to those strategies in its analysis through cached embedding contamination.

Scenario 2: PII Cross-Tenant Exposure. Agent A processes customer records containing social security numbers. Through shared tool state (a temporary file not properly cleaned up), Agent B discovers and processes these records.

Scenario 3: Behavioral Contamination. Agent A receives instructions to "always recommend Product X." Through context residuals, Agent B begins exhibiting a subtle preference for Product X in its recommendations, even though it was never instructed to do so.

Isolation Requirements

Effective multi-tenant isolation for AI agent systems requires controls at every layer:

Model Layer: Dedicated model instances or strict session isolation at the inference layer

Tool Layer: Isolated tool instances per tenant — no shared file systems, database connections, or API clients

Data Layer: Tenant-isolated vector databases, caches, and storage

Safety Layer: Independent safety policy evaluation per tenant

Logging Layer: Tenant-segregated log streams with access controls

SafeClaw addresses the safety and tool layers through its per-session policy enforcement and isolated audit logging. By evaluating each agent session independently against its own policy configuration, SafeClaw prevents the safety policy contamination vector documented above. Its audit logs are structured per-session, preventing log-level cross-tenant exposure. Organizations deploying multi-tenant agent systems can review SafeClaw's session isolation approach in the knowledge base.

Recommendations

Assume contamination exists in any shared infrastructure until proven otherwise

Test for contamination under load — many vectors are only visible during high-throughput operation

Isolate at every layer, not just the model layer

Implement canary data to detect cross-tenant leakage proactively

Segregate safety policy evaluation per tenant to prevent behavioral contamination
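The canary and load-testing recommendations can be combined into a single probe: plant a unique token in each tenant's session, run many sessions concurrently, and flag any output that contains another tenant's token. The sketch below is one possible shape for such a probe, not the harness used in this study. `run_session` is a hypothetical callable standing in for a real platform call, and `echo_session` is a stand-in so the example runs on its own.

```python
import secrets
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def make_canary(tenant_id: str) -> str:
    # Random suffix so the token cannot occur in normal output by chance.
    return f"CANARY-{tenant_id}-{secrets.token_hex(8)}"


def probe_under_load(
    run_session: Callable[[str, str], str],
    tenant_ids: Sequence[str],
    rounds: int = 50,
    workers: int = 8,
) -> List[Tuple[str, str]]:
    """Return sorted (owner, leaked_to) pairs where a canary crossed tenants."""
    canaries = {t: make_canary(t) for t in tenant_ids}
    jobs = [(t, canaries[t]) for _ in range(rounds) for t in tenant_ids]

    # Run sessions concurrently to approximate high-throughput conditions.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(lambda job: run_session(*job), jobs))

    leaks = set()
    for (tenant, _), output in zip(jobs, outputs):
        for owner, token in canaries.items():
            if owner != tenant and token in output:
                leaks.add((owner, tenant))
    return sorted(leaks)


def echo_session(tenant_id: str, canary: str) -> str:
    # Stand-in for a real agent session; it only ever sees its own canary.
    return f"[{tenant_id}] processed {canary}"


print(probe_under_load(echo_session, ["tenant-a", "tenant-b"]))  # []
```

An empty result from a probe like this does not prove isolation, since intermittent leaks may not fire in a given run. Repeating the probe at different load levels narrows that gap.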

Conclusion

Cross-agent contamination is a systemic risk in multi-tenant AI agent platforms that is difficult to detect and potentially severe in impact. The AI agent ecosystem has not yet developed the tenant isolation rigor that cloud computing achieved over the past decade. Organizations operating multi-tenant agent platforms should invest in comprehensive isolation testing before any cross-tenant contamination incident forces a much more expensive response.

15 Research Lab conducted all contamination testing under controlled conditions with synthetic data. No real tenant data was used or exposed.