MCP Tool Calls Are Becoming a Security Signal for AI Agents

An AI agent is asked to prepare a customer renewal summary. To complete the task, it calls a file-reading tool, queries an account database, summarizes the result, and then invokes an email tool. Each tool call may look legitimate. The security question is whether the full sequence still looks legitimate once the calls, arguments, responses, identities, and data movement are viewed together. That is where MCP tool-call traffic becomes relevant.

Recent MCP security research is beginning to treat agent tool-call sessions as a monitorable attack surface. The central idea is straightforward: even when an agentʼs internal reasoning is opaque, its tool use leaves structured traces through tool names, arguments, responses, errors, identity context, and call order.

A May 2026 paper titled “MCPShield: Content-Aware Attack Detection for LLM Agent Tool-Call Traffic” focuses directly on this layer. It frames MCP sessions as traffic that can be modeled, inspected, and classified for abnormal behavior, including data exfiltration, malicious tool use, recursive tool injection, and suspicious tool-call sequences.

Model Context Protocol, or MCP, gives AI agents a standard way to discover and call external tools. Under the MCP tools specification, servers expose tools with names, descriptions, input schemas, optional output schemas, and annotations. Clients can list available tools through tools/list and invoke a selected tool through tools/call.

This creates a structured interaction layer between the agent and external systems. A tool call is not just a natural language instruction. It is a protocol event with a method, parameters, tool name, arguments, response content, and sometimes structured output or errors.

That matters for security because agent attacks often become visible at this boundary. A prompt injection may begin in text, a poisoned tool description may influence planning, or a compromised tool may return malicious output. But when the agent acts, those decisions often pass through tool calls.

Multiple MCP security efforts are converging on the same broad concern from different angles: tool trust, tool poisoning, information flow, registry integrity, and runtime anomaly detection. The May 2026 MCPShield paper is the most directly relevant to this article because it treats MCP tool-call traffic itself as the detection object.

The enterprise question that follows is operational: not only which tools exist, but which tools are invoked, under which identity, with what parameters, and through which chain of agent behavior.

How the Mechanism Works

At the protocol level, MCP tool-call traffic is represented through JSON RPC messages. A _tools/list request allows a client to discover available tools. A tools/call request invokes a specific tool by name and passes an arguments object. The response can include content blocks, structured content, and error indicators. This gives defenders several observable fields:

Which tool was called

What arguments were passed

What response content came back

Whether the tool returned an error

What tool was called before and after

Which agent or client initiated the call

Which user, session, or authorization context was attached

Whether data from one tool appears to influence later tool calls

The May 2026 MCPShield paper models this as a session-level detection problem. In that model, each tool call becomes a node in a graph. Edges capture call order and data-flow relationships between calls. The system then uses content-aware features, including embeddings over serialized tool arguments and responses, to classify whether a session appears benign or malicious.

The important technical shift is that detection is not based only on tool metadata.

A metadata-only view might know that an agent called _{read_file}, _{query_db}, and

send_email, but it may miss the meaning of the arguments and outputs. A content-

aware view inspects the actual strings, fields, and returned values moving through the session.

For example, a single _{send_email} call may be normal. A sequence where the agent first reads a sensitive file, then queries customer records, then sends the combined output externally is different. The detection surface is the relationship between calls, not just the presence of one dangerous tool.

A related attack path is recursive tool injection. A malicious or compromised tool response may not only return bad data, but also influence what the agent does next. In that case, the suspicious behavior may appear across the sequence: the returned content from one call shapes the arguments, destination, or tool choice in a later call.

Related work expands this picture. MCPTox focuses on tool poisoning through malicious metadata or descriptions. MindGuard proposes decision-dependence graphs based on model attention patterns to detect and attribute poisoned tool selection. The formal MCP security framework describes information flow tracking as a way to detect cross-server data exfiltration. These approaches differ, but they share a common concern: agent tool use produces observable patterns that can reveal misuse, compromise, or unsafe delegation.

Analysis

This matters now because MCP is turning agent tool use into a standardized operational layer. When agents rely on tools to read files, query systems, send messages, update records, or execute code, the tool boundary becomes one of the few places where security teams can observe concrete behavior.

The modelʼs internal reasoning may remain unavailable. The prompt that influenced the agent may be incomplete, hidden in retrieved content, or spread across multiple sources. The tool-call layer is more concrete. It shows what the agent attempted to do.

The natural interception point is the MCP client, gateway, proxy, or logging layer, depending on how the organization runs agent infrastructure. A gateway-style pattern is especially relevant because it can inspect tool calls before they reach MCP servers and inspect responses before they return to the model. That does not make the gateway a complete defense, but it gives teams a place to apply policy, preserve traces, and detect abnormal tool behavior.

Recent disclosures also show that this layer can be risky in both directions. Toolcall traffic can support monitoring, but raw tool-call logging can expose sensitive data if arguments contain credentials, secrets, customer data, or file paths. CVE

2026 42282, involving n8n-MCP, described sensitive MCP _tools/call arguments being logged in HTTP mode before redaction. That makes the same traffic useful for detection and dangerous if captured without controls. Other disclosures show how _tools/call arguments can become an execution path.

CVE 2026 44336, involving PraisonAI MCP, described path traversal through

MCP file-handling tools, where arguments passed through _tools/call could reach unsafe file operations and enable code execution through Python _.pth injection.

The security implication is practical: the tool-call layer is not just telemetry. It is part of the attack surface.

Implications for Enterprises

For security teams, MCP tool-call traffic offers a possible detection surface that is closer to system behavior than prompt logs alone. It can show which tools were invoked, what data moved through them, and whether a session followed an unusual sequence.

For platform teams, this raises observability questions. MCP traffic may need to be captured in a way that preserves enough context for investigation without storing raw secrets or regulated data. Tool names alone may be insufficient. Full arguments may be too sensitive. Structured logging, redaction, tokenization, and access controls become design decisions rather than afterthoughts.

For governance teams, tool-call sessions create a potential evidence layer. They can show which tools were available, which tools were used, whether destructive or external-facing tools were invoked, and whether tool behavior matched approved purpose. This could support review of agent access, tool approval, and incident reconstruction.

A useful enterprise trace would connect user identity, agent identity, tool identity, and action identity in the same record. It would show not only that a user started an agent session, but which agent invoked which tool, with what arguments, under what authorization context, and what response was returned.

This identity-to-action traceability matters because agent systems can blur responsibility. A user may initiate a request, but the agent chooses the intermediate actions. A tool may execute the final operation, but the decision to invoke it may come from model reasoning, tool metadata, retrieved content, or a prior tool response. Monitoring has to preserve that chain.

The strongest fit appears to be tool misuse detection. Tool-call monitoring can help identify abnormal invocation patterns, suspicious argument values, cross-tool data movement, recursive tool injection, and sequences that do not match expected task behavior. It is less effective for risks that occur before or outside the tool-call layer, such as memory poisoning, inter-agent message manipulation, or human approval manipulation.

That boundary matters. Tool-call traffic is a valuable signal, but it is not a complete view of agent security.

Risks and Open Questions

The first risk is visibility without protection. Capturing tool-call arguments and responses can improve detection, but it can also create a sensitive log store. If secrets, customer records, or internal file paths are recorded without redaction or access control, monitoring can become a new exposure path.

The second risk is false confidence. A clean-looking tool sequence does not prove the agent was safe. Some attacks influence the agentʼs reasoning or tool choice before invocation. Tool poisoning may operate through metadata. Memory poisoning may occur upstream in retrieval systems. Human-agent trust exploitation may occur in the user interface. These may not be fully visible in MCP traffic.

The third open question is evaluation quality. MCPShieldʼs May 2026 work treats tool-call detection as a session classification problem, but published research still depends heavily on available datasets, benchmark construction, and assumptions about what the detector can observe. Real enterprise traces may be messier, more repetitive, more sensitive, and harder to label than research datasets.

The fourth issue is deployment timing. The May 2026 MCPShield paper frames detection at the session level, which is useful for audit and investigation. Real-time blocking would require additional design choices, such as where the detector sits, how fast it must run, and what level of confidence is needed before interrupting an agent.

The fifth issue is chained and nested tool use. If one agent triggers another agent, or if a tool response causes a downstream MCP call, monitoring systems need enough continuity to reconstruct the session rather than treating each invocation as an isolated API event. That requires context propagation across user, agent, session, tool, authorization, and response data.

The final question is standardization. MCP defines tool discovery and invocation, and the documentation includes security guidance such as input validation, user approval, access control, timeouts, output sanitization, and audit logging. It does not yet resolve every enterprise logging, retention, tenant isolation, or detection requirement. Those choices remain with implementers and platform owners.

Blogs

White Papers

Case Studies

Blogs

White Papers

Case Studies

MCP Tool Calls Are Becoming a Security Signal for AI Agents

MCP Tool Calls Are Becoming a Security Signal for AI Agents

How the Mechanism Works

Analysis

Implications for Enterprises

Risks and Open Questions

Further Reading

Contact Us

Contact Us

Contact Us

Contact Us