Security

Securing AI Agents: MCP Tool Poisoning Risks

3 min read

Summary

Microsoft Incident Response warns that as AI agents move from reading content to taking actions, poisoned Model Context Protocol (MCP) tool metadata can silently redirect agent behavior and expose sensitive data. The guidance outlines how to detect, contain, and prevent this emerging supply chain risk using controls across Copilot Studio, Entra, Purview, Defender, and Sentinel.

Need help with Security?Talk to an Expert

Securing AI agents against MCP tool poisoning

Introduction

As enterprise AI agents evolve from passive assistants into tools that can act on behalf of users, the security stakes are rising quickly. Microsoft Incident Response has highlighted a growing attack pattern where malicious changes to Model Context Protocol (MCP) tool metadata can manipulate an agent into taking unintended actions, including exposing sensitive business data.

This matters for organizations building agents with Microsoft 365 Copilot, Copilot Studio, and Azure AI Foundry, especially where external tools and third-party MCP servers are part of production workflows.

What’s new

Microsoft’s latest guidance focuses on MCP tool poisoning, an attack that targets the agentic AI supply chain rather than a vulnerability in Copilot itself.

How the attack works

  • A trusted third-party MCP tool is updated with malicious instructions hidden inside its natural-language description.
  • The agent reads that metadata as part of its decision-making context.
  • Without visible warning, the agent performs extra steps beyond the user’s request.
  • Sensitive data can then be sent to an attacker-controlled endpoint through an otherwise approved tool call.

Microsoft illustrates this with a finance workflow in Copilot Studio, where an invoice validation tool is modified to quietly collect and forward summaries of unpaid invoices.

Why it matters for IT and security teams

The key risk is the trust boundary between agents and external tools. Even when permissions, allowlists, and connectors appear valid, poisoned metadata can alter agent behavior in production.

For administrators, this means:

  • Tool descriptions should be treated like system prompts
  • MCP servers must be governed as supply chain dependencies
  • Runtime monitoring is needed for suspicious tool use and outbound activity
  • Human approval may be necessary for high-impact actions

Microsoft maps several controls to this threat:

  • Govern the supply chain: Maintain an allowlist of approved MCP publishers and servers, and avoid broad “Allow all” MCP access.
  • Inspect tool metadata: Use Prompt Shields in Azure AI Content Safety and Defender for Cloud AI workload protection to inspect prompts, tool outputs, and metadata.
  • Guard sensitive actions: Apply Microsoft Purview DLP to inspect outbound tool parameters and block sensitive data exfiltration.
  • Use identity controls: Assign agents a non-human identity with Microsoft Entra Agent ID and enforce Conditional Access.
  • Correlate telemetry: Forward MCP telemetry to Microsoft Sentinel, review Purview audit logs, and watch for new endpoints in Defender for Cloud Apps.

Next steps

Organizations using agentic AI should review all production MCP integrations now. Start by inventorying approved MCP servers, requiring change review for metadata updates, and adding human-in-the-loop approvals for critical workflows such as finance, external sharing, and account changes.

The broader takeaway is clear: as AI tools move from reading to acting, security controls must extend beyond prompts and permissions to the full agent supply chain.

Need help with Security?

Our experts can help you implement and optimize your Microsoft solutions.

Talk to an Expert

Stay updated on Microsoft technologies

AI agentsMCPCopilot StudioMicrosoft Securitydata exfiltration

Related Posts

Security

Microsoft Security June 2026: Key Updates for IT

Microsoft’s June 2026 security updates introduce new protections for AI agents, stronger identity recovery in Entra, expanded multicloud coverage in Defender for Cloud, and more flexible reporting in Purview. These changes matter for IT and security teams because they improve visibility, speed remediation, and help protect identities, data, endpoints, and cloud workloads across hybrid environments.

Security

Malicious Chromium Extension Hijacks Search via AI Branding

Microsoft Threat Intelligence uncovered a malicious Chromium extension that spoofed Perplexity AI branding to intercept browser searches and search suggestions through attacker-controlled infrastructure. The finding matters because it shows how threat actors are using trusted AI brands and browser extension permissions to capture user input, redirect traffic, and increase privacy and security risk in enterprise environments.

Security

Node.js Hospitality Phishing Campaign Hits Hotel Staff

Microsoft Threat Intelligence has detailed an active phishing campaign targeting hospitality organizations with photo-themed ZIP files that deliver a Node.js implant for persistence. The campaign matters because it combines trusted-service abuse, PowerShell obfuscation, registry persistence, and non-standard C2 traffic to evade detection and potentially stage follow-on attacks.

Security

Microsoft Intune Named a Leader in Forrester Wave

Microsoft says it has been named a Leader in The Forrester Wave for Endpoint Management Platforms, Q2 2026, highlighting Intune’s integrated approach to endpoint management, security, identity, and AI governance. The announcement matters for IT teams because Microsoft is expanding bundled Intune capabilities, adding Linux support, and positioning Intune as a central policy layer for managing both devices and AI agents.

Security

Microsoft CNAPP Evolution: Unified Cloud Risk Focus

Microsoft says the CNAPP market is moving beyond basic visibility and compliance toward unified, context-aware cloud risk operations. The update highlights how Microsoft Defender for Cloud correlates posture, identity, data, and runtime signals to help security teams prioritize exploitable risks across multicloud and AI-driven environments.

Security

StealC and Amadey Threats: Microsoft Disrupts C2

Microsoft detailed how the StealC infostealer and Amadey malware loader fuel credential theft, account takeover, and downstream ransomware attacks. The company also announced a coordinated disruption with Europol and partners to take down more than 200 related command-and-control domains and IPs, giving defenders new insight into how these threats operate and how to respond.