Security

Agentic AI Failure Modes Taxonomy Updated by Microsoft

3 min read

Summary

Microsoft has updated its taxonomy of failure modes in agentic AI systems after a year of red teaming against real-world deployments. The v2.0 framework adds seven new risk categories and expanded mitigations, giving security teams a more practical model for assessing agentic AI threats such as MCP/plugin abuse, goal hijacking, and session context contamination.

Need help with Security?Talk to an Expert

Introduction

Microsoft has released a major update to its taxonomy of failure modes in agentic AI systems, reflecting lessons learned from 12 months of red teaming. For security leaders, architects, and IT administrators evaluating AI agents, this matters because the threat model is evolving quickly as plugins, MCP integrations, and computer-use agents move into production.

The new v2.0 taxonomy is more than a theoretical framework. It is based on observed attack patterns in deployed environments and highlights where existing controls are falling short.

What’s new in the updated taxonomy

Microsoft added seven new failure mode categories:

  • Agentic supply chain compromise: Malicious instructions delivered through plugins, MCP servers, prompt templates, or third-party integrations.
  • Goal hijacking: Adversarial instructions subtly redirect an agent’s objective without fully compromising it.
  • Inter-agent trust escalation: A compromised agent abuses weak identity or permission checks in multi-agent workflows.
  • Computer Use Agent visual attack: GUI-based agents are manipulated through hidden or adversarial visual content.
  • Session context contamination: Early session inputs bias later reasoning across multi-step tasks.
  • MCP/plugin abuse: Tool description poisoning, server-side instruction injection, and cross-server override attacks.
  • Capability/architecture disclosure: Agents expose internal prompts, schemas, tools, or approval logic that attackers can weaponize.

Key red team findings

Microsoft says several patterns appeared consistently across engagements:

  • Human-in-the-loop bypass was one of the most frequently exploited weaknesses.
  • Cross-domain prompt injection remained a reliable initial access method.
  • Memory poisoning and XPIA often worked together to persist malicious influence.
  • Zero-click attack chains were demonstrated in some cases, leading to exfiltration or lateral movement.
  • Capability disclosure often enabled deeper exploitation by turning black-box probing into white-box attacks.

Why this matters for IT and security teams

Organizations deploying agentic AI can no longer treat these systems like standard chatbots. Agents interact with tools, memory, external services, and sometimes graphical interfaces, which creates new attack paths that traditional application security models do not fully cover.

For administrators and security teams, the update reinforces that AI governance must include supply chain controls, behavioral monitoring, and stronger trust validation across tools and agents.

This quarter, teams should consider:

  • Reviewing all MCP servers, plugins, and third-party agent components as part of the software supply chain.
  • Verifying signatures, provenance, and dependency inventories for agent-connected tools.
  • Testing for prompt injection, memory poisoning, and approval bypass in red team or tabletop exercises.
  • Limiting unnecessary disclosure of system prompts, tool schemas, and internal architecture details.
  • Adding monitoring that evaluates full-session behavior, not just single prompts or events.

Microsoft’s update is a useful signal for enterprises: if you are deploying agentic AI, your security model needs to evolve just as quickly as the technology.

Need help with Security?

Our experts can help you implement and optimize your Microsoft solutions.

Talk to an Expert

Stay updated on Microsoft technologies

agentic AIAI securityMicrosoft Securityred teamingMCP

Related Posts

Security

Red Hat npm Miasma Attack Hits CI/CD Supply Chains

Microsoft Threat Intelligence uncovered a large-scale npm supply chain attack involving trojanized packages under the @redhat-cloud-services scope. The campaign abused a compromised CI/CD publishing workflow to deliver credential-stealing malware targeting GitHub, npm, AWS, Azure, GCP, Kubernetes, and developer systems, making it especially relevant for security teams and DevOps administrators.

Security

Microsoft Build 2026 Security: Code, Agents, Models

At Microsoft Build 2026, Microsoft announced new security capabilities to protect code, AI agents, and models across the development lifecycle. Highlights include the expanded preview of MDASH for exploitability-focused vulnerability discovery and general availability of Microsoft Defender integration with GitHub Code Security to help teams prioritize and remediate real risks faster.

Security

npm Dependency Confusion Attack Targets Developer Environments

Microsoft Threat Intelligence uncovered 33 malicious npm packages that abused dependency confusion to impersonate internal corporate packages and silently profile developer systems during installation. The campaign matters because it targets developer workstations and CI/CD environments, creating a foothold for potential follow-on supply chain attacks.

Security

Microsoft Defender Named a 2026 Endpoint Leader

Microsoft says it has been named a Leader in the 2026 Gartner Magic Quadrant for Endpoint Protection for the seventh consecutive time. The announcement highlights recent Microsoft Defender for Endpoint enhancements, including attack disruption, custom telemetry, simplified onboarding, sovereign-ready deployment options, and protection for local AI agents.

Security

Typosquatted npm Packages Steal Cloud and CI/CD Secrets

Microsoft has uncovered an active npm supply chain attack in which 14 typosquatted packages stole AWS credentials, HashiCorp Vault tokens, GitHub Actions data, and npm publish tokens during installation. The campaign matters because it targets developer and build environments, creating risk of cloud lateral movement, CI/CD compromise, and downstream software supply chain attacks.

Security

The Gentlemen Ransomware: Self-Propagating Go Threat

Microsoft Threat Intelligence has published a deep technical analysis of The Gentlemen ransomware, a Go-based ransomware-as-a-service threat that combines strong file encryption with aggressive self-propagation. The research matters for defenders because the malware can rapidly spread across local systems and network shares, increasing the blast radius of a single compromise.