Agentic AI Failure Modes Taxonomy Updated by Microsoft

June 4, 20263 min read

Summary

Microsoft has updated its taxonomy of failure modes in agentic AI systems after a year of red teaming against real-world deployments. The v2.0 framework adds seven new risk categories and expanded mitigations, giving security teams a more practical model for assessing agentic AI threats such as MCP/plugin abuse, goal hijacking, and session context contamination.

Introduction

Microsoft has released a major update to its taxonomy of failure modes in agentic AI systems, reflecting lessons learned from 12 months of red teaming. For security leaders, architects, and IT administrators evaluating AI agents, this matters because the threat model is evolving quickly as plugins, MCP integrations, and computer-use agents move into production.

The new v2.0 taxonomy is more than a theoretical framework. It is based on observed attack patterns in deployed environments and highlights where existing controls are falling short.

What’s new in the updated taxonomy

Microsoft added seven new failure mode categories:

Agentic supply chain compromise: Malicious instructions delivered through plugins, MCP servers, prompt templates, or third-party integrations.
Goal hijacking: Adversarial instructions subtly redirect an agent’s objective without fully compromising it.
Inter-agent trust escalation: A compromised agent abuses weak identity or permission checks in multi-agent workflows.
Computer Use Agent visual attack: GUI-based agents are manipulated through hidden or adversarial visual content.
Session context contamination: Early session inputs bias later reasoning across multi-step tasks.
MCP/plugin abuse: Tool description poisoning, server-side instruction injection, and cross-server override attacks.
Capability/architecture disclosure: Agents expose internal prompts, schemas, tools, or approval logic that attackers can weaponize.

Key red team findings

Microsoft says several patterns appeared consistently across engagements:

Human-in-the-loop bypass was one of the most frequently exploited weaknesses.
Cross-domain prompt injection remained a reliable initial access method.
Memory poisoning and XPIA often worked together to persist malicious influence.
Zero-click attack chains were demonstrated in some cases, leading to exfiltration or lateral movement.
Capability disclosure often enabled deeper exploitation by turning black-box probing into white-box attacks.

Why this matters for IT and security teams

Organizations deploying agentic AI can no longer treat these systems like standard chatbots. Agents interact with tools, memory, external services, and sometimes graphical interfaces, which creates new attack paths that traditional application security models do not fully cover.

For administrators and security teams, the update reinforces that AI governance must include supply chain controls, behavioral monitoring, and stronger trust validation across tools and agents.

Recommended next steps

This quarter, teams should consider:

Reviewing all MCP servers, plugins, and third-party agent components as part of the software supply chain.
Verifying signatures, provenance, and dependency inventories for agent-connected tools.
Testing for prompt injection, memory poisoning, and approval bypass in red team or tabletop exercises.
Limiting unnecessary disclosure of system prompts, tool schemas, and internal architecture details.
Adding monitoring that evaluates full-session behavior, not just single prompts or events.

Microsoft’s update is a useful signal for enterprises: if you are deploying agentic AI, your security model needs to evolve just as quickly as the technology.

Agentic AI Failure Modes Taxonomy Updated by Microsoft

Introduction

What’s new in the updated taxonomy

Key red team findings

Why this matters for IT and security teams

Recommended next steps

Need help with Security?

Related Posts

Microsoft Black Hat 2026: AI and Supply Chain Defense

ACR Stealer Campaigns: ClickFix Threats Rise

AI Agent Least Privilege: Identity and RBAC Guide

AsyncAPI npm Supply Chain Attack: Import-Time Malware

Defender Experts Adds Threat Intelligence and MDR

Salesforce OAuth Abuse: Microsoft Guidance on ShinyHunters