
Azure Storage 2026: AI Training, Inference, and Mission-Critical Storage

3 min read

Summary

Microsoft has published its Azure Storage 2026 roadmap, centered on upgrading the data path for AI training, fine-tuning, and inference. Highlights include larger-scale Blob storage, Azure Managed Lustre paired with NVIDIA DGX on Azure, and deeper integration with ecosystems such as Foundry, Ray, and LangChain. For enterprises, this means not only more efficient support for the sustained, highly concurrent demands of large models and agent applications, but also better performance, governance, and cost efficiency for SAP, stateful Kubernetes applications, and ultra-low-latency mission-critical workloads.


Introduction: why this matters

AI is moving from ad-hoc experimentation to always-on production, especially for inference and autonomous "agentic" workloads that generate sustained, highly concurrent access patterns. The Azure Storage 2026 roadmap focuses on connecting the end-to-end AI data flow (training → fine-tuning → inference) while improving cost, operational simplicity, and performance for traditional mission-critical systems such as SAP and for ultra-low-latency trading platforms.

What’s new (and what Microsoft is emphasizing)

1) Training at frontier scale: Blob and high-throughput data paths

  • Blob scaled accounts are highlighted as a way to scale out to hundreds of scale units per region, targeting workloads with millions of objects (common for training/fine-tuning datasets and for checkpoint/model-file management).
  • Microsoft notes that the innovations built to support OpenAI-scale operations are being opened up more broadly to enterprises.
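With millions of objects in one account, how names are distributed matters for parallel reads. A minimal illustrative sketch, assuming a hash-based prefix scheme (the prefix layout and fan-out here are hypothetical choices, not an Azure API):

```python
# Hypothetical sketch: spreading millions of dataset/checkpoint objects
# across stable prefixes so reads can be parallelized evenly.
import hashlib

NUM_PREFIXES = 256  # assumed fan-out; tune to the account's scale-out


def shard_prefix(object_name: str) -> str:
    """Derive a stable shard prefix from the object name."""
    digest = hashlib.sha256(object_name.encode()).hexdigest()
    shard = int(digest[:4], 16) % NUM_PREFIXES
    return f"shard-{shard:03d}/{object_name}"


# Checkpoint files from one training run land in spread-out prefixes.
print(shard_prefix("run-42/checkpoint-001000.pt"))
```

Because the prefix is derived from a hash of the name, the mapping is deterministic (the same object always resolves to the same path) while bulk uploads fan out across prefixes.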

2) Purpose-built storage for AI compute: Azure Managed Lustre (AMLFS)

  • Azure's collaboration with NVIDIA DGX on Azure pairs accelerated compute with Azure Managed Lustre to keep GPU clusters continuously fed with data.
  • AMLFS now includes preview support for 25 PiB namespaces and up to 512 GBps of throughput, positioned as a top-tier managed Lustre offering for large research and industrial inference scenarios (e.g., automotive, robotics).
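To put those two preview numbers in perspective, a quick back-of-envelope calculation (my own arithmetic, not from the announcement) of how long one full pass over a maximum-size namespace would take at peak throughput:

```python
# Back-of-envelope: time to read an entire 25 PiB AMLFS namespace once
# at the quoted peak of 512 GBps (treated here as decimal gigabytes/s).
PIB = 2**50                 # bytes in one pebibyte
namespace_bytes = 25 * PIB  # preview namespace ceiling
throughput_bps = 512e9      # quoted peak throughput

seconds = namespace_bytes / throughput_bps
print(f"full-namespace scan: {seconds / 3600:.1f} hours")  # ~15.3 hours
```

In other words, even at peak, a full scan of the largest namespace is a multi-hour job, which is why working sets and caching still matter for GPU pipelines.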

3) AI ecosystem integrations: faster paths from data to inference

  • Deeper integrations are planned across the AI framework ecosystem, including Microsoft Foundry, Ray/Anyscale, and LangChain.
  • Native Azure Blob integration within Foundry is positioned to help bring enterprise data into Foundry IQ for knowledge grounding, fine-tuning, and low-latency context serving, while keeping governance and security controls inside the tenant.

4) Agentic scale cloud-native apps: block storage + Kubernetes orchestration

  • Microsoft notes that agents can generate an order of magnitude more queries than human-driven applications, putting pressure on the storage/database tier.
  • Elastic SAN is described as a core building block for SaaS-style, multi-tenant architectures, providing managed block-storage pooling with guardrails.
  • Azure Container Storage (ACStor) is shifting toward a Kubernetes operator model, with a stated intent to open-source the codebase beyond the CSI drivers, simplifying stateful application development on Kubernetes.
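The "order of magnitude" point is easiest to feel with numbers. A sizing sketch with assumed inputs (the per-tenant rates and IOs-per-query below are illustrative, not Microsoft guidance):

```python
# Illustrative sizing sketch: pooled block-storage IOPS a multi-tenant
# fleet would need if agent traffic multiplies query volume by ~10x.
human_qps_per_tenant = 50   # assumed baseline queries/sec per tenant
agent_amplification = 10    # "order of magnitude" from the roadmap
ios_per_query = 4           # assumed storage IOs behind each query
tenants = 200

required_iops = tenants * human_qps_per_tenant * agent_amplification * ios_per_query
print(f"pooled IOPS to provision: {required_iops:,}")  # 400,000
```

A fleet that was comfortable at tens of thousands of IOPS under human-driven load suddenly needs hundreds of thousands, which is the scenario pooled block storage such as Elastic SAN is pitched at.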

5) Mission-critical price/performance: SAP, ANF, Ultra Disk

  • For SAP HANA, Azure's M-series updates raise disk performance targets to roughly 780K IOPS and 16 GB/s of throughput.
  • Azure NetApp Files (ANF) and Azure Premium Files remain the core shared-storage options, with TCO improvements via initiatives such as the ANF Flexible Service Level and Azure Files Provisioned v2.
  • Coming soon: an Elastic ZRS service level for ANF, delivering zone-redundant HA through synchronous replication across availability zones.
  • Ultra Disk performance is a headline item (sub-500 µs latency; up to 400K IOPS / 10 GB/s, rising to as much as 800K IOPS / 14 GB/s on Ebsv6 VMs).
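Quoted IOPS ceilings are only reachable with enough concurrency. A Little's-law sketch using the Ultra Disk figures above (the arithmetic is mine, not from the announcement):

```python
# Little's law: outstanding IOs = IOPS x per-IO latency.
# How much concurrency must an application sustain to hit the ceiling?
iops_target = 400_000   # quoted Ultra Disk IOPS ceiling
latency_s = 0.0005      # sub-500 microsecond latency, taken at the bound

queue_depth = iops_target * latency_s
print(f"outstanding IOs required: {queue_depth:.0f}")  # 200
```

A workload that issues one IO at a time will never see 400K IOPS regardless of the disk; it takes on the order of 200 in-flight IOs, so queue depth tuning matters as much as the SKU.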

Impact on IT admins and platform teams

  • For inference- and agent-heavy applications, expect architectures to put more emphasis on throughput, concurrency, and data locality.
  • Kubernetes operators, plus a potentially open-source ACStor, may change how teams standardize stateful workloads on AKS.
  • Storage selection will map more tightly to the workload: Blob for datasets/context, Lustre for GPU pipelines, Elastic SAN/Ultra Disk for high-IOPS transactional needs, and ANF for shared enterprise workloads.

Action items / next steps

  1. Map AI workloads by phase (training vs. inference vs. agentic) and align each with a storage type (Blob + AMLFS + block/shared).
  2. Review the AMLFS preview limits (25 PiB / 512 GBps) and validate which GPU pipeline bottlenecks Lustre could relieve.
  3. Evaluate Elastic SAN for multi-tenant SaaS or high-concurrency microservices that need pooled block storage.
  4. Plan for ANF Elastic ZRS where enterprise applications need consistent-performance, zone-redundant NFS.
  5. For AKS teams, track the ACStor operator and open-source updates to reduce custom stateful-storage management.


Azure Storage, Azure Blob Storage, Azure Managed Lustre, AKS, Elastic SAN
