Azure

Fireworks AI on Microsoft Foundry for Azure Inference

3 min read

Summary

Microsoft has launched a public preview of Fireworks AI on Microsoft Foundry, bringing high-throughput, low-latency open-model inference to Azure through a single managed endpoint. It matters because enterprises can now access models like DeepSeek V3.2, gpt-oss-120b, Kimi K2.5, and MiniMax M2.5 with Azure’s governance, serverless or provisioned deployment options, and bring-your-own-weights support—making it easier to move open-model AI from experimentation into production.


Fireworks AI arrives on Microsoft Foundry

Introduction

Organizations adopting open models want more than raw performance—they need a practical way to run those models securely, govern them consistently, and move from testing to production without stitching together multiple tools. Microsoft’s new public preview of Fireworks AI on Microsoft Foundry is aimed at solving that problem by combining fast open-model inference with Azure’s enterprise management and governance capabilities.

What’s new

Microsoft Foundry now includes Fireworks AI as a public preview option for open model inference in Azure. The announcement positions Foundry as a centralized control plane for the full AI lifecycle, including model evaluation, deployment, customization, and operations.

Key updates include:

  • Public preview of Fireworks AI on Microsoft Foundry for high-throughput, low-latency open model inference
  • Access to supported open models through a single Azure endpoint in Foundry
  • Support for these models today:
    • DeepSeek V3.2
    • OpenAI gpt-oss-120b
    • Kimi K2.5
    • MiniMax M2.5
  • MiniMax M2.5 is newly added to Foundry with serverless support
  • Bring-your-own-weights (BYOW) support for quantized or fine-tuned models trained elsewhere
  • Deployment flexibility with:
    • Serverless, pay-per-token inference for rapid experimentation
    • Provisioned Throughput Units (PTUs) for predictable production performance

Microsoft also highlighted Fireworks AI’s large-scale inference capabilities, including internet-scale token processing and benchmark-leading throughput for open models.
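As a sketch of what the "single Azure endpoint" access pattern typically looks like, the snippet below assembles an OpenAI-compatible chat-completions request. The endpoint URL, header name, and deployment name are placeholders chosen for illustration — they follow common Azure AI serverless conventions but are not taken from the announcement, so verify them against your deployment's own documentation.

```python
import json
import urllib.request


def build_chat_request(endpoint: str, api_key: str, deployment: str, prompt: str):
    """Assemble (without sending) an OpenAI-compatible chat-completions
    request against a Foundry-style serverless endpoint.

    The path and ``api-key`` header follow the common Azure serverless
    convention; treat them as assumptions, not documented values.
    """
    payload = {
        "model": deployment,  # e.g. a Fireworks-hosted open model deployment
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    req = urllib.request.Request(
        url=f"{endpoint}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "api-key": api_key,  # Azure-style key header (placeholder)
        },
        method="POST",
    )
    return req, payload


# Build a request against a hypothetical deployment; nothing is sent.
req, payload = build_chat_request(
    "https://my-foundry-resource.services.ai.azure.com/models",
    "<API_KEY>",
    "deepseek-v3.2",  # hypothetical deployment name
    "Summarize the benefits of serverless inference.",
)
print(req.full_url)
print(payload["model"])
```

Because every supported model sits behind the same request shape, switching models is a matter of changing the `deployment` value rather than re-plumbing a new serving stack.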

Why this matters for IT and platform teams

For Azure administrators, AI platform teams, and enterprise architects, this reduces the operational complexity of supporting open models. Instead of building separate serving stacks or governance frameworks, teams can use Foundry as a single environment for model access, deployment, observability, and policy control.

This is especially relevant for organizations that want to:

  • Standardize on open models without vendor lock-in
  • Support custom fine-tuned models while keeping a consistent serving platform
  • Balance cost and performance across experimentation and production workloads
  • Apply enterprise governance and security controls to AI deployments in Azure
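The cost-versus-performance trade-off between serverless and PTU-based deployments can be made concrete with simple break-even math. The figures below are purely illustrative — neither the monthly provisioned cost nor the per-token rate comes from Azure pricing — but the calculation shows how a platform team might frame the decision.

```python
def breakeven_tokens_per_month(ptu_monthly_cost: float,
                               price_per_1k_tokens: float) -> float:
    """Monthly token volume above which a reserved, PTU-style
    commitment becomes cheaper than pay-per-token serverless.

    Both inputs are illustrative placeholders, not published rates.
    """
    return ptu_monthly_cost / price_per_1k_tokens * 1_000


# Hypothetical numbers: $2,000/month provisioned vs $0.50 per 1K tokens.
threshold = breakeven_tokens_per_month(2000.0, 0.50)
print(f"{threshold:,.0f} tokens/month")  # 4,000,000 tokens/month
```

Workloads well below the threshold favor serverless experimentation; sustained production traffic above it is where provisioned throughput tends to pay off, alongside its predictable latency.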

Admins and AI teams should:

  1. Review the Microsoft Foundry model catalog for Fireworks-hosted models.
  2. Evaluate whether serverless or PTU-based deployments best fit workload requirements.
  3. Test BYOW scenarios if your organization already has fine-tuned or quantized open models.
  4. Validate governance, observability, and operational requirements before production rollout.
  5. Track Microsoft’s additional guidance on model customization and lifecycle management in Foundry.

Fireworks AI on Microsoft Foundry gives Azure customers a stronger path to operationalizing open models at scale—without sacrificing performance, flexibility, or enterprise control.


Tags: Azure · Microsoft Foundry · Fireworks AI · open models · AI inference

Microsoft has introduced GPT-5.4 in Microsoft Foundry, positioning it as a production-focused AI model for enterprise workloads that need stronger instruction following, longer context handling, faster latency, and more reliable tool and file orchestration. The update matters because it moves AI agents closer to dependable real-world business automation, while the new GPT-5.4 Pro variant targets complex analytical and decision-heavy workflows that demand greater stability and completeness.