
Microsoft Foundry: Managing AI Models, Cost, Quality


Summary

Microsoft Foundry is expanding its model ecosystem and operational tooling to help developers manage AI applications across selection, evaluation, optimization, and production operations. The update includes general availability of Fireworks AI on Microsoft Foundry, giving teams more model choice through a single Azure endpoint while improving cost control, governance, and lifecycle management.


Introduction

Building an AI prototype is easier than ever, but running AI in production is a different challenge. Microsoft Foundry is positioning itself as a unified Azure platform for selecting, evaluating, optimizing, and operating AI models at scale, helping teams balance quality, latency, safety, and cost.

What’s new in Microsoft Foundry

The latest update focuses on model choice and operational discipline for production AI workloads.

  • Fireworks AI on Microsoft Foundry is now generally available
    • Developers can access production-grade open model inference through a single Azure endpoint.
    • The service includes enterprise SLAs and zero-setup onboarding.
  • Expanded model ecosystem
    • Foundry now supports a broader mix of Microsoft AI models, partner models, open-source models, custom models, and post-trained variants.
  • Model-agnostic operations
    • Teams can use one workflow for selection, evaluation, deployment, and monitoring instead of stitching together separate tools.
  • Model Router support
    • Foundry can automatically route requests to the most appropriate model based on workload type, cost targets, and latency requirements.
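
To make the routing idea concrete, here is a minimal sketch of choosing a model by workload type, cost target, and latency requirement. It is not the Model Router API and does not describe how Foundry routes internally. The `ModelProfile` type, the `route` function, and the model names, prices, and latencies are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical description of one candidate model."""
    name: str
    workloads: FrozenSet[str]      # workload types the model is suited to
    cost_per_1k_tokens: float      # illustrative price, not a real rate
    p95_latency_ms: int            # illustrative latency, not a measurement


# Illustrative catalog; none of these are real Foundry model names.
CATALOG: List[ModelProfile] = [
    ModelProfile("small-fast", frozenset({"chat", "classification"}), 0.2, 300),
    ModelProfile("mid-code", frozenset({"chat", "code"}), 0.8, 900),
    ModelProfile("large-reasoning", frozenset({"chat", "code", "reasoning"}), 3.0, 2500),
]


def route(
    candidates: List[ModelProfile],
    workload: str,
    max_cost: float,
    max_latency_ms: int,
) -> Optional[ModelProfile]:
    """Return the cheapest candidate that meets every constraint, or None."""
    eligible = [
        m for m in candidates
        if workload in m.workloads
        and m.cost_per_1k_tokens <= max_cost
        and m.p95_latency_ms <= max_latency_ms
    ]
    # Prefer lower cost; break ties on lower latency.
    return min(
        eligible,
        key=lambda m: (m.cost_per_1k_tokens, m.p95_latency_ms),
        default=None,
    )
```

With this catalog, `route(CATALOG, "code", max_cost=1.0, max_latency_ms=1000)` selects `mid-code`. A request that no model can satisfy returns `None`, which the caller would need to handle, for example by relaxing a constraint.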

Why this matters for Azure teams

For IT and platform administrators supporting AI projects, the challenge is no longer just model access. The real issue is operating AI systems reliably in production.

Microsoft Foundry addresses common enterprise concerns:

  • Reducing vendor lock-in by supporting multiple model providers
  • Improving governance with repeatable evaluation and monitoring
  • Controlling costs through routing, batching, caching, and provisioned throughput
  • Supporting quality and safety validation using custom and built-in evaluators

This is especially relevant for RAG copilots, agentic workflows, and business process automation where performance, groundedness, and policy compliance matter as much as model capability.
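
As one illustration of the caching lever, the sketch below shows an application-side, exact-match response cache. It is an assumption about how a team might avoid paying twice for identical requests, not a description of any caching feature built into Foundry. `ResponseCache` and the `call_model` callable are hypothetical names; `call_model` stands in for whatever client function actually sends the request.

```python
import hashlib
from typing import Callable, Dict


class ResponseCache:
    """Exact-match cache in front of a model call (illustrative only)."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        # call_model(model, prompt) -> response text; supplied by the caller.
        self._call_model = call_model
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Collapse runs of whitespace so trivially different prompts share a key.
        normalized = " ".join(prompt.split())
        return hashlib.sha256(f"{model}\x00{normalized}".encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = self._call_model(model, prompt)
        self._store[key] = response
        return response
```

A cache like this only suits requests where a repeated answer is acceptable, such as FAQ-style lookups. It has no expiry or size limit, both of which a production version would need.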

Key operational takeaways

Organizations should treat model selection as an ongoing process, not a one-time decision.

  • Define success criteria before choosing a model
  • Test models with your own prompts, data, and expected outcomes
  • Evaluate for quality, safety, latency, throughput, and cost
  • Reassess continuously as new model versions and pricing options appear
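
The checklist above can be sketched as a small evaluation harness: success criteria are fixed first, then each candidate is run against your own test cases and measured for quality, latency, and cost. Everything here is hypothetical, including the `Criteria` and `Report` types and the `evaluate` function. The substring check is a crude stand-in for the built-in or custom evaluators a real setup would use, and `model_fn` represents any function that calls a deployed model.

```python
import time
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List, Tuple


@dataclass
class Criteria:
    """Success thresholds, defined before any model is tested."""
    min_accuracy: float
    max_avg_latency_s: float
    max_cost_per_case: float


@dataclass
class Report:
    accuracy: float
    avg_latency_s: float
    cost_per_case: float
    passed: bool


def evaluate(
    model_fn: Callable[[str], str],
    cases: List[Tuple[str, str]],   # (prompt, expected answer) pairs
    cost_per_call: float,           # assumed flat cost per request
    criteria: Criteria,
) -> Report:
    if not cases:
        raise ValueError("at least one test case is required")
    correct = 0
    latencies = []
    for prompt, expected in cases:
        start = time.perf_counter()
        answer = model_fn(prompt)
        latencies.append(time.perf_counter() - start)
        # Crude quality check: the expected text appears in the answer.
        if expected.lower() in answer.lower():
            correct += 1
    accuracy = correct / len(cases)
    avg_latency = mean(latencies)
    passed = (
        accuracy >= criteria.min_accuracy
        and avg_latency <= criteria.max_avg_latency_s
        and cost_per_call <= criteria.max_cost_per_case
    )
    return Report(accuracy, avg_latency, cost_per_call, passed)
```

Rerunning the same harness when a new model version or pricing option appears gives a like-for-like comparison against the original criteria.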

Next steps

Azure teams evaluating Microsoft Foundry should:

  1. Review whether current AI workloads need multi-model routing
  2. Build custom evaluation datasets in CSV or JSONL
  3. Identify workloads that can benefit from cost optimization features like caching or batching
  4. Assess whether Fireworks AI on Foundry fits open-model production needs under Azure governance
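
For step 2, a JSONL evaluation dataset is one JSON object per line. The sketch below writes and validates such a file using only the Python standard library. The field names `query` and `ground_truth` are assumptions for illustration; check which column names your chosen evaluators expect before adopting them.

```python
import json
from pathlib import Path
from typing import Dict, List

# Assumed field names; adjust to whatever your evaluators require.
REQUIRED_FIELDS = ("query", "ground_truth")


def write_jsonl(rows: List[Dict[str, str]], path: str) -> None:
    """Write one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


def load_jsonl(path: str) -> List[Dict[str, str]]:
    """Read a JSONL file, rejecting rows with missing or empty fields."""
    rows = []
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            row = json.loads(line)
            if not isinstance(row, dict):
                raise ValueError(f"line {line_no}: expected a JSON object")
            missing = [k for k in REQUIRED_FIELDS if not row.get(k)]
            if missing:
                raise ValueError(f"line {line_no}: missing fields {missing}")
            rows.append(row)
    return rows
```

Validating on load catches incomplete rows before they distort evaluation scores.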

For organizations scaling AI beyond pilot projects, Microsoft Foundry is becoming a key platform for operationalizing AI responsibly and efficiently on Azure.
