By 2025-26, Microsoft had consolidated Azure ML and Azure OpenAI into Azure AI Foundry, providing one-stop access to OpenAI models alongside open-weight options such as Llama and Mistral, plus custom fine-tuning through Azure ML. Self-hosting on dedicated GPUs competes on cost and customisation.
Azure AI Foundry wins for Azure-native shops that need GPT-4o / o1 access and integration with Azure data products. Self-hosting wins on cost at scale, data residency outside Azure regions, and full customisation. A hybrid approach, Foundry for frontier models such as GPT-4o and self-hosting for bulk Llama / Mistral traffic, is a common UK enterprise pattern.
## Comparison
| Aspect | Azure AI Foundry | Self-hosted |
|---|---|---|
| Frontier (GPT-4o, o1) | Yes | No |
| Open-weight (Llama, Mistral) | Yes (per-token pricing) | Yes (flat infrastructure cost) |
| Cost at scale | Higher | Lower |
| Custom fine-tuning | Per-model limits | Full |
| Data residency | Azure regions | Anywhere |
| Ops burden | Lower | Higher |
| Azure integration | Native | External |
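The "cost at scale" row comes down to simple arithmetic: per-token managed pricing grows linearly with volume, while a dedicated GPU node is a flat monthly cost. A minimal sketch of the break-even calculation, where all prices are illustrative assumptions rather than real Azure or vendor rates:

```python
# Rough break-even sketch: per-token managed pricing vs a dedicated GPU node.
# All figures below are illustrative assumptions, not real Azure or vendor prices.

def managed_monthly_cost(tokens_per_month: float, price_per_1k_tokens: float) -> float:
    """Cost of a pay-per-token managed endpoint."""
    return tokens_per_month / 1_000 * price_per_1k_tokens

def self_hosted_monthly_cost(gpu_hourly_rate: float, hours: float = 730) -> float:
    """Flat cost of a dedicated GPU node running all month (~730 hours)."""
    return gpu_hourly_rate * hours

def break_even_tokens(price_per_1k_tokens: float, gpu_hourly_rate: float,
                      hours: float = 730) -> float:
    """Monthly token volume above which self-hosting becomes cheaper."""
    return self_hosted_monthly_cost(gpu_hourly_rate, hours) / price_per_1k_tokens * 1_000

# Example with assumed rates: $0.002 / 1k tokens managed vs a $2.50/hr GPU node.
volume = break_even_tokens(price_per_1k_tokens=0.002, gpu_hourly_rate=2.50)
print(f"Break-even: {volume:,.0f} tokens/month")
```

The real comparison also needs to account for utilisation (an idle GPU still bills) and the ops headcount in the "Ops burden" row, which this sketch ignores.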
## When to use each
- Azure AI Foundry: Azure-stack organisations, required GPT-4o / o1 access, integrated Azure data tooling
- Self-hosted: lower cost at scale, residency / sovereignty requirements, custom fine-tuning needs
- Hybrid: most enterprises; Azure for frontier models such as GPT-4o, self-hosted for bulk Llama / Mistral / Qwen workloads
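The hybrid pattern above is essentially a routing decision per request. A minimal sketch, where the backend names, model identifiers, and the routing criterion are all illustrative assumptions rather than a prescribed implementation:

```python
# Minimal sketch of the hybrid pattern: requests needing frontier capability go
# to a managed Foundry deployment, bulk traffic to a self-hosted open-weight
# model. Backend and model names here are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Route:
    backend: str   # which deployment serves the request
    model: str     # model identifier at that backend

def route_request(task: str, needs_frontier: bool) -> Route:
    """Route by capability need: frontier reasoning vs high-volume generation."""
    if needs_frontier:
        # e.g. complex reasoning, tool use, long-context synthesis
        return Route(backend="azure-ai-foundry", model="gpt-4o")
    # high-volume, latency-tolerant workloads stay on owned GPUs
    return Route(backend="self-hosted", model="llama-3.1-70b")

print(route_request("summarise contract", needs_frontier=True))
print(route_request("classify support tickets", needs_frontier=False))
```

In practice the routing criterion is rarely a boolean: teams route on task type, token budget, or data-residency tags, and the residency constraint in particular must be enforced here rather than left to the caller.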
## Verdict
For UK / EU enterprises with regulated data and Azure-stack alignment, a hybrid approach (Foundry for frontier models, self-hosted for bulk traffic) is increasingly the right pattern. Foundry alone suits ops-constrained teams; self-hosting alone wins on cost and customisation once scale justifies the ops investment.
## Bottom line
Hybrid is the default for UK enterprise. See Azure migration.