Choosing the Right AI Model

Why model selection matters

Simple explanation

Picking an AI model is like hiring for a job — you wouldn’t hire a brain surgeon to stack shelves, and you wouldn’t hire a shelf-stacker to do surgery.

Large language models (LLMs) like GPT-4o are powerful but expensive. Small language models (SLMs) like Phi-4 are cheaper and faster but less capable. Multimodal models handle images, audio, and video — not just text. And Foundry Tools are pre-built AI services you don’t need to train at all.

The exam tests whether you can match the right model to the right task — balancing cost, speed, accuracy, and capability.

The four model categories

The four model categories in Microsoft Foundry
Feature	LLMs	SLMs	Multimodal	Foundry Tools
What they do	Complex reasoning, generation, analysis	Simpler tasks, fast inference	Process text + images + audio + video	Pre-built AI capabilities (search, OCR, speech)
Examples	GPT-4o, GPT-4.1, Llama 3.3	Phi-4, Phi-4-mini, Mistral Small	GPT-4o (vision), Llama 4	Azure AI Search, Content Understanding, Translator
Cost	Higher (more tokens, more compute)	Lower (smaller, faster)	Medium-high (depends on modalities)	Pay-per-use (no model hosting)
Best for	Agents, RAG, complex workflows	Edge devices, high-volume simple tasks	Apps that need to see, hear, and read	Structured tasks: search, OCR, translation
Deployment	Cloud (Foundry hosted or serverless)	Cloud or edge	Cloud	Managed service (no deployment needed)

When to use what — decision framework

The exam loves “which model should you use?” questions. Here’s the decision tree:

Scenario	Best Choice	Why
Complex multi-step reasoning with tools	LLM (GPT-4o, GPT-4.1)	Needs strong reasoning and function-calling
Summarising thousands of support tickets	SLM (Phi-4)	Simple task at high volume — cost matters
Analysing medical images alongside patient notes	Multimodal (GPT-4o vision)	Needs to process both text and images
Extracting invoice fields from scanned PDFs	Foundry Tool (Content Understanding)	Purpose-built for document extraction
Real-time speech transcription in a call centre	Foundry Tool (Azure Speech)	Dedicated speech service, optimised for streaming
Building a chatbot that searches company docs	LLM + Foundry Tool (GPT-4o + Azure AI Search)	Combine reasoning with retrieval

Exam tip: The 'cheapest correct option' trap

The exam often presents scenarios where multiple models could work. The correct answer is usually the one that meets the requirements at the lowest cost and complexity.

For example: “A company needs to classify customer emails as positive, negative, or neutral.” You might think GPT-4o — but Phi-4 or even a Foundry sentiment analysis tool would be cheaper and sufficient. The exam rewards right-sizing, not over-engineering.

Meet the characters

Throughout this course, you’ll follow four teams building AI solutions:

Character	Who They Are	AI Use Cases
🏥 NeuralMed	Health-tech startup, 25 engineers	AI diagnostic assistants, medical record extraction, patient chatbots
🏦 Atlas Financial	Enterprise bank, 3000 employees	Compliance agents, fraud detection, customer service bots
🚀 MediaForge	Content operations platform, 40 developers	Image/video generation, marketing content pipelines, prompt optimisation
👨‍💻 Kai	AI engineer at a logistics company	Infrastructure decisions, CI/CD for AI, deployment troubleshooting

Real-world example: Kai's model selection

Kai needs to build three features for the logistics platform:

Package label OCR — reads shipping labels from photos → Content Understanding (Foundry Tool — purpose-built, no model hosting)
Route optimisation chatbot — answers complex questions about delivery routes → GPT-4o (LLM — needs reasoning over structured data)
Automated status updates — generates short “your package is on its way” messages → Phi-4-mini (SLM — simple generation, high volume, low cost)

Three features, three different model choices. That’s model selection in practice.

Foundry Tools vs models

A common exam confusion: Foundry Tools are not models you deploy — they’re managed services you call.

Foundry Tool	What It Does	When to Use Instead of a Model
Azure AI Search	Semantic, vector, and hybrid search	When you need retrieval/grounding for RAG
Content Understanding	OCR, layout analysis, field extraction from documents	When extracting structured data from PDFs, forms, images
Azure Speech	Speech-to-text, text-to-speech	When you need dedicated speech processing
Azure Translator	Text and document translation	When you need reliable multilingual translation

Exam tip: Foundry Tools vs prompting an LLM

The exam tests whether you know when to use a dedicated Foundry Tool versus prompting an LLM to do the same task. Key rule: if a Foundry Tool exists for the task, it’s usually the correct answer — it’s cheaper, more reliable, and purpose-built.

Example: “Translate a 500-page legal document from English to Japanese.” Answer: Azure Translator (Foundry Tool), NOT “prompt GPT-4o to translate.”

The model catalog and Model Router

Microsoft Foundry’s model catalog gives you access to 11,000+ models from OpenAI, Meta, Mistral, Anthropic, and more. You don’t have to use only Microsoft models.

Model Router is a deployable model in the Foundry catalog — you deploy it like any other model and call it via the Chat Completions API. It automatically selects the best underlying model for each request based on cost-performance trade-offs. Think of it as “auto-scaling for model intelligence” — simple requests get routed to cheaper models, complex ones to more capable models.