🔒 Guided

Guided by A Guide to Cloud

AI-901 Study Guide

Domain 1: AI Concepts and Capabilities

  • What is AI? Your First 10 Minutes Free
  • Responsible AI: The Six Principles Free
  • How Generative AI Actually Works Free
  • Choosing the Right AI Model Free
  • Deploying AI Models: Options & Settings
  • AI Workloads at a Glance
  • Text Analysis: Keywords, Entities & Sentiment
  • Speech: Recognition & Synthesis
  • Computer Vision: Seeing the World
  • Image Generation: Creating with AI
  • Information Extraction: From Chaos to Structure

Domain 2: Implement AI Solutions Using Foundry

  • Prompting Fundamentals: System & User Prompts
  • Microsoft Foundry: Your AI Command Center Free
  • Building a Chat App with the Foundry SDK
  • Agents in Foundry: Create & Test
  • Building an Agent Client App
  • Building a Text Analysis App
  • Multimodal: Responding to Speech
  • Azure Speech in Foundry Tools
  • Visual Prompts: Images as Input
  • Generating Images with AI
  • Building a Vision App
  • Content Understanding: Documents & Forms
  • Multimodal Extraction: Images, Audio & Video
  • Building an Extraction App
  • Exam Prep: Putting It All Together

Domain 1: AI Concepts and Capabilities Free ⏱ ~12 min read

Choosing the Right AI Model

Not all AI models are the same. Some are great at text, others at images, others at code. This module teaches you how to pick the right model for the job, a key exam skill.

Why model choice matters

☕ Simple explanation

Picking an AI model is like choosing the right tool from a toolbox.

You wouldn’t use a hammer to cut wood, and you wouldn’t use a saw to drive a nail. AI models work the same way: each one is designed for specific tasks. A text model excels at writing, a vision model excels at understanding images, and a speech model excels at converting voice to text.

The exam tests your ability to match the right model to the right scenario.

AI models are trained on different types of data for different tasks. Choosing the right model depends on several factors: the modality of your input/output (text, image, audio), the complexity of the task, the latency requirements, the cost budget, and the accuracy needed.

Microsoft Foundry provides a model catalog with hundreds of models from different providers. Understanding model categories and capabilities is essential for making informed deployment decisions.

Model categories

Category | What They Do | Examples | Best For
Large Language Models (LLMs) | Generate and understand text | GPT-4o, GPT-4, Phi-4 | Chat, summarisation, translation, code
Small Language Models (SLMs) | Text tasks with lower cost/latency | Phi-4-mini, Phi-3-small | Simple tasks, edge devices, cost-sensitive apps
Image generation models | Create images from text descriptions | GPT-image-1.5 | Marketing visuals, concept art, design
Vision models | Analyse and understand images | GPT-4o (vision), Florence | Image classification, object detection, OCR
Speech models | Convert speech ↔ text | Azure Speech Service | Transcription, voice assistants, TTS
Embedding models | Convert text to numerical vectors | text-embedding-ada-002 | Search, similarity, RAG retrieval
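
To make the embedding row more concrete, here is a minimal sketch of how search by similarity can work once text has been turned into vectors. The three-dimensional vectors and document titles below are hand-made toy values, not output from text-embedding-ada-002 or any real model, and the function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query: list[float], docs: dict[str, list[float]]) -> list[str]:
    """Sort document titles from most to least similar to the query vector."""
    return sorted(
        docs,
        key=lambda title: cosine_similarity(query, docs[title]),
        reverse=True,
    )


# Toy vectors (assumption): a real embedding model returns far longer vectors.
documents = {
    "Refund policy": [0.9, 0.1, 0.0],
    "VPN setup guide": [0.1, 0.9, 0.2],
    "Holiday calendar": [0.0, 0.2, 0.9],
}

# Pretend this is the embedding of "How do I get my money back?"
query_vector = [0.8, 0.2, 0.1]

ranked = rank_documents(query_vector, documents)
print(ranked[0])  # the closest document by cosine similarity
```

In a RAG setup, the top-ranked passages would then be passed to an LLM as context for its answer.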

How to choose: the decision framework

When the exam gives you a scenario, use this framework:

Step 1: What type of input/output do you need?

  • Text in, text out → LLM
  • Text in, image out → Image generation
  • Image in, text out → Vision model
  • Audio in, text out → Speech model
  • Multiple types → Multimodal model (GPT-4o)

Step 2: What’s the complexity?

  • Simple task (classify, extract) → Smaller, cheaper model
  • Complex task (reason, create) → Larger, more capable model

Step 3: What are your constraints?

  • Low budget → SLM (Phi-4-mini)
  • Low latency → SLM or smaller LLM
  • Highest quality → GPT-4o or GPT-4
  • Privacy-sensitive → On-device or edge model
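
The three steps above can be sketched as a small function. This is only one way to encode the framework, for practice purposes: the `Scenario` fields and the order in which constraints are checked are assumptions made for illustration, and real decisions usually weigh several factors at once.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical description of an exam scenario (field names are illustrative)."""
    inputs: set[str]            # e.g. {"text"}, {"image", "text"}, {"audio"}
    output: str                 # "text" or "image"
    complex_task: bool = False  # True for reasoning or creative work
    low_budget: bool = False
    low_latency: bool = False
    on_device: bool = False     # privacy-sensitive or edge deployment


def choose_model(s: Scenario) -> str:
    """Walk the framework and return a model category as a string."""
    # Step 1: modality of input and output.
    if s.output == "image":
        return "Image generation model"
    if s.inputs == {"audio"}:
        return "Speech model"
    if len(s.inputs) > 1:
        return "Multimodal model (GPT-4o)"
    if s.inputs == {"image"}:
        return "Vision model"

    # From here on the scenario is text in, text out.
    # Step 3 (assumed to take priority): hard constraints point to a small model.
    if s.low_budget or s.low_latency or s.on_device:
        return "SLM (Phi-4-mini)"
    # Step 2: complexity decides between a larger and a smaller model.
    if s.complex_task:
        return "LLM (GPT-4o or GPT-4)"
    return "Smaller, cheaper model"


# Example: high-volume ticket classification on a tight budget.
tickets = Scenario(inputs={"text"}, output="text", low_budget=True)
print(choose_model(tickets))
```

Checking constraints before complexity is a design choice in this sketch: a budget or on-device requirement is treated as non-negotiable, while complexity is treated as a preference.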

Model selection guide: matching scenarios to models

Scenario | What It Needs | Model Choice
Chat assistant for customers | Natural conversation, reasoning | GPT-4o or GPT-4
Summarise meeting notes | Text in, text out, moderate complexity | GPT-4o-mini or Phi-4
Generate product images | Text description → new image | GPT-image-1.5
Classify support tickets | Simple text classification | Phi-4-mini (cost-efficient)
Transcribe phone calls | Audio → text | Azure Speech Service
Analyse medical X-rays | Image understanding + reasoning | GPT-4o (multimodal)
Search company documents | Finding relevant passages | Embedding model + RAG

Large vs small models

Large language models vs small language models

Feature | Large Models (GPT-4o) | Small Models (Phi-4-mini)
Parameters | Hundreds of billions | Billions (10x-100x smaller)
Capability | Broad, complex reasoning | Focused, specific tasks
Cost | Higher per-token pricing | Significantly cheaper
Latency | Slower (more computation) | Faster responses
Best for | Complex tasks, multimodal, creative | Classification, extraction, simple chat
Can run on edge? | No, cloud only | Yes, can run on devices

ℹ️ Microsoft's Phi family: small but mighty

Microsoft developed the Phi family of small language models specifically for scenarios where cost, latency, or deployment location matters more than maximum capability.

  • Phi-4: latest, most capable small model
  • Phi-4-mini: even smaller, great for classification and extraction
  • Phi-3: previous generation, still widely deployed

The key insight: for many business tasks (email classification, FAQ answers, data extraction), a small model can perform nearly as well as GPT-4o at a fraction of the cost.

Exam relevance: When a scenario mentions “cost-effective” or “edge deployment” or “low latency” → think Phi or other SLMs.
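
To see why per-token pricing matters at volume, here is a rough cost sketch. Every number in it is a made-up placeholder: the workload size, the tokens per email, and both prices are assumptions chosen for easy arithmetic, not real Foundry or OpenAI pricing. Check the current pricing pages before estimating a real workload.

```python
def daily_cost(requests_per_day: int, tokens_per_request: int,
               price_per_million_tokens: float) -> float:
    """Estimate daily spend from request volume and a per-million-token price."""
    total_tokens = requests_per_day * tokens_per_request
    return total_tokens / 1_000_000 * price_per_million_tokens


# All values below are hypothetical placeholders, not published prices.
EMAILS_PER_DAY = 50_000
TOKENS_PER_EMAIL = 400            # assumed prompt + response size
LARGE_MODEL_PRICE = 5.00          # assumed price per 1M tokens, large model
SMALL_MODEL_PRICE = 0.25          # assumed price per 1M tokens, small model

large_cost = daily_cost(EMAILS_PER_DAY, TOKENS_PER_EMAIL, LARGE_MODEL_PRICE)
small_cost = daily_cost(EMAILS_PER_DAY, TOKENS_PER_EMAIL, SMALL_MODEL_PRICE)

print(f"Large model: {large_cost:.2f} per day")
print(f"Small model: {small_cost:.2f} per day")
print(f"Ratio: {large_cost / small_cost:.0f}x")
```

Under these assumed prices the gap scales linearly with volume, which is why high-volume, simple tasks are the usual candidates for an SLM.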

The Foundry model catalog

Microsoft Foundry includes a model catalog, a library of models from multiple providers that you can deploy directly:

Provider | Models | Strengths
OpenAI | GPT-4o, GPT-4, GPT-image-1.5 | Best general-purpose, multimodal
Microsoft | Phi-4, Phi-4-mini | Cost-efficient, edge-friendly
Meta | Llama 3 | Open-source, customisable
Mistral | Mistral Large, Mistral Small | European alternative, efficient
Cohere | Command R | Strong at RAG and retrieval

Key exam concept: You don’t need to memorise every model. You need to understand the categories (LLM, SLM, vision, speech, embedding) and know how to choose based on task requirements.

Flashcards

Question

What is the difference between a Large Language Model (LLM) and a Small Language Model (SLM)?

Answer

LLMs have hundreds of billions of parameters and excel at complex reasoning and multimodal tasks but cost more and are slower. SLMs have billions of parameters (10-100x smaller), are cheaper and faster, and work well for focused tasks like classification and extraction.

Question

When should you choose a small model (like Phi-4-mini) over GPT-4o?

Answer

When the task is simple (classification, extraction, FAQ), when cost is a concern, when low latency is required, or when you need to run the model on edge devices.

Question

What is an embedding model used for?

Answer

Converting text into numerical vectors (lists of numbers) that capture semantic meaning. Used for search, document similarity, and RAG retrieval (finding relevant documents to feed to an LLM).

Question

What model would you use to generate images from text descriptions?

Answer

GPT-image-1.5, an image generation model available in Microsoft Foundry. You provide a text prompt, and it creates a new image matching that description.

Knowledge Check

GreenLeaf wants to automatically classify incoming support emails into categories: billing, technical, general inquiry. They process 50,000 emails per day and need to keep costs low. Which model approach is most appropriate?

MediSpark needs an AI model that can accept both a medical image (X-ray) and a text question ('What abnormalities are visible?') and return a text response. Which type of model do they need?

Priya needs to build a search feature that finds the most relevant company documents when a user types a question. Which combination of models should she use?


Next up: Deploying AI Models, covering configuration parameters like temperature, top-p, and max tokens that control how your model behaves.

© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.