AB-731 Study Guide

Domain 1: Identify the Business Value of Generative AI Solutions

  • Generative AI vs Traditional AI: What's the Difference?
  • Choosing the Right AI Solution for Your Business
  • AI Models: Pretrained vs Fine-Tuned
  • AI Cost Drivers and ROI: Tokens, Pricing, and Business Cases
  • Challenges of Generative AI: Fabrications, Bias & Reliability
  • When Generative AI Creates Real Business Value
  • Prompt Engineering: The Skill That Multiplies AI Value
  • RAG and Grounding: Making AI Use YOUR Data
  • Data Quality: The Make-or-Break Factor for AI
  • When Traditional Machine Learning Adds Value
  • Securing AI Systems: From Application to Data

Domain 2: Identify Benefits, Capabilities, and Opportunities for Microsoft AI Apps and Services

  • Mapping Business Needs to Microsoft AI Solutions
  • Copilot Versions: Free, Business, M365, and Beyond
  • Copilot Chat: Web, Mobile & Work Experiences
  • Copilot in M365 Apps: Word, Excel, Teams & More
  • Copilot Studio & Microsoft Graph: Building Smarter Solutions
  • Researcher & Analyst: Copilot's Power Agents
  • Build, Buy, or Extend: The AI Decision Framework
  • Microsoft Foundry: Your AI Platform
  • Azure AI Services: Vision, Search & Beyond
  • Matching the Right AI Model to Your Business Need

Domain 3: Identify an Implementation and Adoption Strategy

  • Responsible AI and Governance: Principles That Protect Your Business
  • Setting Up an AI Council: Strategy, Oversight & Alignment
  • Building Your AI Adoption Team
  • AI Champions: Your Secret Weapon for Adoption
  • Data, Security, Privacy & Cost: The Four Pillars of AI Readiness
  • Copilot & Azure AI Licensing: Every Option Explained

Domain 2: Identify Benefits, Capabilities, and Opportunities for Microsoft AI Apps and Services

Matching the Right AI Model to Your Business Need

Learn how to choose the right AI model — large or small, commercial or open-source — based on capability, cost, latency, and compliance requirements.

Not all AI models are equal

☕ Simple explanation

Choosing an AI model is like choosing a vehicle. A sports car (large model) is fast and powerful but expensive. A motorbike (small model) is cheaper and nimble but carries less. You pick the one that fits the job.

Large models like GPT-4o are powerful — they handle complex reasoning, long documents, and nuanced tasks. But they cost more and respond more slowly.

Small language models (SLMs) like Phi are lighter and cheaper. They handle simpler tasks brilliantly — classification, summarisation, FAQ answers — at a fraction of the cost. Some even run on phones and edge devices.

Smart organisations use BOTH: big models for hard tasks, small models for easy ones.

Model selection is a strategic decision driven by five factors:

  1. Capability: What can the model do? Complex reasoning, code generation, multilingual support, vision — different models excel at different tasks.
  2. Cost: Larger models cost more per token. At enterprise scale, the difference between GPT-4o and GPT-4o mini can be millions of dollars annually.
  3. Latency: Smaller models respond faster. For real-time applications (chatbots, search), latency matters.
  4. Data sensitivity: Some models can be deployed in isolated environments. Others are API-only. Compliance requirements may dictate deployment options.
  5. Compliance: Regulated industries may require models with specific certifications, audit capabilities, or the ability to run entirely within their infrastructure.
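To make the cost factor concrete, here is a minimal back-of-the-envelope sketch in Python. The per-token prices and traffic volume are hypothetical placeholders, not real rate-card figures, but they show how a 10x per-token gap compounds at enterprise scale:

```python
# Hypothetical per-1M-token prices, for illustration only; real prices
# vary by provider and change often, so check the current pricing page.
PRICE_PER_M_TOKENS = {
    "large-model": 5.00,   # assumed premium-tier price (USD per 1M tokens)
    "small-model": 0.50,   # assumed efficient-tier price, 10x cheaper
}

def annual_cost(model: str, tokens_per_day: int) -> float:
    """Rough annual spend for a given daily token volume."""
    return PRICE_PER_M_TOKENS[model] * (tokens_per_day / 1_000_000) * 365

# Example: 50,000 queries/day at ~2,000 tokens each = 100M tokens/day.
daily_tokens = 50_000 * 2_000
for model in PRICE_PER_M_TOKENS:
    print(f"{model}: ${annual_cost(model, daily_tokens):,.0f}/year")
```

At these assumed numbers the gap is roughly $182,500 versus $18,250 a year; multiply the volume by ten and the difference exceeds a million dollars.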

The model landscape in Foundry

Microsoft Foundry provides access to 1,800+ models across multiple providers. For the exam, you need to understand the key categories:

| Model Category | Examples | Strengths | Best For |
|---|---|---|---|
| OpenAI large models | GPT-4o, GPT-4.1 | Complex reasoning, long context, multimodal | Document analysis, strategic planning, complex Q&A |
| OpenAI reasoning models | o3, o4-mini | Deep multi-step reasoning, math, logic | Financial modelling, scientific analysis, complex problem-solving |
| OpenAI efficient models | GPT-4o mini | Good quality at lower cost | High-volume tasks: classification, summarisation, FAQ |
| Microsoft SLMs | Phi-4, Phi-4-mini | Small, fast, deployable on edge devices | On-device AI, mobile apps, low-latency scenarios |
| Meta open-source | Llama 3.1, Llama 4 | Open weights, customisable, no vendor lock-in | Organisations wanting full control and transparency |
| Mistral | Mistral Large, Mistral Small | European-headquartered, strong multilingual | EU data sovereignty requirements, multilingual tasks |
| Embedding models | text-embedding-ada-002 | Convert text to vectors for search | RAG retrieval, semantic search, similarity matching |
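Since embedding models work differently from the generative categories, a quick sketch may help show what "convert text to vectors for search" means in practice. The three-dimensional vectors here are toy stand-ins for the high-dimensional vectors a real embedding model returns:

```python
import math

def cosine_similarity(a, b):
    """Similarity of two embedding vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real embedding-model output.
doc_vectors = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
}
query_vector = [0.8, 0.2, 0.1]  # imagined embedding of "how do I get my money back?"

# Retrieve the semantically closest document for the query.
best = max(doc_vectors, key=lambda d: cosine_similarity(query_vector, doc_vectors[d]))
print(best)  # refund policy
```

This nearest-vector lookup is the retrieval step that powers RAG and semantic search: the query matches "refund policy" even though it shares no keywords with it.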
💡 Exam tip: Know the model categories, not specific version numbers

The exam won’t ask you to name the latest GPT version. It tests whether you understand:

  • Large vs small models — when to use each
  • Commercial vs open-source — trade-offs
  • Reasoning models — when multi-step logic is needed
  • SLMs for edge — when the model needs to run on a device, not in the cloud
  • Embedding models — used for search, not generation

Focus on the categories and decision criteria, not specific model names.

Model selection criteria

The five decision factors

| Factor | Question to Ask | Impact |
|---|---|---|
| Capability | Can this model handle the task? | Eliminates models that can't do the job |
| Cost | How much per token at our scale? | A 10x cost difference is millions at enterprise volume |
| Latency | How fast does it need to respond? | Real-time apps need fast models; batch processing can wait |
| Data sensitivity | Where does the data go? | Some models require cloud API calls; SLMs can run locally |
| Compliance | What certifications are required? | Regulated industries need specific deployment options |

Large models vs small models

| Aspect | Large Models (GPT-4o, Llama 3.1 405B) | Small Models (GPT-4o mini, Phi-4) |
|---|---|---|
| Reasoning ability | Excellent — handles complex, multi-step tasks | Good for simple to moderate tasks |
| Cost per token | Higher — premium pricing | Lower — often 10x cheaper |
| Response speed | Slower — more computation needed | Faster — less computation |
| Context window | Large (128K+ tokens) | Moderate (varies by model) |
| Edge deployment | No — requires cloud infrastructure | Yes — some run on phones and devices |
| Best for | Complex analysis, long documents, nuanced reasoning | Classification, FAQ, summarisation, high-volume tasks |

The right model for common tasks

| Business Task | Recommended Model Size | Why |
|---|---|---|
| Answer FAQ questions from a knowledge base | Small (GPT-4o mini, Phi) | Simple retrieval + response, high volume, latency matters |
| Analyse a 50-page contract for legal risks | Large (GPT-4o, o3) | Needs long context window and nuanced reasoning |
| Classify customer support tickets by urgency | Small (GPT-4o mini, Phi) | Pattern-matching task, high volume, speed important |
| Generate a strategic market analysis report | Large (GPT-4o) | Complex synthesis, reasoning, and writing quality |
| Real-time translation in a customer chat | Small to medium (Mistral Small) | Speed critical, straightforward language task |
| Financial forecasting with complex variables | Reasoning (o3, o4-mini) | Multi-step mathematical reasoning |
Question

When should you choose a small language model (SLM) over a large model?


Answer

Choose SLMs when: the task is straightforward (classification, FAQ, summarisation), volume is high (cost matters at scale), latency must be low (real-time responses), or the model needs to run on edge devices (phones, local hardware). SLMs are often 10x cheaper than large models.


Question

What are 'reasoning models' and when should you use them?


Answer

Reasoning models (like o3 and o4-mini) are designed for complex, multi-step reasoning — mathematics, logic problems, financial modelling, and scientific analysis. They 'think' through problems step by step. Use them when the task requires deep logical reasoning, not just pattern matching or text generation.


Open-source vs commercial models

| Factor | Commercial (OpenAI GPT) | Open-Source (Llama, Phi, Mistral) |
|---|---|---|
| Access | API-based, managed by provider | Downloadable, self-hostable |
| Customisation | Limited to fine-tuning via API | Full weight access, deeper customisation |
| Vendor lock-in | Tied to provider's API and pricing | Run anywhere, switch freely |
| Support | Enterprise support from provider | Community support, or commercial support from hosting providers |
| Compliance | Provider manages compliance | YOU manage compliance for self-hosted models |
| Cost | Pay-per-token (predictable per call) | Infrastructure cost (predictable per month) |
ℹ️ When open-source models make sense

Open-source models like Llama and Phi are not just “free alternatives.” They offer genuine strategic advantages:

  • Data sovereignty: Self-host the model so data never leaves your environment
  • Customisation: Modify the model weights for specialised tasks
  • No vendor lock-in: Switch hosting providers without changing the model
  • Transparency: Inspect model architecture and behaviour

The trade-off: you take on the responsibility for hosting, scaling, monitoring, and compliance. Foundry simplifies this by hosting open-source models as managed endpoints.

Question

Why might an organisation choose an open-source model over a commercial one?


Answer

Open-source models offer: data sovereignty (self-host, data stays local), full customisation (modify model weights), no vendor lock-in (switch freely), and transparency (inspect model behaviour). The trade-off: you manage hosting, scaling, and compliance yourself. Foundry can host open-source models as managed endpoints.


SLMs for edge and on-device AI

Small language models (SLMs) like Phi-4 can run directly on devices — laptops, phones, edge hardware — without cloud connectivity.

| Edge Use Case | Why SLMs Work | Example |
|---|---|---|
| Offline scenarios | No internet required — model runs locally | Field workers in remote areas with no connectivity |
| Low latency | No network round-trip — instant response | Real-time translation during face-to-face conversations |
| Data privacy | Data never leaves the device | Healthcare notes processed locally, nothing sent to the cloud |
| Cost efficiency | No per-token API charges | High-volume local processing at fixed infrastructure cost |
| IoT and manufacturing | Lightweight models run on industrial hardware | Quality inspection on production-line edge devices |
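The cost-efficiency argument for edge deployment comes down to a break-even calculation: fixed infrastructure versus per-token API charges. All figures in this sketch are hypothetical placeholders; substitute your own API quote and hardware costs:

```python
# Hypothetical figures for illustration only; substitute real quotes.
API_COST_PER_M_TOKENS = 0.50   # assumed per-1M-token API price (USD)
EDGE_MONTHLY_COST = 2_000.0    # assumed fixed monthly cost of edge hardware + ops

def monthly_api_cost(tokens_per_month: int) -> float:
    """What the same volume would cost through a per-token cloud API."""
    return API_COST_PER_M_TOKENS * tokens_per_month / 1_000_000

def breakeven_tokens_per_month() -> float:
    """Volume at which fixed edge infrastructure beats per-token API pricing."""
    return EDGE_MONTHLY_COST / API_COST_PER_M_TOKENS * 1_000_000

print(f"Break-even: {breakeven_tokens_per_month():,.0f} tokens/month")
# Above this volume, the fixed-cost edge deployment is cheaper.
```

At these assumed numbers the edge box pays for itself at four billion tokens a month, which is why the table above pairs edge SLMs with continuous, high-volume workloads.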
Question

What is the advantage of running an SLM on an edge device instead of calling a cloud API?


Answer

Edge SLMs provide: no internet dependency (works offline), zero network latency (instant response), data privacy (nothing leaves the device), and no per-token costs (fixed infrastructure cost). Ideal for remote workers, healthcare privacy, manufacturing lines, and IoT devices.


📊 Dr. Patel evaluates models for a financial services client

Dr. Anisha Patel, Board Advisor, is helping a financial services firm choose AI models for three use cases.

Use case 1: Customer service chatbot

  • Volume: 50,000 queries per day
  • Requirement: Fast response, simple question answering
  • Dr. Patel’s recommendation: GPT-4o mini
  • Reasoning: High volume makes cost critical. Questions are straightforward. Fast response time improves customer experience. Using GPT-4o would cost 10x more with minimal quality improvement.

Use case 2: Regulatory compliance analysis

  • Volume: 200 documents per week
  • Requirement: Analyse complex regulations, identify compliance gaps
  • Dr. Patel’s recommendation: GPT-4o or o3
  • Reasoning: Complex, nuanced reasoning required. Long documents need a large context window. Accuracy is more important than speed or cost. Regulatory mistakes have serious consequences.

Use case 3: Fraud detection pattern recognition

  • Volume: Continuous real-time analysis
  • Requirement: Low latency, data sovereignty, no cloud dependency
  • Dr. Patel’s recommendation: Phi-4 on edge infrastructure
  • Reasoning: Real-time processing needs minimal latency. Transaction data must stay on-premises. SLM handles pattern matching efficiently. No per-token costs for continuous analysis.
💡 The multi-model strategy

Notice that Dr. Patel recommended three different models for three different use cases within the same company. This is the multi-model strategy the exam expects you to understand:

  • Use the cheapest model that meets quality requirements
  • Match model size to task complexity
  • Consider deployment constraints (cloud, edge, on-premise)
  • Use Foundry to deploy and manage multiple models from a single platform

The exam rewards answers that choose the right-sized model, not the biggest or most impressive one.
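The routing idea behind the multi-model strategy can be sketched in a few lines. The task labels and model names below are illustrative placeholders, not a real Foundry API:

```python
# Hypothetical task-to-tier routing table; labels and model names are
# illustrative only, not actual Foundry model identifiers.
ROUTING_TABLE = {
    "classification": "small-efficient-model",  # high volume, cost matters
    "faq":            "small-efficient-model",  # simple retrieval + response
    "long-document":  "large-model",            # needs a big context window
    "math-heavy":     "reasoning-model",        # multi-step logic required
    "on-device":      "edge-slm",               # must run locally, offline
}

def pick_model(task_type: str) -> str:
    """Route each request to the cheapest tier that can handle it,
    falling back to the large model when the task type is unknown."""
    return ROUTING_TABLE.get(task_type, "large-model")

print(pick_model("faq"))          # small-efficient-model
print(pick_model("acquisition"))  # unknown task type -> large-model fallback
```

The point of the sketch is the shape of the decision, not the specific names: cheap tiers handle the bulk of traffic, and only unrecognised or demanding work escalates to the premium model.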

Knowledge Check

Tomás's customer support team at PacificSteel processes 100,000 emails daily and needs AI to classify each email by topic and urgency. Which model approach is most cost-effective?

Knowledge Check

Dr. Patel recommends Phi-4 on edge hardware for a fraud detection system. What is the PRIMARY reason for choosing an edge-deployed SLM?



Congratulations! You’ve completed Domain 2: Identify Benefits, Capabilities, and Opportunities for Microsoft AI Apps and Services. You now understand how to map business needs to AI solutions, compare Copilot versions, and choose the right AI models and platforms.

Next up: Responsible AI and Governance — start Domain 3 by learning the principles that keep your AI deployments safe and ethical.



© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.