
Guided by A Guide to Cloud

AI-103 Study Guide

Domain 1: Plan and Manage an Azure AI Solution

  • Choosing the Right AI Model Free
  • Foundry Services: Your AI Toolkit Free
  • Retrieval, Indexing & Agent Memory
  • Designing AI Infrastructure
  • Deploying Models & CI/CD
  • Quotas, Scaling & Cost
  • Monitoring & Security
  • Responsible AI: Filters, Auditing & Governance

Domain 2: Implement Generative AI and Agentic Solutions

  • Connecting Your App to Foundry Free
  • Building RAG Applications
  • Workflows & Reasoning Pipelines
  • Evaluating AI Models & Apps
  • Agent Fundamentals: Roles, Goals & Tools Free
  • Building Agents with Retrieval & Memory
  • Agent Tools & Knowledge Integration
  • Multi-Agent Orchestration & Safeguards
  • Agent Monitoring & Error Analysis
  • Prompt Engineering & Model Tuning
  • Observability & Production Operations

Domain 3: Implement Computer Vision Solutions

  • Image & Video Generation
  • Multimodal Visual Understanding
  • Responsible AI for Visual Content

Domain 4: Implement Text Analysis Solutions

  • Text Analysis with Language Models
  • Speech, Translation & Voice Agents

Domain 5: Implement Information Extraction Solutions

  • Ingestion, Indexing & Grounding Pipelines
  • Extracting Content with Content Understanding
  • Exam Prep: Putting It All Together

Domain 1: Plan and Manage an Azure AI Solution Premium ⏱ ~12 min read

Designing AI Infrastructure

Before you write a line of code, you need the right Azure infrastructure. Learn how to design the foundation for AI apps and agents: regions, networking, resource topology, and deployment options.

Planning your AI infrastructure

☕ Simple explanation

Building AI infrastructure is like setting up a restaurant kitchen before opening night.

You need to decide:

  • Where will the kitchen be? (region)
  • How many ovens do you need? (compute)
  • Should the kitchen be open to walk-ins or reservation-only? (networking)
  • Can you share equipment between branches? (resource topology)

Get these decisions wrong and you'll spend more, go slower, or fail compliance audits. Get them right and everything else flows smoothly.

Designing Azure infrastructure for AI solutions requires planning across four dimensions:

  • Region selection: model availability, data residency, latency
  • Resource topology: how Foundry Projects, Search, Storage, and networking connect
  • Deployment options: serverless vs provisioned, managed compute vs self-hosted
  • Security perimeter: VNets, private endpoints, managed identity, RBAC

The exam tests architectural trade-offs, not deployment commands. You need to know why to pick each option.

Region selection

Not all Azure regions offer the same AI services. Your region choice affects:

Factor | Impact | Example
Model availability | Not all models are in all regions | GPT-4o may be available in East US but not Australia East
Data residency | Regulated industries require data to stay in specific geographies | EU healthcare data must stay in EU regions
Latency | Closer regions = faster responses | An app serving users in Asia should use an Asia-Pacific region
Capacity | Popular regions may have longer queue times | East US 2 may have shorter wait times than East US
💡 Exam tip: Region + model availability

The exam may present a scenario where the correct answer depends on model availability in a specific region. Key rule: always check model availability before choosing a region. A region that meets data residency requirements but doesn't offer your required model is not a valid choice.
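
The key rule above can be sketched as a filter: a region is a candidate only if it satisfies both the residency constraint and model availability, and latency is used only to rank what remains. This is a minimal illustration, not a real availability lookup. The `Region` type, the `CATALOGUE` entries, the region and model names, and the latency figures are all hypothetical placeholders; real availability must be checked against Microsoft Learn.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    geography: str          # e.g. "EU", "US"
    models: frozenset       # models offered in this region (illustrative)
    latency_ms: int         # hypothetical latency to the main user base


# Illustrative catalogue only: NOT real Azure availability data.
CATALOGUE = [
    Region("region-a", "EU", frozenset({"model-x"}), 40),
    Region("region-b", "EU", frozenset({"model-x", "model-y"}), 55),
    Region("region-c", "US", frozenset({"model-x", "model-y"}), 120),
]


def candidate_regions(catalogue, required_model, allowed_geographies):
    """Return regions that meet residency AND offer the model, lowest latency first."""
    valid = [
        r for r in catalogue
        if r.geography in allowed_geographies and required_model in r.models
    ]
    return sorted(valid, key=lambda r: r.latency_ms)


if __name__ == "__main__":
    # region-a is in the EU and has the lowest latency, but lacks model-y,
    # so it is filtered out before latency is even considered.
    for region in candidate_regions(CATALOGUE, "model-y", {"EU"}):
        print(region.name)
```

Filtering before sorting mirrors the exam logic: residency and model availability are hard constraints, while latency and capacity are preferences among the regions that pass.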

Deployment options

Serverless vs provisioned deployment
Feature | Serverless (Pay-per-token) | Provisioned Throughput
How it works | Pay only for tokens consumed | Reserve fixed compute capacity, measured in Provisioned Throughput Units (PTUs)
Cost model | Variable: scales with usage | Fixed: predictable monthly cost
Best for | Development, variable workloads, prototyping | Production with predictable, high-volume traffic
Latency | May queue during peak times | Guaranteed capacity, consistent latency
Rate limits | Shared pool, may be throttled | Dedicated capacity, higher limits
Setup | Deploy model, start calling | Reserve capacity, then deploy
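
The cost-model row can be made concrete with a rough break-even calculation: the monthly token volume at which a fixed provisioned bill equals the pay-per-token bill. The sketch below assumes a single blended price per million tokens and a flat monthly provisioned cost. Both numbers in the example are made-up placeholders, not Azure pricing, and the function names are hypothetical.

```python
def serverless_monthly_cost(tokens_per_month, price_per_million_tokens):
    """Variable cost: scales linearly with tokens consumed."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


def break_even_tokens(provisioned_monthly_cost, price_per_million_tokens):
    """Monthly token volume at which the fixed cost equals the pay-per-token cost."""
    return provisioned_monthly_cost / price_per_million_tokens * 1_000_000


def cheaper_option(tokens_per_month, price_per_million_tokens, provisioned_monthly_cost):
    """Compare on cost alone; latency and SLA needs are NOT considered here."""
    serverless = serverless_monthly_cost(tokens_per_month, price_per_million_tokens)
    return "serverless" if serverless <= provisioned_monthly_cost else "provisioned"


if __name__ == "__main__":
    PRICE = 5.0           # hypothetical price per 1M tokens
    FIXED = 10_000.0      # hypothetical monthly provisioned cost

    print(break_even_tokens(FIXED, PRICE))
    print(cheaper_option(500_000_000, PRICE, FIXED))
    print(cheaper_option(3_000_000_000, PRICE, FIXED))
```

Cost is only one axis of the trade-off. A workload below the break-even volume may still justify provisioned throughput if it needs the consistent latency and dedicated capacity described in the table.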

Other deployment patterns

Pattern | When to Use
Managed compute | Default for most scenarios: Foundry manages the infrastructure
Connected compute (self-hosted) | When you need models on your own VMs or Kubernetes
Edge deployment | SLMs on IoT devices or local servers (e.g., Phi-4-mini on ONNX)
Global deployment | Route requests across regions for availability and latency

Resource topology

A typical AI solution connects multiple Azure resources:

Resource | Role | Connects To
Foundry Project | Central workspace for AI development | All other resources
Azure AI Search | Retrieval and grounding index | Foundry Project (data connection)
Azure Storage | Raw document storage, training data | Search (indexer source), Foundry
Azure Key Vault | Secrets and API keys | All services via managed identity
Azure Container Apps | Host custom agent code and orchestrators | Foundry Project (via SDK)
Azure Monitor / App Insights | Observability and tracing | All services
ℹ️ Real-world example: Kai's infrastructure design

Kai is designing the infrastructure for the logistics platform's AI features:

  • Region: East US 2 (GPT-4o available, closest to main user base)
  • Foundry Project: One project per environment (dev, staging, prod)
  • Model deployment: Serverless for dev (low cost), provisioned for prod (predictable latency)
  • Search: Azure AI Search Standard tier (handles 10,000 shipping documents)
  • Storage: Azure Blob Storage for raw shipment documents
  • Networking: Private endpoints for prod, public for dev
  • Identity: Managed identity everywhere, with no API keys in code
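
One way to keep a design like Kai's honest is to write the per-environment choices down as data and check them against the stated rules. The sketch below is illustrative only: the dictionary keys, the value strings, and the `validate` function are hypothetical names, not a Foundry or Azure schema. Staging is left out because the example above does not specify its settings.

```python
# Per-environment choices from Kai's design, expressed as plain data.
# Key names and value strings are hypothetical, not an Azure schema.
ENVIRONMENTS = {
    "dev": {
        "deployment": "serverless",      # low cost
        "network": "public",
        "auth": "managed_identity",
    },
    "prod": {
        "deployment": "provisioned",     # predictable latency
        "network": "private_endpoint",
        "auth": "managed_identity",
    },
}


def validate(environments):
    """Return a list of policy violations (empty when the design is compliant)."""
    problems = []
    for name, cfg in environments.items():
        # Rule from the design: managed identity everywhere, no API keys in code.
        if cfg["auth"] != "managed_identity":
            problems.append(f"{name}: must use managed identity, not {cfg['auth']}")
        # Rule from the design: production traffic goes through private endpoints.
        if name == "prod" and cfg["network"] != "private_endpoint":
            problems.append(f"{name}: production must use private endpoints")
    return problems


if __name__ == "__main__":
    print(validate(ENVIRONMENTS))
```

A check like this could run in a CI pipeline before infrastructure is deployed, so that a production environment with a public endpoint or an API key is caught at review time.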

Key terms

Question: What is serverless model deployment?

Answer: A pay-per-token deployment where you only pay for tokens consumed. No reserved capacity. Best for development and variable workloads. May experience throttling during peak times.

Question: What is provisioned throughput?

Answer: Reserved compute capacity measured in Provisioned Throughput Units (PTU). Each PTU delivers a model-specific amount of Tokens Per Minute (TPM). Provides consistent latency and guaranteed capacity. Best for production workloads with predictable traffic.
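
Because each PTU delivers a model-specific number of tokens per minute, sizing a provisioned deployment comes down to a ceiling division. The sketch below assumes you already know your peak TPM and the TPM-per-PTU figure for your model. The numbers in the example are placeholders, not published values, and the `min_increment` parameter is an assumption for cases where capacity can only be bought in fixed steps.

```python
def ptus_required(peak_tokens_per_minute, tpm_per_ptu, min_increment=1):
    """Smallest PTU count covering peak load, rounded up to the purchase increment."""
    if tpm_per_ptu <= 0 or min_increment <= 0:
        raise ValueError("tpm_per_ptu and min_increment must be positive")
    if peak_tokens_per_minute < 0:
        raise ValueError("peak_tokens_per_minute must not be negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    raw = (peak_tokens_per_minute + tpm_per_ptu - 1) // tpm_per_ptu
    steps = (raw + min_increment - 1) // min_increment
    return steps * min_increment


if __name__ == "__main__":
    # Hypothetical figures: 120,000 peak TPM and 2,500 TPM per PTU.
    print(ptus_required(120_000, 2_500))
    print(ptus_required(120_000, 2_500, min_increment=5))
```

Sizing against peak rather than average load is what gives provisioned throughput its consistent latency. Sizing to the average would reintroduce queuing at busy times.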

Question: What is a Foundry Project in the new architecture?

Answer: A standalone Azure resource that serves as the workspace for AI development. Contains model deployments, agent definitions, data connections, and evaluations. No parent hub required in the new Foundry architecture.

Question: Why does region selection matter for AI solutions?

Answer: Regions differ in model availability, data residency compliance, latency to users, and available capacity. Always verify your required models are available in your chosen region before designing infrastructure.

Knowledge check

Atlas Financial is deploying a compliance review agent that processes 100,000 loan applications per month with strict SLA requirements. The workload is predictable and steady. Which deployment option should they choose?

NeuralMed must keep all patient data within the European Union due to GDPR requirements. They need GPT-4o for their diagnostic assistant. What should they verify FIRST when choosing an Azure region?


© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.