Domain 5: Implement Information Extraction Solutions

Exam Prep: Putting It All Together

You've covered all five domains. Now let's tie everything together — review the key concepts, test your knowledge across domains, and learn exam strategies for AI-103.

The big picture

Simple explanation

AI-103 tests whether you can build real AI solutions on Azure — not just understand concepts, but make architectural decisions, choose the right services, and implement responsibly.

This module connects the dots between all five domains. In the real world, you don’t solve “Domain 1 problems” and “Domain 2 problems” separately — a single AI solution spans planning, building, monitoring, and responsible AI all at once.

Domain weight recap

Domain	Weight	Key Focus
D1: Plan & Manage	25-30%	Model selection, infrastructure, security, responsible AI
D2: Generative AI & Agents	30-35%	RAG, agents, multi-agent, evaluation, observability
D3: Computer Vision	10-15%	Image/video generation, visual understanding, visual safety
D4: Text Analysis	10-15%	Text extraction, sentiment, speech, translation
D5: Information Extraction	10-15%	Ingestion pipelines, Content Understanding, document extraction

Cross-domain decision map

The exam loves questions that span multiple domains. Here’s how the domains connect:

Scenario	Domains Involved	Key Decision
“Build a chatbot that answers from company docs”	D1 (model), D2 (RAG), D5 (pipeline)	Choose model + search type + chunking strategy
“Agent that processes uploaded invoices”	D2 (agent), D5 (Content Understanding)	Agent tool integration with Content Understanding
“Translate customer calls in real time”	D4 (speech + translation), D2 (agent workflow)	Speech pipeline + agent modality integration
“Generate marketing images safely”	D3 (image gen), D1 (responsible AI)	Generation controls + content filters + watermarks
“Multi-agent compliance system”	D2 (agents), D1 (security + responsible AI)	Approval gates + RBAC + audit logging

Exam strategy

Exam answer patterns
Topic	Do This	Avoid This
Model selection	Choose the cheapest model that meets requirements	Default to GPT-4o for everything
Foundry Tools vs LLM	Use dedicated tools (Search, Translator, CU) when they exist	Prompt an LLM for tasks with purpose-built tools
Security	Managed identity + private endpoints + RBAC	API keys in code or environment variables
Agent governance	Risk-based: autonomous for low-risk, gated for high-risk	Full autonomy or full advisory (all-or-nothing)
RAG quality	Check retrieval pipeline first when quality drops	Blame the model first
Evaluation	Automated in CI/CD, continuous in production	One-time evaluation before first deployment
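
The first row of the table ("cheapest model that meets requirements") can be read as a filter-then-minimise rule. The sketch below is only an illustration: the model names, prices, and quality scores are made up, and `cheapest_adequate` is a hypothetical helper, not part of any Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                  # hypothetical deployment name
    cost_per_1k_tokens: float  # illustrative price, not a real quote
    quality_score: float       # e.g. from your own evaluation run, 0..1
    supports_vision: bool


def cheapest_adequate(options, min_quality, needs_vision=False):
    """Return the lowest-cost option that meets every requirement, or None."""
    adequate = [
        m for m in options
        if m.quality_score >= min_quality and (m.supports_vision or not needs_vision)
    ]
    return min(adequate, key=lambda m: m.cost_per_1k_tokens, default=None)


# Made-up catalogue used only to exercise the rule.
CATALOG = [
    ModelOption("large-model", 0.010, 0.92, True),
    ModelOption("small-model", 0.001, 0.81, False),
    ModelOption("mid-model", 0.004, 0.88, True),
]

choice = cheapest_adequate(CATALOG, min_quality=0.85)
print(choice.name if choice else "no model meets the requirements")
```

With these assumed numbers the largest model also qualifies, but it is not chosen because a cheaper option already meets the quality bar.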

The 10 most important concepts

Question

1. When should you use a Foundry Tool vs an LLM?

Answer

Foundry Tool (Search, Translator, Content Understanding, Speech) when a purpose-built service exists for the task. LLM when you need general reasoning, creative generation, or tasks without a dedicated service. Foundry Tools are cheaper, faster, and more reliable for their specific tasks.

Question

2. What is RAG and why does it matter?

Answer

Retrieval-Augmented Generation: search your data first, then generate a response grounded in what was found. Reduces hallucinations, keeps responses current, and lets you control what data the model has access to. The dominant pattern for enterprise AI.
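
A toy sketch of the "search first, then generate" flow, under loud assumptions: the documents are invented, retrieval is plain word overlap rather than a real search service, and the final model call is left out, so the output is just the grounded prompt a model would receive.

```python
import re

# Invented sample documents standing in for an indexed corpus.
DOCS = {
    "leave-policy": "Employees accrue 20 days of annual leave per year.",
    "expense-policy": "Expenses above 500 dollars require manager approval.",
    "security-policy": "Laptops must use full disk encryption.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, docs, top_k=2):
    """Rank documents by the number of words they share with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(body)), doc_id) for doc_id, body in docs.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in ranked[:top_k]]


def build_grounded_prompt(query, docs, top_k=2):
    """Put the retrieved sources ahead of the question so the answer stays grounded."""
    hits = retrieve(query, docs, top_k)
    sources = "\n".join(f"[{doc_id}] {docs[doc_id]}" for doc_id in hits)
    return (
        "Answer using only the sources below. If they do not contain the answer, say so.\n"
        f"Sources:\n{sources}\n"
        f"Question: {query}"
    )


print(build_grounded_prompt("How many days of annual leave do employees get?", DOCS))
```

Only documents that match the query reach the prompt, which is how RAG lets you control what data the model sees.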

Question

3. What is the difference between an agent and a workflow?

Answer

Agent: model decides the steps dynamically, plans, and adapts. Workflow: you define the steps, deterministic control flow. Use agents for flexible, adaptive tasks. Use workflows for predictable, repeatable processing.
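
The contrast can be sketched with the same three tools run two ways. Everything here is hypothetical: the tools are trivial stand-ins, and `stub_planner` is a rule-based placeholder for what would be a model call in a real agent.

```python
def extract(state):
    return {**state, "text": state["raw"].strip()}


def classify(state):
    kind = "invoice" if "invoice" in state["text"].lower() else "other"
    return {**state, "kind": kind}


def route(state):
    return {**state, "queue": "finance" if state["kind"] == "invoice" else "general"}


TOOLS = {"extract": extract, "classify": classify, "route": route}


def run_workflow(state):
    """Workflow: the author fixes the step order in advance."""
    for step in (extract, classify, route):
        state = step(state)
    return state


def stub_planner(state):
    """Stand-in for a model call: picks the next tool from the current state."""
    if "text" not in state:
        return "extract"
    if "kind" not in state:
        return "classify"
    if "queue" not in state:
        return "route"
    return None  # nothing left to do


def run_agent(state, planner=stub_planner, max_steps=10):
    """Agent: the planner chooses each step at run time, within a step budget."""
    trace = []
    for _ in range(max_steps):
        tool_name = planner(state)
        if tool_name is None:
            break
        state = TOOLS[tool_name](state)
        trace.append(tool_name)
    return state, trace


print(run_workflow({"raw": "  Invoice 42 from Contoso  "})["queue"])
print(run_agent({"raw": "  Invoice 42 from Contoso  "})[1])
```

Both paths reach the same result here. The difference is where the control flow lives: in the code for the workflow, in the planner's decisions for the agent.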

Question

4. How do you secure AI infrastructure?

Answer

Three pillars: managed identity (no API keys), private endpoints (no public internet), RBAC (least privilege roles). Always use DefaultAzureCredential for keyless authentication. Grant 'Cognitive Services User' role to apps that only need to call models.

Question

5. What is the difference between Content Understanding and a multimodal model?

Answer

Content Understanding: purpose-built for structured extraction (OCR, layout, fields) from documents. Multimodal model: general-purpose reasoning about visual content. Use CU for extraction tasks, multimodal for reasoning tasks. Use both together for complex workflows.

Question

6. What is hybrid search and when should you use it?

Answer

Hybrid search runs keyword search (BM25) and vector search (embeddings) over the same query and merges the two result lists; semantic re-ranking can then be applied on top. Use it for most RAG applications: it gives the best balance of precision (exact terms) and recall (semantic meaning).
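
One common way to merge a keyword ranking with a vector ranking is Reciprocal Rank Fusion (RRF). The sketch below assumes the two result lists are already available as ordered document IDs (the IDs are invented) and uses the conventional constant k = 60; it does not model the semantic re-ranking step.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: each document scores 1 / (k + rank) per list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["doc-a", "doc-b", "doc-c"]  # e.g. a BM25 result order
vector_hits = ["doc-b", "doc-d", "doc-a"]   # e.g. an embedding similarity order

print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
# ['doc-b', 'doc-a', 'doc-d', 'doc-c']
```

Documents that appear in both lists rise to the top, which is why hybrid retrieval tends to surface results that match both the exact terms and the meaning.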

Question

7. What is an approval gate in agent governance?

Answer

A checkpoint that pauses an agent workflow until a human reviews and approves the proposed action. Used for high-stakes decisions (financial, medical, legal). Low-risk tasks stay autonomous — risk-based governance, not all-or-nothing.
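
A minimal sketch of risk-based gating, assuming a hypothetical action catalogue and a simple `approved_by` argument in place of a real review workflow:

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical action catalogue; a real system would load this from policy.
ACTION_RISK = {
    "answer_faq": Risk.LOW,
    "generate_compliance_report": Risk.HIGH,
    "issue_refund": Risk.HIGH,
}


def dispatch(action, approved_by=None):
    """Run low-risk actions immediately; hold high-risk ones until a human approves."""
    risk = ACTION_RISK.get(action, Risk.HIGH)  # unknown actions are treated as high risk
    if risk is Risk.LOW:
        return "executed"
    if approved_by:
        return f"executed (approved by {approved_by})"
    return "pending approval"


print(dispatch("answer_faq"))
print(dispatch("generate_compliance_report"))
print(dispatch("generate_compliance_report", approved_by="compliance-lead"))
```

Defaulting unknown actions to high risk is a design choice in this sketch: new capabilities stay gated until someone classifies them.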

Question

8. What is model drift and how do you detect it?

Answer

When a model's behaviour changes over time without code changes — due to model updates, data shifts, or query pattern changes. Detect it through continuous evaluation monitoring: track groundedness, relevance, and safety scores in production.
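
Continuous monitoring can be as simple as comparing a rolling average of an evaluation score against a baseline. In this sketch the groundedness scores, the window size, and the tolerance are all invented for illustration, and `detect_drift` is a hypothetical helper.

```python
from statistics import mean


def detect_drift(scores, baseline, window=5, tolerance=0.05):
    """Return the index where the rolling mean first drops below baseline - tolerance."""
    for end in range(window, len(scores) + 1):
        if mean(scores[end - window:end]) < baseline - tolerance:
            return end - 1  # index of the newest score in the failing window
    return None  # no drift detected


# Invented daily groundedness scores: stable at first, then degrading.
groundedness = [0.91, 0.90, 0.92, 0.89, 0.90, 0.84, 0.80, 0.78, 0.79, 0.77]

print(detect_drift(groundedness, baseline=0.90))  # 7
```

A rolling window avoids alerting on a single bad sample while still catching a sustained decline within a few evaluation runs.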

Question

9. What is indirect prompt injection via images?

Answer

An attack where malicious instructions are embedded as text within images (visible or invisible). When a multimodal model processes the image, it may follow the injected instructions. Defend with: prompt shields, input validation, and system prompt hardening.

Question

10. What is the correct error investigation order for agents?

Answer

Start from the outside and work inward: (1) Check tool calls — API timeouts, auth failures, wrong params. (2) Check retrieval — stale index, poor relevance. (3) Check model reasoning — last, not first. Most agent failures are tool failures, not model failures.
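
The outside-in order can be encoded as an ordered list of checks that stops at the first failure. The three probe functions below are hypothetical placeholders that return fixed values; in practice each would inspect traces, logs, or the search index.

```python
def first_failing_layer(checks):
    """Run the checks in order and return the name of the first one that fails."""
    for name, check in checks:
        if not check():
            return name
    return None  # every layer passed


def tools_ok():
    return True   # placeholder: no timeouts, auth failures, or wrong parameters


def retrieval_ok():
    return False  # placeholder: pretend the index is stale


def reasoning_ok():
    return True   # placeholder: model output looks sound


# Order matters: tool calls first, retrieval second, model reasoning last.
INVESTIGATION_ORDER = [
    ("tool calls", tools_ok),
    ("retrieval", retrieval_ok),
    ("model reasoning", reasoning_ok),
]

print(first_failing_layer(INVESTIGATION_ORDER))  # retrieval
```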

Cross-domain knowledge checks

Knowledge Check

A healthcare company needs to: (1) extract patient data from scanned forms, (2) store it in a database, (3) allow a chatbot to answer questions about patient records, and (4) ensure all data stays within the EU. Which services are involved?

Knowledge Check

An enterprise deploys an AI agent that: autonomously answers FAQ questions, generates compliance reports (requires human approval), and flags suspicious transactions (requires immediate alerting). Which governance configuration is correct?

Knowledge Check

A RAG application's quality has degraded. Users report outdated information. No code changes were deployed. In what order should you investigate?

Exam day tips

Read the full question — exam questions often have constraints in the last sentence that change the correct answer
Look for cost signals — if the question mentions budget, cost, or scale, lean toward cheaper/simpler options
Look for security signals — if the question mentions compliance, regulated, or sensitive data, lean toward managed identity + private endpoints
Look for “FIRST” or “BEST” — these qualifiers mean there may be multiple correct options, but one is optimal
Flag and return — don’t spend more than 2 minutes on any question. Flag difficult ones and return after completing easier questions.
700 out of 1,000 to pass: you don’t need 100%. Focus on D1 (25-30%) and D2 (30-35%), which together make up 55-65% of the exam.
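
The 55-65% figure follows from adding the low and high ends of the D1 and D2 ranges in the domain weight recap. A quick check, with the weights copied from that table and `combined_range` as a hypothetical helper:

```python
# (low, high) weight percentages per domain, copied from the recap table.
WEIGHTS = {
    "D1": (25, 30),
    "D2": (30, 35),
    "D3": (10, 15),
    "D4": (10, 15),
    "D5": (10, 15),
}


def combined_range(domains, weights=WEIGHTS):
    """Sum the low ends and the high ends of the selected domains."""
    low = sum(weights[d][0] for d in domains)
    high = sum(weights[d][1] for d in domains)
    return low, high


print(combined_range(["D1", "D2"]))        # (55, 65)
print(combined_range(["D3", "D4", "D5"]))  # (30, 45)
```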